PROJECT: Data Extraction Using Scrapy and Results Generation in HTML
CODE NAME: “SCRAPY DOO”
NOTE: The project details have been shortened in order to comply with the character limit during the submission process. For the full project specifications please see the attached specifications file.
PROJECT OBJECTIVE
To build a news feed by extracting and storing data from several sources. Data will be scraped and parsed directly from a website or an RSS feed using Scrapy and Beautiful Soup, or retrieved through an official API. The news feed will be generated in real time by a custom-built server-side data processing and display system that parses the extracted and stored data and returns results filtered by user-predefined criteria.
FUNCTIONALITY
The news feed will be populated with data from 3 press release (PR) websites and will focus specifically on PRs that contain news about NYSE- and NASDAQ-listed stocks; all unrelated PRs will be disregarded. An existing, separate database on the same server will then be accessed to retrieve 8 fundamental data points for the stock referred to in the PR, and this data will be displayed below the news story.
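The PR filtering step described above could be sketched as follows. This is only an illustration in Python: it assumes that a PR cites a listing in the form "NYSE: ABC" or "NASDAQ: ABCD", which is a common convention but is not stated in these specifications, and the function names are hypothetical.

```python
import re

# Assumption: PRs cite listings as "NYSE: ABC" or "NASDAQ: ABCD" ("Nasdaq" also accepted).
# Symbols are taken to be 1-5 capital letters with an optional class suffix such as ".B".
TICKER_PATTERN = re.compile(r"\b(NYSE|NASDAQ|Nasdaq)\s*:\s*([A-Z]{1,5}(?:\.[A-Z])?)\b")


def extract_listed_tickers(text):
    """Return (exchange, symbol) pairs found in the text, in order, without duplicates."""
    found = []
    for exchange, symbol in TICKER_PATTERN.findall(text):
        pair = (exchange.upper(), symbol)
        if pair not in found:
            found.append(pair)
    return found


def is_relevant(text):
    """A PR is kept only if it mentions at least one NYSE or NASDAQ listing."""
    return bool(extract_listed_tickers(text))
```

The extracted symbol is also the natural key for looking up the fundamental data points in the existing database.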
Along with the PR websites, several official U.S. Government RSS feeds will also be scraped for content. These sources will be scraped in full, with all available data extracted and stored; no filters or detailed extraction rules need to be set up, because all the data displayed in the RSS feed will be used.
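As a rough sketch of the "take everything" approach for the RSS sources, the example below keeps every field of every feed item. It uses only the Python standard library so that it runs without dependencies; in the project itself this logic would more likely sit inside a Scrapy spider. The sample feed and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text):
    """Return one dict per <item>, keeping every child element as a field."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        record = {}
        for child in item:
            # Note: a repeated tag (e.g. several <category> elements) keeps only the last value.
            record[child.tag] = (child.text or "").strip()
        items.append(record)
    return items


# Hypothetical feed content, used only to show the shape of the output.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Sample Agency News</title>
    <item>
      <title>Agency publishes quarterly report</title>
      <link>https://example.com/releases/1</link>
      <pubDate>Mon, 04 Jan 2021 10:00:00 GMT</pubDate>
      <description>Summary of the report.</description>
    </item>
    <item>
      <title>Agency announces public hearing</title>
      <link>https://example.com/releases/2</link>
    </item>
  </channel>
</rss>"""

items = parse_rss_items(SAMPLE_FEED)
```

Because no extraction rules are applied, each record simply mirrors whatever fields the feed provides, and items with fewer fields produce smaller records.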
In addition, a simple form will be built to allow an admin editor to manually enter news content details and post news stories. This will be a basic HTML form with several fields; when submitted, it will enter the input data into the database.
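A minimal sketch of the form submission handler might look like the following. It uses Python's built-in sqlite3 module as a stand-in for MySQL so the example is runnable on its own; the table name, the field names, and the set of required fields are all hypothetical, since the actual form fields are defined in the full specifications.

```python
import sqlite3

# Hypothetical form fields; the real list comes from the full specifications.
REQUIRED_FIELDS = ("headline", "body", "source")


def create_table(conn):
    """Create the illustrative stories table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news_stories ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "headline TEXT NOT NULL, body TEXT NOT NULL, source TEXT NOT NULL)"
    )


def insert_story(conn, form):
    """Validate submitted form data and store it; return the new row id."""
    cleaned = {name: (form.get(name) or "").strip() for name in REQUIRED_FIELDS}
    missing = [name for name, value in cleaned.items() if not value]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    # Parameterized query: submitted text is never concatenated into the SQL string.
    cursor = conn.execute(
        "INSERT INTO news_stories (headline, body, source) VALUES (?, ?, ?)",
        (cleaned["headline"], cleaned["body"], cleaned["source"]),
    )
    conn.commit()
    return cursor.lastrowid
```

The same pattern (validate, then insert with bound parameters) carries over to a MySQL driver or to PHP prepared statements.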
The display system, which generates results on an HTML page based on the user's predefined filter criteria, will be a server-side data processing system that parses the extracted and stored data and returns the filtered results.
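The filter-then-render flow of the display system could be sketched roughly as below. The story fields (`headline`, `symbol`, `source`, `summary`) and the two filter criteria are assumptions chosen for illustration, and Python is used here only for consistency with the scraping side; the posting does not say which language the display system must use.

```python
from html import escape


def filter_stories(stories, symbols=None, sources=None):
    """Keep stories matching every criterion the user has set (None means no filter)."""
    selected = []
    for story in stories:
        if symbols and story.get("symbol") not in symbols:
            continue
        if sources and story.get("source") not in sources:
            continue
        selected.append(story)
    return selected


def render_feed(stories):
    """Render stories as an HTML list, escaping all stored text before output."""
    parts = ['<ul class="news-feed">']
    for story in stories:
        parts.append(
            "<li><strong>{}</strong> ({}): {}</li>".format(
                escape(story["headline"]),
                escape(story["symbol"]),
                escape(story["summary"]),
            )
        )
    parts.append("</ul>")
    return "\n".join(parts)
```

Escaping at render time matters here because the stored text comes from scraped third-party pages and from a manual entry form.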
PROJECT CORE TECHNOLOGIES AND LANGUAGES
HTML, Bootstrap, JavaScript, CSS, MySQL, PHP, Python, Scrapy, Beautiful Soup, and Swagger will be the core technologies and languages used in this project. If the freelance developer on this project has any additional technologies they would like to utilize, please ask the project manager to ensure they are compatible with the project server environment.
OTHER PROJECT NOTES
A background in finance, investing, stock trading, or stock analysis is helpful with this project, but is not required.
The freelance developer will be required to sign a Non-Disclosure Agreement (NDA) and a Source Code License Agreement (SCLA). This is non-negotiable.
This project has a fixed, non-negotiable budget. For an experienced and capable developer, the entire project should take no more than a few hours to complete; the budget is therefore capped at $300 (6 hours of work at $50 an hour).
BID SUBMISSIONS
Please communicate the following information when bidding on this project:
1.) A list of your qualifications including years of experience and the programming languages you are familiar with. Also communicate if you have any type of finance, investing, stock trading, or stock analysis experience or interest.
2.) Whether you are able to complete this project exactly as specified, and any issues that you foresee with the construction of this project.
3.) The technologies and languages you plan on using to accomplish the project tasks.
4.) The timeframe in which you could have the project completed.
POINT OF CONTACT
The project manager is open to discussing any ideas or suggestions from the freelance developer that will lead to efficiency improvements. The project manager will also communicate directly with the developer via videoconference before any work is initiated to ensure that the entire scope of the project is understood. If the developer needs further detailed specifications or has any questions regarding this project, please contact the project manager, Michael Clark, directly.
Thank you for your interest and time in considering this project.
Budget: $300
Posted On: January 03, 2021 23:14 UTC
Category: Data Extraction
Skills: Web Scraper, Python, Scrapy, Beautiful Soup, Data Scraping, Data Extraction, Bootstrap, JavaScript, MySQL, Swagger, CSS, HTML
Country: United States
Project ID: 3126607