1001 Freelance Projects -- Assignment: Web Crawling of Medication Data

Latest Projects from Freelance Marketplaces

Today is: 02-Aug-2025 14:32 GMT

View this project in detail (Note: you will be redirected to external marketplace)
Project title:	Assignment: Web Crawling of Medication Data
Posted by:	External project from PeoplePerHour
Started:	18-Nov-2024 04:24 GMT
Description:	Here's a refined assignment outline for your software freelancer project, including necessary clarifications and added details for crawling medication data:

---

### Assignment: Web Crawling of Medication Data for Inhaled Drugs (ATC Code R03)

#### Project Overview

Develop a web scraping tool to extract data on inhaled medications (ATC code R03) from the official medical and pharmaceutical websites of Spain, Argentina, and Chile. The extracted data should be formatted into a CSV file with specific columns as defined in the attached Excel file.

#### Target Websites

- Spain: [CIMA](https://cima.aemps.es/cima/publico/listaclinica.html) (AEMPS - Agencia Española de Medicamentos y Productos Sanitarios)
- Argentina: Relevant national pharmaceutical website(s)
- Chile: Relevant national pharmaceutical website(s)

Note: The freelancer must identify and ensure access to reliable sources for Argentina and Chile.

#### Objectives

- Extract all inhaled medication with ATC code R03.
- Populate a CSV file with the specified structure (refer to the attached Excel sheet for column specifications).

#### Data Structure

Each entry in the CSV should include, but is not limited to, the following columns:

- `Name`
- `Drug Format`
- `Dosage`
- `Drug Class`
- `Drug Type`
- `ATC Code`
- `Total Puffs per Inhaler`
- `Puffs a Day`
- `Link` to the source page
- `Image URL` of the product
- `Video URL` (if available)
- Additional data fields based on information structure on the source websites

Refer to the attached Excel file for the exact layout and any additional required fields.
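As a rough illustration of the objectives and column list above, the sketch below writes only R03 entries to a CSV with the listed columns. The column order, the `is_r03` prefix check, and the `write_rows` helper are assumptions made for this example; the attached Excel template, which is not reproduced here, remains the authoritative layout.

```python
import csv
import io

# Column names copied from the Data Structure list. Their order here is an
# assumption; the attached Excel template defines the real layout.
CSV_COLUMNS = [
    "Name", "Drug Format", "Dosage", "Drug Class", "Drug Type", "ATC Code",
    "Total Puffs per Inhaler", "Puffs a Day", "Link", "Image URL", "Video URL",
]


def is_r03(atc_code: str) -> bool:
    """Return True when an ATC code belongs to group R03."""
    return atc_code.strip().upper().startswith("R03")


def write_rows(rows, out) -> int:
    """Write the R03 rows to `out` as CSV and return how many were written."""
    writer = csv.DictWriter(
        out,
        fieldnames=CSV_COLUMNS,
        restval="",             # missing columns become empty cells
        extrasaction="ignore",  # unknown keys are dropped, not an error
    )
    writer.writeheader()
    written = 0
    for row in rows:
        if is_r03(row.get("ATC Code", "")):
            writer.writerow(row)
            written += 1
    return written


if __name__ == "__main__":
    buffer = io.StringIO()
    sample = [
        {"Name": "Example inhaler", "ATC Code": "R03BB05"},
        {"Name": "Unrelated product", "ATC Code": "N02BE01"},
    ]
    write_rows(sample, buffer)
    print(buffer.getvalue())
```

Filtering on the code prefix keeps the check independent of any one site's page structure, but it relies on each source exposing the ATC code in the first place.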
#### Example of Data Format

```json
{
  "dosage": 322,
  "drugFormat": "HFA",
  "drugClass": "ICS",
  "drugType": "ACLIDINIO BROMURO",
  "id": "1a478549-31ca-d61f-a9da-35cdc4616230",
  "image": "https://cima.aemps.es/cima/fotos/thumbnails/formafarmac/12778002/12778002_formafarmac.jpg",
  "link": "https://cima.aemps.es/cima/publico/detalle.html?nregistro=12778002",
  "name": "EKLIRA GENUAIR 322 MICROGRAMOS POLVO PARA INHALACION",
  "puffsADay": 0,
  "shortName": "ACLIDINIO BROMURO",
  "totalPuffsPerInhaler": 60,
  "video": "https://zorgatlasweb.nl/inhalatie/Films_Nederlands/Dosisaerosol.mp4",
  "atc_code": "R03BB05"
}
```

#### Requirements

1. Web Scraping Framework: Use Python libraries such as `BeautifulSoup` and `Selenium` for scraping. Alternatively, any proven web scraping tool is acceptable.
2. Data Cleaning: Ensure consistency in data formatting and structure.
3. Output File: The final output must be a CSV file with columns as defined in the attached Excel template.
4. Documentation: Provide a clear README file explaining:
   - How to run the scraping tool.
   - Requirements for execution (e.g., Python version, dependencies).
   - Any challenges or limitations encountered during the project.

#### Important Notes

- The tool should handle potential changes in website structure with minimal adjustments.
- Data accuracy and integrity are paramount; cross-verify results to ensure reliability.
- Respect website terms of service and avoid overloading sites with requests (implement proper rate limiting).
- Ensure proper handling of errors and unexpected issues (e.g., CAPTCHA).

#### Deadline

Specify the project completion date and any milestones for progress updates.

#### Deliverables

1. A Python script or package for web crawling.
2. A CSV file with structured data as per the given specifications.
3. README file with instructions for running the script and an overview of the project.

#### Contact

Reach out for clarification or further questions related to project details.
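The example record and the rate-limiting and error-handling notes above could be combined roughly as follows. This is a minimal sketch under stated assumptions: `FIELD_MAP` pairs the JSON keys with CSV columns purely by name (the posting does not define this mapping, and `id` and `shortName` have no listed column, so they are dropped), and `RateLimiter`, `fetch_politely`, and the caller-supplied `fetch` callable are hypothetical helpers, not part of any library.

```python
import time

# Assumed pairing of example JSON keys with the CSV columns, inferred by name.
FIELD_MAP = {
    "name": "Name",
    "drugFormat": "Drug Format",
    "dosage": "Dosage",
    "drugClass": "Drug Class",
    "drugType": "Drug Type",
    "atc_code": "ATC Code",
    "totalPuffsPerInhaler": "Total Puffs per Inhaler",
    "puffsADay": "Puffs a Day",
    "link": "Link",
    "image": "Image URL",
    "video": "Video URL",
}


def to_csv_row(record: dict) -> dict:
    """Rename JSON keys to CSV column names; missing values become ''."""
    return {column: record.get(key, "") for key, column in FIELD_MAP.items()}


class RateLimiter:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Pause until the interval has elapsed; return the seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


def fetch_politely(urls, fetch, limiter, max_retries=2):
    """Call fetch(url) per URL, pausing between calls and retrying on OSError.

    `fetch` is supplied by the caller (for example a wrapper around
    urllib.request.urlopen). URLs that still fail map to None.
    """
    results = {}
    for url in urls:
        results[url] = None
        for _attempt in range(max_retries + 1):
            limiter.wait()
            try:
                results[url] = fetch(url)
                break
            except OSError:
                continue  # network-level failure: try again after the pause
    return results
```

Injecting `clock` and `sleep` keeps the limiter testable without real delays. A CAPTCHA page usually arrives as a normal response rather than an exception, so detecting it would need a separate check on the returned content.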
---

Make sure the assignment template is aligned with your needs and modify as needed.
Project ID:	3409093
Project category:
Project budget:

Project	Started
Assistance with Government Contracts for Real Estate Category: Contracts, Legal, Legal Research, Project Management, Research Budget: $12.5 - $25 USD	02-Aug-2025 10:23 GMT
Comprehensive Flipkart Account Management and Setup Category: Bengali Translator, Data Entry, Excel, Writing Budget: ₹750 - ₹1250 INR	02-Aug-2025 10:04 GMT
Automated Data Download .NET Category: .NET, ASP.NET, Automation, C#, Programming, Data Processing, Database Management, HTML, JSON, SQL, Web Scraping Budget: $250 - $750 USD	02-Aug-2025 10:03 GMT
Promote Kutchi Crafts - Website Development Category: SEO, Web Development, Web Design Budget: ₹12500 - ₹37500 INR	02-Aug-2025 10:03 GMT
WE'RE LOOKING FOR INFLUENCERS! Category: Artificial Intelligence, Copywriting, Entrepreneurship, Influencer Marketing, Motivational Speaking, Social Media Copy, Social Media Marketing Budget: $30 - $250 USD	02-Aug-2025 10:03 GMT
Event Search Automation System Development Category: Bash, Data Extraction, PHP, Python, Scripting, Software Architecture, UNIX, Web Scraping Budget: $10 - $30 USD	02-Aug-2025 10:02 GMT
Modern Brand Guideline & Visual Identity Creation -- 2 Category: Brochure Design, Corporate Identity, Covers & Packaging, Graphic Design, Logo Design Budget: $75 - $150 USD	02-Aug-2025 10:01 GMT
Mobile-Only Work Opportunity Category: Customer Service, Data Entry, Internet Research, Time Management Budget: ₹600 - ₹1500 INR	02-Aug-2025 10:01 GMT
Voice Recording Project for AI Training (French) [France] Category: AI (Artificial Intelligence) HW / SW, Audio Production, Audio Services, French Translator, Voice Acting, Voice Over, Voice Talent Budget: $90 USD	02-Aug-2025 10:01 GMT
Dragon Boat Training & Regatta Manager Category: Conflict Resolution, Event Planning, Logistics, Marketing, Project Management, Public Relations, Risk Management, Time Management Budget: €5000 - €10000 EUR	02-Aug-2025 10:00 GMT
Modern Social Media Graphics Category: Adobe Creative Cloud, Adobe Illustrator, Photoshop, Graphic Design, Illustration, Logo Design, Social Media Marketing Budget: ₹12500 - ₹37500 INR	02-Aug-2025 10:00 GMT
Google Ads & Social Media Ads Manager Category: Advertising, Analytics, Facebook Marketing, Google Ads, Instagram Marketing, Internet Marketing, Lead Generation, Social Media Marketing Budget: $15 - $25 AUD	02-Aug-2025 10:00 GMT
Full-Service Marketing Agency for Global Reach Category: Advertising, Content Marketing, Creative Writing, Graphic Design, Logo Design, Marketing, Slogans, Translation Budget: ₹12500 - ₹37500 INR	02-Aug-2025 09:59 GMT
Social Media Promo Video Editing Category: A / V Editing, Adobe Premiere Pro, Final Cut Pro, Video Editing, Video Production, Video Services, Videography Budget: ₹750 - ₹1250 INR	02-Aug-2025 09:58 GMT
Urgent: College-Level Content Proofreading Category: Academic Research, Academic Writing, Content Development, Content Writing, Editing, English (US) Translator, English Grammar, Mathematics, Proofreading, Technical Writing Budget: ₹600 - ₹1500 INR	02-Aug-2025 09:57 GMT
