We need an experienced Python developer with expertise in web scraping, machine learning, and automation to review, optimize, and run an existing domain crawler. The goal is to improve efficiency, accuracy, and reliability.
This project will be conducted in two stages:
Stage 1: Successfully Run the Existing Crawler
Your first task is to set up and successfully run the crawler to validate that you can:
- Install and configure all dependencies
- Execute the crawler on a dataset of domain names
- Generate valid output
- Troubleshoot any initial setup issues
Once this is successfully demonstrated, we will move to Stage 2.
Stage 2: Propose a Better Solution or Manage Ongoing Runs
Depending on your expertise and analysis of the current crawler, you will either:
A) Propose & Implement Improvements (Rewrite or Optimize the Crawler)
- Identify inefficiencies and suggest a better architecture
- Improve performance & accuracy (e.g., async scraping, multiprocessing, better ML model)
- Ensure scalability if running on large datasets
- Refactor code for long-term maintainability
OR
B) Run & Maintain the Crawler Monthly
- Execute the crawler on a scheduled basis
- Monitor accuracy and adjust as needed
- Fix bugs & ensure stability
- Implement new features & improvements over time
- Provide regular reports on data quality and insights
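To illustrate what "async scraping" under option A could look like, here is a minimal sketch of bounded-concurrency crawling with asyncio. It is not taken from the existing crawler: `fetch_status`, `crawl`, and `max_concurrency` are hypothetical names, and the fetch is a stand-in that simulates latency rather than making a real HTTP request (a real version would likely call aiohttp here).

```python
import asyncio


async def fetch_status(domain: str) -> str:
    """Hypothetical stand-in for a real HTTP fetch (e.g. via aiohttp)."""
    await asyncio.sleep(0.01)  # simulate network latency
    # Assumption for the demo only: ".invalid" domains never resolve.
    return "unreachable" if domain.endswith(".invalid") else "ok"


async def crawl(domains, max_concurrency=5, fetch=fetch_status):
    """Fetch all domains concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(domain):
        async with semaphore:  # cap the number of in-flight requests
            try:
                return domain, await fetch(domain)
            except Exception as exc:  # one bad domain must not stop the run
                return domain, f"error: {exc}"

    results = await asyncio.gather(*(worker(d) for d in domains))
    return dict(results)


if __name__ == "__main__":
    print(asyncio.run(crawl(["example.com", "example.invalid"])))
```

The semaphore keeps a large domain list from opening thousands of connections at once, and catching exceptions per domain lets a single failure be recorded without aborting the batch.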
About the current crawler
The current crawler is built in Python and classifies domain names based on their content (e.g., parked, active, expired). It utilizes:
- Requests / Aiohttp (Web scraping)
- Selenium (Browser automation)
- BeautifulSoup (HTML parsing)
- Langdetect / NLP (Language detection & text classification)
- XGBoost (Machine learning classification)
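For orientation, the general shape of content-based classification can be sketched as below. This is a simplified illustration only, not the existing crawler's logic: the hint phrases, `extract_features`, and `classify` are all hypothetical, a regular expression stands in for BeautifulSoup, and plain keyword rules stand in for the XGBoost model.

```python
import re

# Hypothetical hint phrases; the real crawler's features are not shown here.
PARKED_HINTS = ("domain is for sale", "buy this domain", "parked")
EXPIRED_HINTS = ("domain has expired", "domain expired")


def extract_features(html: str) -> dict:
    """Strip tags crudely and count hint phrases in the remaining text."""
    text = re.sub(r"<[^>]+>", " ", html).lower()
    return {
        "word_count": len(text.split()),
        "parked_hits": sum(hint in text for hint in PARKED_HINTS),
        "expired_hits": sum(hint in text for hint in EXPIRED_HINTS),
    }


def classify(features: dict) -> str:
    """Rule-based stand-in for a trained classifier."""
    if features["expired_hits"] > 0:
        return "expired"
    if features["parked_hits"] > 0:
        return "parked"
    return "active"


if __name__ == "__main__":
    page = "<html><body>This domain is for sale</body></html>"
    print(classify(extract_features(page)))
```

In a trained-model setup, a feature dictionary like this would typically be turned into a numeric vector and passed to the classifier instead of the hand-written rules.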
Ideal Candidate:
- Strong Python skills (async programming, OOP)
- Web scraping & automation expertise (requests, Selenium, BeautifulSoup)
- Machine learning experience (XGBoost, Scikit-learn, NLP)
- Familiarity with databases (SQL/NoSQL) for storing results
- Cloud/DevOps experience (if needed for large-scale deployment)
- Ability to work independently and ensure high-quality results
- High level of English proficiency
Project Timeline & Budget:
- Timeline: Deliverables expected within 2-4 weeks
- Budget: Open to discussion based on experience & estimated work required