I am looking for a software developer to build a plagiarism detection application. The tool should analyze written work uploaded by users and check it for duplicated or improperly attributed content by comparing it against material from online sources. Specifically, the program must use web scraping to build an internal database of published works that can be referenced when new submissions are received. It then needs to parse each uploaded file, extract its text, and compare that text to the archive on a word-for-word or phrasal basis in order to identify matching segments and generate an originality report.
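To make the phrasal comparison concrete, here is a rough sketch of one possible approach: splitting text into overlapping word n-grams ("shingles") and measuring how many of a submission's shingles also appear in an archived source. This is only an illustration, not a prescribed design. The function names, the in-memory dictionary standing in for the scraped archive, the 3-word phrase length, and the 20% flagging threshold are all assumptions made for the example.

```python
import re


def shingles(text, n=3):
    """Return the set of lowercased n-word phrases (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def percent_match(submission, source, n=3):
    """Percentage of the submission's shingles that also occur in source."""
    sub = shingles(submission, n)
    if not sub:
        # Submission is shorter than n words, so there is nothing to compare.
        return 0.0
    return 100.0 * len(sub & shingles(source, n)) / len(sub)


def originality_report(submission, archive, n=3, threshold=20.0):
    """Return (source name, percent match) pairs at or above threshold.

    archive is a hypothetical stand-in for the scraped database:
    a dict mapping a source name to its extracted text.
    Results are sorted from highest to lowest match.
    """
    scores = {name: percent_match(submission, text, n)
              for name, text in archive.items()}
    flagged = [(name, score) for name, score in scores.items()
               if score >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    archive = {
        "source_a": "The quick brown fox jumps over the lazy dog.",
        "source_b": "An entirely unrelated sentence about databases.",
    }
    print(originality_report("A quick brown fox jumps over a fence.", archive))
```

In a sketch like this, the phrase length `n` and the `threshold` would be the natural knobs to expose as sensitivity settings: a shorter phrase length or a lower threshold flags more material. A production system would likely need something more scalable than pairwise set comparison against every archived document.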
Additional requirements include support for multiple file formats such as DOC, PDF and ODT, configurable sensitivity settings so that different degrees of overlap can be flagged, and a user-friendly interface that displays scan results and the percentage of matching text. The successful bidder will have a strong portfolio demonstrating expertise in NLP, web scraping, database management and frontend design sufficient to build a robust plagiarism detection application as outlined. Experience developing similar tools and APIs would be beneficial.