I need a freelancer to prepare benchmark questions and answers for testing a custom LLM’s reasoning ability.
Scope:

Question Set:
- Collect 500–600 LLM benchmark questions with correct answers.
- Focus areas: logical, mathematical, commonsense, analytical, and multi-step reasoning.
- Deliver as JSON or CSV.

Python Script:
- Load questions and send them to an LLM (I'll handle API integration).
- Compare model answers to correct ones.
- Output a simple accuracy report.

Requirements:
- Knowledge of LLMs, reasoning datasets, or NLP is preferred.
- Clean, documented code.
- Use only open or original questions.
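To make the script deliverable concrete, here is a rough sketch of the load, compare, and report steps. It is only an illustration under assumptions, not a required design: the field names `category`, `question`, and `answer`, the helper names `load_questions`, `normalize`, and `evaluate`, and the exact-match comparison after light normalization are all hypothetical choices. The `ask_model` callable stands in for the API integration I will provide.

```python
import csv
import io
import json
from typing import Callable, Dict, List


def load_questions(text: str, fmt: str) -> List[Dict[str, str]]:
    """Parse questions from JSON (a list of objects) or CSV text.

    Assumed fields per record: "question", "answer", optional "category".
    """
    if fmt == "json":
        return json.loads(text)
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"unsupported format: {fmt}")


def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing periods."""
    return " ".join(str(answer).strip().lower().split()).rstrip(".")


def evaluate(
    questions: List[Dict[str, str]],
    ask_model: Callable[[str], str],
) -> Dict[str, float]:
    """Return accuracy per category plus an "overall" entry."""
    totals: Dict[str, tuple] = {}
    for q in questions:
        category = q.get("category") or "uncategorized"
        # Exact match after normalization is an assumption; free-form
        # answers may need a more lenient comparison.
        correct = normalize(ask_model(q["question"])) == normalize(q["answer"])
        seen, right = totals.get(category, (0, 0))
        totals[category] = (seen + 1, right + int(correct))

    report = {cat: right / seen for cat, (seen, right) in totals.items()}
    total_seen = sum(seen for seen, _ in totals.values())
    total_right = sum(right for _, right in totals.values())
    report["overall"] = total_right / total_seen if total_seen else 0.0
    return report


if __name__ == "__main__":
    # Tiny made-up sample; the real set would hold 500-600 questions.
    sample = (
        "category,question,answer\n"
        "mathematical,What is 2+3?,5\n"
        "logical,Is every square a rectangle?,Yes\n"
    )

    def stub_model(prompt: str) -> str:
        # Placeholder for the real LLM call.
        return "5" if "2+3" in prompt else "no"

    for name, accuracy in evaluate(load_questions(sample, "csv"), stub_model).items():
        print(f"{name}: {accuracy:.1%}")
```

Passing the model as a plain callable keeps the scoring code independent of any particular LLM API, so the integration can be swapped in later without touching the loader or the report.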
25-Jul-2025 22:00 GMT