We are sourcing independent Finance and Insurance Specialists to provide their expertise for an AI benchmark evaluation project. As AI models increasingly generate professional-grade financial and insurance deliverables, their accuracy relies on robust, expert-crafted training data. The objective of this project is to produce high-quality evaluation tasks, strong prompts, and clear, well-structured rubrics that yield clean, reliable data for model training.

Project Deliverables & Scope

You will operate autonomously to design complex evaluation frameworks and provide structured training data. Expected deliverables include:

- Task & Prompt Creation: Generating realistic, high-quality prompts that compel the AI model to produce complex, professional-grade deliverables specific to the finance and insurance sectors.
- Rubric Development: Writing clear, well-structured evaluation rubrics with criteria that are highly specific, unambiguous, and easy to score.
- Benchmark Evaluation Data Generation: Producing clean, reliable training data that directly supports the evaluation and refinement of AI models handling industry-specific tasks.
- Quality Assurance & Fact-Checking: Ensuring all generated tasks and scoring criteria reflect strict, real-world industry standards and regulatory realities.
Education Level: No Education Listed
Number of Employees: 501-1,000 employees