About The Position

We are seeking independent Finance and Insurance Specialists to contribute their expertise to an AI benchmark evaluation project. As AI models increasingly generate professional-grade financial and insurance deliverables, their accuracy depends on robust, expert-crafted training data. The objective of this project is to produce high-quality evaluation tasks, strong prompts, and clear, well-structured rubrics that yield clean, reliable data for model training.

Project Deliverables & Scope

You will operate autonomously to design complex evaluation frameworks and provide structured training data. Expected deliverables are detailed under Responsibilities below.

Requirements

  • Demonstrable professional expertise within the Finance and Insurance sectors, with a deep understanding of industry standards, terminology, and high-level deliverables.
  • Strong writing and prompt generation skills, with the ability to design highly realistic, complex task scenarios for AI evaluation.
  • Proficiency in rubric generation, specifically the ability to create objective, non-ambiguous scoring criteria that leave no room for subjective interpretation.
  • A meticulous, detail-oriented approach to generating clean, reliable data for system benchmarking.
  • Access to a secure computer and a high-speed internet connection.

Responsibilities

  • Task & Prompt Creation: Generating realistic, high-quality prompts that compel the AI model to produce complex, professional-grade deliverables specific to the finance and insurance sectors.
  • Rubric Development: Writing clear, well-structured evaluation rubrics with criteria that are highly specific, non-ambiguous, and easy to score.
  • Benchmark Evaluation Data Generation: Producing clean, reliable training data that directly aids in the evaluation and refinement of AI models handling industry-specific tasks.
  • Quality Assurance & Fact-Checking: Ensuring all generated tasks and scoring criteria reflect strict, real-world industry standards and regulatory realities.
© 2024 Teal Labs, Inc