About The Position

We are seeking independent Healthcare and Social Assistance Specialists to lend their expertise to an AI benchmark evaluation project. As AI models increasingly generate professional-grade medical summaries, patient care workflows, and social service documentation, their accuracy depends on robust, expert-crafted training data. The objective of this project is the autonomous production of high-quality evaluation tasks, strong prompts, and clear, well-structured rubrics that yield clean, reliable data for model training.

Project Deliverables & Scope

You will operate autonomously to design complex evaluation frameworks and provide structured training data. Expected deliverables are detailed under Responsibilities below.

Requirements

  • Demonstrable professional expertise within the healthcare, nursing, social work, or allied health sectors, with a deep understanding of medical terminology, patient care standards, and social assistance frameworks.
  • Strong writing and prompt generation skills, with the ability to design highly realistic, complex healthcare task scenarios for AI evaluation.
  • Proficiency in rubric generation, specifically the ability to create objective, unambiguous scoring criteria that leave no room for subjective interpretation.
  • A meticulous, detail-oriented approach to fact-checking medical literature, clinical protocols, and social service policies to generate reliable data for system benchmarking.

Responsibilities

  • Task & Prompt Creation: Generating realistic, high-quality prompts that compel the AI model to produce complex, professional-grade deliverables specific to healthcare operations, clinical guidelines, and social assistance programs.
  • Rubric Development: Writing clear, well-structured evaluation rubrics with criteria that are highly specific, unambiguous, and easy to score.
  • Benchmark Evaluation Data Generation: Producing clean, reliable training data that directly aids in the evaluation and refinement of AI models handling complex health and social care-related tasks.
  • Quality Assurance & Fact-Checking: Ensuring all generated tasks and scoring criteria reflect strict, real-world clinical standards, medical accuracy, and social service regulatory guidelines.
© 2024 Teal Labs, Inc.