About The Position

This role focuses on designing rigorous, research-grade computational problems that assess how effectively AI systems can leverage real scientific software tools to solve complex challenges. Unlike traditional annotation roles, this position requires creating original, graduate-level problems rooted in real-world scientific workflows. You will iteratively refine these problems through calibration against state-of-the-art AI models, ensuring the right balance of difficulty, depth, and reasoning complexity.

Requirements

  • Graduate-level expertise (MS or PhD preferred) in a relevant STEM field
  • Hands-on experience using scientific software libraries for real research problems
  • Strong Python programming skills, including building computational workflows and validators
  • Ability to design challenging problems that require deep reasoning rather than surface-level solutions
  • Familiarity with edge cases, limitations, and practical challenges of scientific tools
  • Demonstrated proficiency with at least one relevant scientific library (via research, open-source work, or industry experience)
  • Ability to work independently and iterate based on feedback
  • Comfort working in Linux/terminal environments and remote compute setups
  • Availability of at least 15–20 hours per week

Nice To Haves

  • Experience across multiple domains or tools
  • Background in evaluation frameworks or benchmarking
  • Experience in teaching, pedagogy, or problem-set design
  • Familiarity with reproducible research practices and containerized environments

Responsibilities

  • Design advanced computational problems requiring the use of domain-specific scientific software
  • Create tasks that test both precise execution (multi-step workflows, simulations) and strategic reasoning (experiment design, inference from partial data)
  • Develop problem setups, solution pathways, and validation mechanisms
  • Calibrate and refine tasks based on model performance to achieve target difficulty levels
  • Ensure problems emphasize reasoning strategy over brute-force computation

Benefits

  • Competitive compensation based on expertise and domain specialization
  • Weekly payments via supported global payment platforms
© 2026 Teal Labs, Inc