Head of Data Quality - RL Gyms

Turing
San Francisco, CA

About The Position

Turing is looking for a Head of Data Quality, RL Environments to build and lead the quality function for all reinforcement learning (RL) environment and trajectory data used to train and evaluate models at frontier AI labs. You will manage a team of Data Quality Leads who operate like researchers in a frontier AI lab, designing tasks, stress tests, and evaluation protocols for complex RL environments (simulated, real-world, and tool-based). Your role is to set the bar for what “high-quality RL environment data” means and to ensure our environments, trajectories, rewards, and evaluations are robust, diverse, and aligned with cutting-edge GenAI and RL research.

You’ll bring together a deep understanding of RL environments, agents, and trajectories; prior experience with ML/AI, RL, and GenAI systems; and strong organizational and people leadership to create a research-grade quality organization for RL environments and agent interaction data.

Requirements

  • Bachelor’s degree in Computer Science, Mathematics, Engineering, or a related field; or equivalent practical experience.
  • Strong technical background, including experience with: Python as a primary language; RL or simulation frameworks (e.g., OpenAI Gym / Gymnasium-style APIs, custom simulators, or game engines).
  • 7+ years total experience in software engineering, ML/AI, RL, simulation, or related fields.
  • 3+ years managing technical teams (e.g., research, data science, RL / simulation, data quality, or engineering).
  • Hands-on experience with ML/AI systems, with a strong preference for: RL, RLHF/RLAIF, or agent-like systems (tool-using, web, or embodied agents); environment or benchmark design, or large-scale agent evaluation.
  • Prior exposure to data annotation / human feedback / human evaluation processes, including: designing rubrics and tasks for human raters; working with preference data or trajectory labeling.
  • High-level understanding of modern GenAI and RL/agent trends, such as: LLM-based agents interacting with tools or environments; reward shaping, curriculum learning, and preference modeling; safety, alignment, and robustness for agents in complex environments.
  • Strong grasp of data and environment quality principles: environment correctness, coverage, and diversity; reward design pitfalls and reward hacking detection; human evaluation quality, calibration, and inter-rater reliability.
  • Ability to read ML/RL/AI research papers and translate them into: new environment or task requirements; evaluation and benchmarking strategies; concrete annotation and quality-control workflows.
  • Excellent communication and leadership skills; comfortable setting direction and making tradeoff decisions in ambiguous, fast-changing domains.

Nice To Haves

  • Graduate degree (MS/PhD) in Computer Science, Machine Learning, Robotics, or related field.
  • Experience working in or closely with a research lab or frontier AI organization focused on RL, agents, or aligned systems.
  • Direct experience with: designing RL benchmarks, simulators, or environment suites; RLHF/RLAIF pipelines or large-scale human feedback collection; multi-agent or multi-task environments.
  • Familiarity with game engines or simulation platforms (e.g., Unity, Unreal, MuJoCo, Isaac, Habitat, or similar).
  • Background in statistics and experimental design, especially for: human feedback experiments; A/B testing of environment or reward variants.
  • Experience in high-growth startup or similarly dynamic environments.

Responsibilities

  • Own the RL Environment Data Quality Vision & Strategy
  • Lead & Develop Data Quality Leads
  • Design Research-Grade Evaluation & Quality Systems for RL Environments
  • Translate AI & RL Research Trends into Environment and Data Requirements
  • Partner Across Operations, Product, and Customers
  • Build Tools, Processes, and Documentation

Benefits

  • Amazing work culture (Super collaborative & supportive work environment; 5 days a week)
  • Awesome colleagues (Surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)
  • Competitive compensation
  • Flexible working hours

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Number of Employees

1,001-5,000 employees
