Research Engineer - Post Training

Turing
Palo Alto, CA
5h
$170,000 - $220,000

About The Position

About Turing

Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises looking to deploy advanced AI systems. Turing accelerates frontier research with high-quality data, specialized talent, and training pipelines that advance thinking, reasoning, coding, multimodality, and STEM. For enterprises, Turing builds proprietary intelligence systems that integrate AI into mission-critical workflows, unlock transformative outcomes, and drive lasting competitive advantage. Recognized by Forbes, The Information, and Fast Company among the world’s top innovators, Turing’s leadership team includes AI technologists from Meta, Google, Microsoft, Apple, Amazon, McKinsey, Bain, Stanford, Caltech, and MIT. Learn more at www.turing.com

About the Role

Data is the lifeblood of advanced AI. Our R&D Research Engineers build the foundational systems that generate, refine, and evaluate high-quality data at unprecedented scale, directly fueling the next generation of reasoning and coding agents. This role sits at the intersection of data scaling, post-training, and reinforcement learning. You’ll work on core algorithms and pipelines that transform pre-trained models into steerable, versatile systems capable of solving real-world problems. From designing RL environments to crafting data mixtures and reward models, your work will power both cutting-edge research and production-grade AI.

Why This Role Matters

This is a rare opportunity to work at the frontier of model training, owning the full loop from data generation and reward modeling to reinforcement learning and evaluation. Your work will directly accelerate our mission to develop AI systems that reason, code, and generate new knowledge.

Requirements

  • Strong software engineering and systems-building skills.
  • Deep understanding of machine learning and fine-tuning of large language models (LLMs).
  • Hands-on experience improving model behavior through data-driven methods and reinforcement learning (SFT, PPO, DPO, or similar).
  • Familiarity with large-scale data generation and evaluation pipelines.
  • Experience developing benchmarks and evaluation metrics for reasoning, coding, or multi-agent systems.
  • Proven ability to design or optimize models for complex challenges such as multi-modality, long-context reasoning, or multi-agent orchestration.

Nice To Haves

  • Experience building or scaling infrastructure for large-scale RL or post-training.

Responsibilities

  • Design and implement large-scale data pipelines for RL, SFT, and post-training workflows to create and refine high-quality datasets.
  • Develop and iterate on simulated or interactive environments to train, evaluate, and stress-test reasoning and coding agents.
  • Advance reinforcement learning algorithms and generalizable reward models to improve model reasoning, coding, and decision-making.
  • Define and improve data quality and evaluation metrics to ensure models are learning from the best possible signals.
  • Implement scalable model evaluation frameworks to measure progress in reasoning, code generation, and agentic capabilities.
  • Collaborate cross-functionally with research, data, multimodal, and product teams to bring cutting-edge research into real-world impact.

Benefits

  • Amazing work culture (super collaborative and supportive work environment; 5 days a week)
  • Awesome colleagues (surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)
  • Competitive compensation
  • Flexible working hours

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000 employees
