Research Engineer - Post-training & RL

Techire Ai
San Francisco, CA
Onsite

About The Position

Want to build the simulated worlds that test what frontier models are really capable of? This is a chance to join a team advancing the science of post-training and scalable evaluation by building reinforcement learning environments that push reasoning, planning, and long-horizon behaviour to their limits. Instead of static benchmarks, you’ll create dynamic simulations that measure real intelligence, not just accuracy. You’ll design new post-training algorithms (RLHF, DPO, GRPO and beyond), develop richer reward models that move past exact-match scoring, and build evaluation frameworks that define how next-generation AI is trained, aligned, and understood. The work combines deep research with hands-on implementation, from writing papers to seeing your methods deployed in live systems. It’s ideal for researchers who care about bridging academic insight and practical impact, helping AI progress beyond metrics that no longer tell the whole story.

Requirements

  • Research experience in post-training, reinforcement learning, or evaluation for LLMs.
  • Strong understanding of transformer models and experimental design.
  • Publication record at leading venues (NeurIPS, ICLR, ICML, ACL, EMNLP).
  • PhD or equivalent research experience in CS, ML, NLP, or RL.

Responsibilities

  • Build simulated worlds to test frontier models.
  • Advance the science of post-training and scalable evaluation.
  • Build reinforcement learning environments that push reasoning, planning, and long-horizon behaviour.
  • Create dynamic simulations that measure real intelligence.
  • Design new post-training algorithms (RLHF, DPO, GRPO and beyond).
  • Develop richer reward models that move past exact-match scoring.
  • Build evaluation frameworks that define how next-generation AI is trained, aligned, and understood.
  • Combine deep research with hands-on implementation.
  • Write papers and deploy methods in live systems.

Benefits

  • 401(k)
  • Unlimited PTO
  • Relocation assistance
  • Sponsorship available