About The Position

Mirendil is a tech-first company focused on solving core bottlenecks that unlock step-change acceleration across science and technology. Our first goal is to democratize frontier AI R&D across scientific disciplines. We believe accelerating scientific discovery is one of the most powerful ways to improve the future of humanity, and that AI will play a central role in making that possible. We are building a frontier AI research company and training our own models end-to-end. Our work spans areas such as model training, reinforcement learning, reasoning systems, and infrastructure for large-scale experiments. Our team includes researchers and engineers from Anthropic, Google DeepMind, xAI, OpenAI, Microsoft, Apple, and MIT.

Requirements

  • Build the data systems and execution environments that power reinforcement learning at Mirendil.
  • Own those systems end-to-end.

Responsibilities

  • Build and automate data collection pipelines for complex, long-horizon RL tasks.
  • Build robust systems to identify and prevent reward hacking.
  • Build scalable, sandboxed execution environments for realistic tasks that may involve multiple agents, nodes, and users.
  • Design systems to estimate the influence of training environments on production model behavior.
  • Collaborate with teams across the stack to identify potential axes of improvement in production model behavior, and develop training environments that push along these axes.

Benefits

  • Competitive benefits