Member of Technical Staff, Model Evaluation

Mirendil
United States, CA
Remote

About The Position

Mirendil is a tech-first company focused on removing the core bottlenecks that hold back step-change acceleration across science and technology. Our first goal is to democratize frontier AI R&D across scientific disciplines. We believe accelerating scientific discovery is one of the most powerful ways to improve the future of humanity, and that AI will play a central role in making that possible. We are building a frontier AI research company and training our own models end-to-end. Our work spans model training, reinforcement learning, reasoning systems, and infrastructure for large-scale experiments. Our team includes researchers and engineers from Anthropic, Google DeepMind, xAI, OpenAI, Microsoft, Apple, and MIT.

Requirements

  • Build the evaluation infrastructure that tells us whether our models are getting better in ways we care about.
  • Own the frameworks, pipelines, and tooling that measure model behavior across capabilities.

Responsibilities

  • Design and build evaluation frameworks that measure model capabilities along realistic axes, beyond standard benchmarks.
  • Build automated eval pipelines and regression-detection systems that run continuously and surface signal quickly.
  • Develop agent-assisted workflows for humans to efficiently inspect model behavior.
  • Instrument training runs with observability tooling so researchers can understand what's changing in model behavior, and why.
  • Partner with post-training and RL teams to close the loop between eval signal and training decisions.

Benefits

  • Competitive benefits