ML Engineer

Techire Ai
San Francisco, CA
$250,000 - $400,000 · Hybrid

About The Position

Define how large-scale AI systems for scientific discovery are built, trained, and run in production. This team is building autonomous AI scientists that run full research loops: ingesting large bodies of literature, forming hypotheses, designing experiments, and producing traceable outputs already used across biotech and pharma.

The challenge isn't just model capability; it's building the systems that allow these models to be trained, evaluated, and deployed reliably at scale. You'll sit at the intersection of model training and systems, owning the infrastructure, pipelines, and experimentation platforms that make long-horizon reasoning systems possible. This is not research in isolation; it's building the engine that research runs on. You'll work closely with the wider team, translating ambiguous scientific problems into systems that can be trained, iterated on, and deployed in real-world environments.

The company comes from one of the earliest groups working seriously on AI for science, including early language agents and AI-generated biological discoveries. They're now pushing further with systems capable of reasoning across thousands of papers and large-scale analyses, and moving toward pre-training their own models end-to-end. The platform already operates at scale, serving tens of thousands of users and millions of queries, and is actively used in scientific workflows today.

Requirements

  • Experience building and scaling ML systems in production
  • Strong background across model training, data pipelines, and deployment
  • Experience with large-scale training or distributed systems
  • Fluency in frameworks like PyTorch, JAX, or similar
  • Strong engineering fundamentals and systems thinking
  • Ability to operate amid ambiguity and own problems end-to-end

Responsibilities

  • Building and scaling training pipelines for large-scale LLM systems
  • Developing experimentation platforms that enable fast, reliable iteration
  • Designing data pipelines and systems for observability and reproducibility
  • Improving how training runs are orchestrated, monitored, and debugged
  • Supporting model deployment and inference for complex reasoning systems
  • Working closely with researchers to translate ideas into production systems