Research Scientist: Pretraining

Generalist | San Francisco, CA

About The Position

You will build the base intelligence layer for robotics. We train large-scale robot foundation models from massive multimodal datasets spanning video, proprioception, action traces, language, and more. You will design and run the core large-scale training efforts that give our models fundamentally new general capabilities across embodiments, tasks, and environments. You will “live and breathe” all forms of robot data.

Requirements

  • Deep experience training large transformer or diffusion models at scale (e.g., generative models such as language, audio, or video models)
  • Led or significantly contributed to multi-node, multi-GPU distributed training efforts
  • Worked on scaling laws, optimization dynamics, and large-model failure modes
  • Strong PyTorch fundamentals and comfort debugging at every layer of the stack
  • Care about both empirical rigor and raw iteration speed
  • Excited about building general-purpose robot intelligence from first principles

Responsibilities

  • Designing and executing large-scale pretraining runs for robot foundation models (transformer- and diffusion-based architectures)
  • Defining model architectures, objectives, and training curricula across multimodal robotic data (vision, action, state, language)
  • Developing scalable data mixtures and sampling strategies across petabyte-scale datasets
  • Guiding data collection operations in new directions and sourcing new datasets
  • Running ablations to understand scaling laws, data quality effects, and architecture tradeoffs
  • Collaborating closely with ML Infra and Systems to push cluster utilization, throughput, and reliability
  • Turning raw robotic interaction data into generalizable model capabilities