Machine Learning Infrastructure Engineer

Mind Robotics, Palo Alto, CA

About The Position

We’re hiring Machine Learning Infrastructure Engineers to build the systems that make large-scale model training actually work. This role is for people who enjoy operating at scale—owning distributed training, core ML infrastructure, and fast iteration loops across hundreds of GPUs. If you’ve built or run large training systems in PyTorch or JAX and care about things like sharding, parallelism, and performance, you’ll feel at home here. You’ll work closely with researchers to remove friction, improve reliability, and make it easier to train, evaluate, and deploy models that show up in real systems.

Requirements

  • Experience building or running large training systems in PyTorch or JAX.
  • Understanding of sharding, parallelism, and performance optimization.
  • Experience operating training infrastructure at scale, on the order of hundreds of GPUs.

Responsibilities

  • Build the systems that make large-scale model training work.
  • Own distributed training.
  • Own core ML infrastructure.
  • Own fast iteration loops across hundreds of GPUs.
  • Work closely with researchers to remove friction.
  • Improve reliability of ML systems.
  • Make it easier to train, evaluate, and deploy models.