ML Infra Engineer

Humble Robotics, San Francisco, CA

About The Position

We're looking for an ML infrastructure engineer to help design, build, and scale the foundational systems we need to realize our ambitious vision. You'll work on tooling and infrastructure that supports every stage of the ML training flywheel, and you'll be an important voice in the technical and organizational decisions that shape our work. Across vehicle compute, data collection, dataset curation, and large-scale model training and deployment, you'll help us build reliable, performant, and secure infrastructure that every team at Humble Robotics can count on.

The ideal candidate is a first-principles thinker who is comfortable being a broad generalist. You'll work on every layer of the stack to make the software iteration loop as fast and efficient as possible. We're a small team, so your input, experience, and knowledge will play a critical role in shaping every system we build, operate, and depend on to achieve our mission. It's fun here, and we're doing cool stuff.

Requirements

  • Experience building and operating high-availability web services on cloud infrastructure
  • Experience with infrastructure-as-code and configuration management tools (we use Terraform and Ansible)
  • Experience building and maintaining CI/CD pipelines and managing deployments
  • Fluency in security fundamentals, including Linux hardening, network security, and cryptographic principles
  • Hands-on experience with cluster scheduling systems for running large-scale batch computation
  • Comfort reading, writing, and extending non-trivial code (not just scripting)
  • Eligibility to work in the United States

Nice To Haves

  • Hands-on experience managing large, high-performance ML training clusters
  • Working knowledge of distributed training frameworks and high-performance networking for ML workloads
  • Prior infrastructure experience at an early-stage autonomous vehicle or robotics company
  • Comfort operating as an early team member: high ownership, low ego, fast iteration

Responsibilities

  • Work on data collection infrastructure that moves sensor data reliably and efficiently from our vehicles into our ML platform
  • Develop batch compute pipelines for cataloging, exploring, and curating raw data into high-quality training sets
  • Design and scale distributed ML training on our GPU clusters
  • Take ownership of performance, observability, efficiency, and security across the full pipeline
  • Partner with the ML team to understand their workflows and translate them into reliable infrastructure that accelerates their work

Benefits

  • Base salary
  • Benefits
  • Equity compensation
© 2026 Teal Labs, Inc