ML Infrastructure Architect

Motional
$205,000 - $282,000

About The Position

We’re seeking a Machine Learning Infrastructure Architect to lead the technical vision and architecture for the systems that power our entire machine learning lifecycle, from training dataset generation to model training, evaluation, and deployment. This is a mission-critical leadership role within the ML Infra team, shaping the infrastructure that supports terabytes of daily sensor data and petabyte-scale datasets essential for autonomous vehicle development. This role is ideal for a senior technologist with a deep background in ML systems and data architecture who thrives on building for scale, performance, and engineering excellence.

Requirements

  • 15+ years of meaningful software engineering experience, including significant architecture-level ownership in ML infrastructure.
  • Proven experience leading the design of ML platforms that serve large-scale training and inference workloads.
  • Deep technical fluency in distributed storage, high-volume data pipelines, and data compression strategies for ML use cases.
  • Strong knowledge of Linux systems, Python, and C++ or similar performance-oriented languages.
  • Experience operating in hybrid environments: bare metal, HPC, and public cloud (AWS/GCP/Azure).
  • Comfortable owning cross-org initiatives and influencing system-level design across autonomy, simulation, and platform teams.
  • Prior work in robotics, autonomous vehicles, or safety-critical domains strongly preferred.

Nice To Haves

  • Experience building or leading infrastructure at a top-tier ML/AI company or AV program.
  • Background contributing to open-source ML infrastructure projects.

Responsibilities

  • Own the architecture of Motional’s ML infrastructure, enabling scalable storage, curation, and access for 100+ engineers and researchers across autonomy teams.
  • Design and evolve infrastructure to support petabyte-scale machine learning workflows, including multimodal perception data, synthetic data, simulation output, and continuous training pipelines.
  • Architect high-throughput systems for distributed training on large GPU clusters, driving significant improvements in utilization, throughput, and job efficiency.
  • Establish robust data governance, observability, and retention strategies to ensure compliance, reproducibility, and long-term data utility.
  • Collaborate cross-functionally with ML engineers, autonomy researchers, data engineers, and DevOps to ensure tight integration between infrastructure and user workflows.
  • Lead technical strategy and roadmap development for the ML Infra team, incorporating cutting-edge tools and best practices from industry and open source.
  • Mentor and influence engineers across teams, promoting engineering excellence in distributed systems, ML platforms, and autonomy-scale data management.

Benefits

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • 401k with a company match
  • Health savings accounts
  • Life insurance
  • Pet insurance