Staff Machine Learning Engineer, Training Runtime Performance

Nuro•Mountain View, CA

125d•$235,030 - $352,290

About The Position

Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world’s most scalable driver, combining cutting-edge AI with automotive-grade hardware. Nuro licenses its core technology, the Nuro Driver™, to support a wide range of applications, from robotaxis and commercial fleets to personally owned vehicles. With technology proven over years of self-driving deployments, Nuro gives automakers and mobility platforms a clear path to AVs at commercial scale, empowering a safer, richer, and more connected future.

Requirements

B.S./M.S./Ph.D. in Computer Science, Electrical Engineering, or related technical field (or equivalent experience).
4+ years of professional experience in ML infrastructure, distributed training, or ML systems engineering, scaling models on multi-node, multi-accelerator clusters.
Understanding of training, evaluation, and distillation workflows for billion-parameter models.
Expert-level knowledge in distributed systems and (remote) Python.
Strong skills in profiling, debugging, and optimizing quantized workloads.
Experience with ML compilers and strategies to reduce startup overhead.
Familiarity with model distillation and efficient inference workflows.

Nice To Haves

Previous contributions to open source ML infra projects or research publications in ML systems.
Hands-on experience with Foundation Model infrastructure.
High proficiency in C++, distributed systems, and ML framework internals (e.g., NCCL, Horovod, DeepSpeed, Ray).

Responsibilities

Collaborate with ML practitioners and other infrastructure teams to understand their needs and integrate optimized input pipelines seamlessly into their workflows.
Detect, diagnose, and resolve performance bottlenecks across training, eval, and model distillation workflows.
Optimize training performance and resource utilization, and ensure consistent, reproducible model training outcomes.
Optimize input data pipelines to increase runtime goodput, ensuring accelerators maximize their 'time on task' and minimize idle cycles.
Champion best practices for robust, reproducible, and debuggable ML experimentation.

Benefits

Annual performance bonus
Equity
Competitive benefits package

What This Job Offers

Career Level: Senior
Education Level: Master's degree
Number of Employees: 501-1,000 employees
