Applied ML Validation Manager

General Motors, Sunnyvale, CA

About The Position

As an Applied ML Validation Manager on the Software Validation team within the AV organization, you will lead a team focused on building and operating behavior critics and human benchmarking capabilities for ML-driven autonomy systems. Your team will turn subjective human expectations about safe, comfortable, and intuitive driving into rigorous, scalable evaluation frameworks that directly inform model development and release decisions. You will partner closely with autonomy, simulation, safety, and product teams to define how behavior is judged against human drivers and to integrate behavior critic signals into validation pipelines, continuous release, and long-term performance monitoring.

The Autonomous Vehicle (AV) organization is dedicated to advancing the development of autonomous vehicles through cutting-edge simulation technologies and novel iterative development processes. The Software Validation team focuses on unlocking software launches and continuous release decisions via simulation-led verification and validation strategies, prototypes, and protocols. Our collaborative environment fosters innovation and excellence, allowing us to push the boundaries of what is possible in autonomous vehicle testing.

Requirements

  • 8+ years of experience and an MS or PhD in Computer Science, Machine Learning, Robotics, Software Engineering, Data Science, or a related field.
  • 2+ years of people management experience leading engineering, validation, or applied ML teams.
  • Strong programming and data skills in Python and common analysis/ML tooling (e.g., PyTorch).
  • Demonstrated experience designing and operating evaluation/validation pipelines for complex ML systems.
  • Proven ability to define, implement, and track metrics that capture system quality, reliability, safety, or user experience.

Nice To Haves

  • Experience with autonomous driving, robotics, or other safety-critical domains, especially in validation, safety, or systems engineering roles.
  • Demonstrated background with simulation-based validation, including VLM critics, human benchmarking, and scalable evaluation for ML or autonomy systems.
  • Hands-on experience with agentic workflows used to accelerate analyses, automate documentation, or orchestrate complex data and metric pipelines.
  • Track record of building or scaling technical teams and tooling in fast-evolving domains, especially focused on evaluation, automation, and ML observability.

Responsibilities

  • Lead and grow an Applied ML validation team focused on behavior evaluation and human benchmarking for autonomy ML systems.
  • Define the strategy and roadmap for evaluating ML behavior against human-like driving expectations across simulation, replay, and on-road environments.
  • Design, implement, and operate behavior critic frameworks that assess model actions and trajectories, turning qualitative human feedback into structured labels, metrics, and scorecards.
  • Develop and scale human benchmarking programs, including rater guidelines, calibration, and quality controls, to compare ML system performance against expert and typical human drivers.
  • Partner closely with autonomy, simulation, safety, and product teams to integrate behavior critic and human benchmarking outputs into training, offline validation, release gating, and reporting.