Senior Software Engineer, Autonomy Evaluation

General Motors · Sunnyvale, CA
Hybrid

About The Position

General Motors is a global leader in advanced driver assistance. With more than 500,000 Super Cruise–equipped vehicles on the road and over 700 million hands-free miles driven, GM is proving that automation can be trusted, intuitive, and helpful. GM has the global reach to bring cutting-edge advances to everyday drivers at unprecedented scale. Join us to help deliver the next generation of safe and delightful personal autonomous vehicle experiences.

The Evaluation team builds and evolves the evaluation ecosystem that powers the development and scaling of GM’s autonomous driving technology. We develop metrics, automated workflows, and analysis approaches that enable data-driven decisions across AV development and verification. Partnering with Autonomy, Simulation, Systems, and Safety teams, we act as system-level integrators and arbiters of end-to-end AV quality. We own large-scale test scenario libraries, continuous evaluation pipelines, and critical risk-assessment and release-gating components, treating road testing, data mining, training, and metrics as first-class use cases in a unified analytics framework.

By joining this team, you will help shape GM’s core evaluation platforms, turn system-level results into clear feedback for engineering and leadership, and help accelerate validated AV deployment at scale.

Requirements

  • 5+ years of applied experience with robotics or autonomous systems software (e.g., sensors, perception, prediction, planning, or control), data analysis, ML evaluation, or autonomy analytics.
  • 3+ years evaluating dynamic systems using numerical and/or ML approaches, including time-series data, state derivatives, dynamics, and interconnected subsystems.
  • Strong proficiency developing Python in production team environments, including testing, performance, and code review.
  • Proficiency with Pandas, NumPy, SciPy, and plotting/visualization libraries for large-scale data analysis and reporting.
  • Comfort working with C++ codebases, including reading, debugging, and instrumenting core algorithms.
  • A strong curiosity to question anomalous data and systematically root-cause discrepancies.
  • Demonstrated technical leadership, including driving architectural decisions, influencing cross-team designs, and owning complex features or services end-to-end.
  • Bachelor’s, Master’s, or PhD in Computer Science, Robotics, Mechanical or Aerospace Engineering, Machine Learning, Data Science, or a related field, or equivalent practical experience.

Nice To Haves

  • Experience in autonomous driving or field robotics, including visualizing and interpreting results from simulation and field experiments.
  • Experience evaluating robotics or AV systems using sensor data (e.g., camera, lidar, radar) and large-scale time-series analysis.
  • Strong intuition for data visualization and the ability to decompose high-dimensional metrics into clear, trustworthy, and consumable views for technical and non-technical audiences.
  • Familiarity with statistical modeling, experimental design, and hypothesis testing for autonomy or simulation evaluation; fluency with Pandas, NumPy, SciPy, and visualization tools.
  • Proficiency in C++ and SQL; experience shaping logging, data schemas, and evaluation pipelines for large-scale autonomy testing and performance monitoring.
  • Experience working with ROS or similar robotics/IPC frameworks, log pipelines, and large-scale experiment databases or evaluation platforms.
  • Prior development experience with computational geometry, linear algebra, PyTorch, and ML techniques applied to perception, prediction, planning, or control.
  • Background in modeling agent interaction and contributing to release gating and safety decisions for autonomy systems.
  • Experience leveraging AI-assisted development and analytics tools to improve productivity and evaluation coverage.

Responsibilities

  • Architect and implement metrics and analyses to introspect autonomous driving software performance at interfaces across the autonomy stack; partner closely with autonomy developers and systems engineers.
  • Design and implement analysis algorithms that summarize, aggregate, and cluster metrics produced by simulations and on-road runs of the autonomy stack.
  • Propose and develop new statistical and ML methods to quantify performance and identify patterns of system and subsystem behavior across diverse scenes and operational domains.
  • Develop and apply methods to introspect the operation of ML components in the autonomy stack, including evaluation of perception, prediction, and planning models.
  • Build and maintain autonomy evaluation dashboards and interactive reports that provide clear, explainable insights (e.g., trend analysis, drift detection, scenario coverage) for development, verification, and leadership.
  • Leverage vision-language models (VLMs) and large language models (LLMs), where appropriate, to classify autonomy performance, identify critical scenarios, and prioritize validation efforts, integrating human-in-the-loop review where needed.
  • Maintain a high technical standard through thoughtful system design, code reviews, testing, observability, and adherence to software-engineering best practices.
  • Interface with cross-organizational partners to articulate requirements, resolve handoff issues, and share best practices around evaluation, metrics, and experiment design.

Benefits

  • Medical
  • Dental
  • Vision
  • Health Savings Account
  • Flexible Spending Accounts
  • Retirement savings plan
  • Sickness and accident benefits
  • Life insurance
  • Paid vacation & holidays