Senior ML/AI Software Engineer – Evaluation Insights

GM • Sunnyvale, CA

About The Position

The Role: GM’s autonomy stack generates far more numerical data than anyone can review manually. The Evaluation Insights team builds tools that turn this data into a single, trustworthy view of performance, accelerating model iteration and improving vehicle safety. As a Senior ML/AI Software Engineer on the Sim Insights team, you will help stakeholders, from AV developers and system engineers to leadership, reach accurate, actionable conclusions about the autonomy stack’s performance. You will apply your expertise in robotic systems, statistical analysis, and data visualization to engineer tools that answer a wide range of questions about AV stack performance.

About the Organization: The Evaluation team is dedicated to creating, maintaining, and evolving the evaluation ecosystem that underpins GM’s pursuit of safe, high-performing, and scalable driverless technology. The team delivers trusted metrics, automated workflows, and scalable tools that enable data-driven decision-making at every stage of AV development. Evaluation team members collaborate closely with Simulation, Motion, Perception, and Release teams, acting as system-level integrators and arbiters of end-to-end AV system quality. The organization’s remit includes development of test scenario libraries, deployment of continuous evaluation pipelines, and ownership of critical risk assessment and release gating processes. The team treats road, data mining, training, and metrics as equal use cases for our analytics framework and evaluation goals.

By joining this team, you will guide the evolution of core evaluation platforms and frameworks, champion the interpretation and communication of system-level results, and play a central role in accelerating GM’s progress toward safe, validated AV deployment at scale.

Requirements

3+ years applied experience in data analysis, ML evaluation, or autonomy analytics, working with large-scale datasets and statistical methods.
Proficiency with Pandas, NumPy, SciPy, and plotting/visualization libraries.
Bachelor’s or higher degree in Computer Science, Data Science, Mechanical or Aerospace Engineering, or equivalent practical experience.

Nice To Haves

A strong understanding of how to visualize quantitative information effectively and transparently.
The ability to decompose a multi-dimensional space into something consumable.
Experience evaluating robotic systems, including sensor data (camera, lidar, radar) and time-series analysis.
A strong curiosity to question anomalous data and root-cause discrepancies.

Responsibilities

Design and implement analysis algorithms that summarize, aggregate, and cluster metrics produced by simulations of the autonomy stack.
Build and maintain GM’s primary autonomy evaluation dashboards and reports that provide clear, explainable insights to engineering and leadership, including trend analysis, drift detection, and scenario coverage.
Leverage vision-language models (VLMs) and large language models (LLMs) to classify autonomy performance, mine critical scenarios, and prioritize validation efforts, integrating human-in-the-loop review where appropriate.
Maintain a high technical standard through architectural design, code reviews, and by following software-engineering best practices.
Interface with cross-org partners to articulate requirements, resolve handoff issues, and share best practices.
