About the Team:
The Evaluation Foundations team is part of the Scaling Foundations team in Embodied AI and is responsible for:
- Building the largest, most diverse, and highest-quality datasets for model training and evaluation, now and into the future.
- Utilizing our resources in the most efficient way, including storage and compute but especially GPUs.
- Minimizing iteration time and providing the highest-quality introspection tools and evaluation signal to shorten the experimental path toward the best model.

Why Join Us?
- Scale up introspection and evaluation tools that work with billions of examples and enable utilization of our large datasets, delivering the maximum value to the model through every additional example.
- Work with cutting-edge technology and a collaborative, high-impact team of AI/ML engineers, data scientists, and engineers who are passionate about leveraging advanced AI techniques to drive innovation for L2, L3, and L4 applications.
- Contribute to the safety, reliability, and scalability of next-generation autonomous vehicles.

Role:
As a Principal Engineer in the Embodied AI Scaling Foundations organization, you will be a senior technical leader owning the technical vision and architecture for how we measure and visualize AV model performance. As a full-stack engineer, you will focus on the entire lifecycle of designing, implementing, scaling, and iterating on state-of-the-art tools that the entire Embodied AI organization and adjacent teams in the GM AV organization use. You will level up these foundational tools by:
- Providing high-quality evaluation signal and introspection tools to shorten the experimental path toward the best model.
- Owning visualization and metrics presentation for our model evaluation loop.
- Enabling teams to deeply introspect model behavior and reach actionable next steps in as few steps as possible.
You will collaborate closely with modeling and data scaling teams working on our Compound AI driving models to define how we evaluate them at scale with model-based metrics and end-to-end simulation, and to accelerate how engineers use that signal and feed it back into data, training, and launch decisions. Your work ensures that evaluation, data, and infra form a cohesive evaluation flywheel: better signal → better data and training decisions → better models → better signal.
Job Type
Full-time
Career Level
Principal
Education Level
Ph.D. or professional degree
Number of Employees
5,001-10,000 employees