About The Position

About the Team: The Evaluation Foundations team is part of the Scaling Foundations team in Embodied AI and is responsible for:

  • Building the largest, most diverse, and highest-quality datasets for model training and evaluation, now and into the future.
  • Using our resources in the most efficient way, including storage and compute but especially GPUs.
  • Minimizing iteration time and providing the highest-quality introspection tools and evaluation signal to shorten the experimental path toward the best model.

Why Join Us?

  • Scale up introspection and evaluation tools that work with billions of examples and enable utilization of our large datasets, delivering the maximum value to the model through every additional example.
  • Work with cutting-edge technology and a collaborative, high-impact team of AI/ML engineers, data scientists, and engineers who are passionate about leveraging advanced AI techniques to drive innovation for L2, L3, and L4 applications.
  • Contribute to the safety, reliability, and scalability of next-generation autonomous vehicles.

Role: As a Principal Engineer in the Embodied AI Scaling Foundations organization, you will be a senior technical leader owning the technical vision and architecture for how we measure and visualize AV model performance. As a full-stack engineer, you will focus on the entire lifecycle of designing, implementing, scaling, and iterating on state-of-the-art tools used by the entire Embodied AI organization and adjacent teams in the GM AV organization. You will level up these foundational tools by:

  • Providing high-quality evaluation signal and introspection tools to shorten the experimental path toward the best model.
  • Owning visualization and metrics presentation for our model evaluation loop.
  • Enabling teams to deeply introspect model behavior and reach actionable next steps in as few steps as possible.

You will collaborate closely with modeling and data scaling teams working on our Compound AI driving models to define how we evaluate them at scale with model-based metrics and end-to-end simulation, and to accelerate how engineers use that signal and feed it back into data, training, and launch decisions. Your work ensures evaluation, data, and infra form a cohesive evaluation flywheel: better signal → better data and training decisions → better models → better signal.

Requirements

  • Familiarity and experience with at least some of the following key technologies:
      • Frontend: React/TypeScript, WebGL/WebGPU (for 3D sensor visualization), Streamlit, Jupyter notebooks.
      • Backend: Python, high-throughput data streaming, Parquet for efficient data handling.
      • Data/Infra: Spark for stream processing, BigQuery, Kubernetes, and specialized AV data formats (Rosbags, Protobuf).
  • Bachelor’s, Master’s or PhD degree in Computer Science or related field
  • Experience accelerating applied research in the wild and maintaining best practices while working on tight deadlines.
  • Proven experience in building large scale systems that are performant and used by large distributed teams.
  • Excellent communication skills to effectively collaborate with diverse teams and stakeholders.

Nice To Haves

  • Previous experience in Robotics or Autonomous Driving.

Responsibilities

  • Own the architecture and roadmap for evaluation and introspection systems across AV models, ensuring consistent metrics, pipelines, and visualization surfaces.
  • Partner with the broader Embodied AI team and integrate core metrics, scoring functions, and scenario/slice definitions used for regression, launch gating, and safety analysis into evaluation tooling owned by the team.
  • Build and scale evaluation pipelines on modern cloud / GPU infrastructure, with strong observability and cost efficiency.
  • Lead development of visualization and introspection tools that let teams quickly drill from aggregates down to concrete examples, failure modes, and regressions.
  • Partner with Data Consumption/Mining/Quality and Infra Foundations to turn evaluation insights into data and training actions (scenario mining, dataset definitions, training recipes, model selection).
  • Collaborate with simulation and on‑road validation teams to align offline metrics and tools with online behavior and improve correlation between evaluation and real‑world outcomes.
  • Mentor engineers across Scaling Foundations, set best practices for evaluation and introspection, and act as a domain expert for evaluation‑related tooling designs and reviews.

Benefits

  • GM offers a variety of health and wellbeing benefit programs. Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts and more.
  • Company Vehicle: Upon successful completion of a motor vehicle report review, you will be eligible to participate in a company vehicle evaluation program, through which you will be assigned a General Motors vehicle to drive and evaluate. Note: program participants are required to purchase/lease a qualifying GM vehicle every four years unless one of a limited number of exceptions applies.

What This Job Offers

Job Type: Full-time

Career Level: Principal

Education Level: Ph.D. or professional degree

Number of Employees: 5,001-10,000 employees
