About The Position

General Motors is a global leader in advanced driver assistance, with Super Cruise hands-free technology in more than 500,000 equipped vehicles on the road and over 700 million hands-free miles driven, demonstrating that automation can be trusted, intuitive, and helpful while reaching everyday drivers at unprecedented scale. Within GM AV, the Model Deployment & Inference Solutions team deploys machine learning models from training frameworks (e.g., PyTorch) onto autonomous-vehicle hardware. Our mission is twofold: build the ML deployment platform that makes model rollouts fast and predictable, and optimize models so they meet the real-time latency and memory budgets required to run on-vehicle. Our work sits on the critical path for GM's publicly committed launch of eyes-off (hands-free, eyes-free) autonomous driving in 2028 on the Cadillac Escalade IQ, and we're hiring engineers to help deliver the next generation of safe, delightful personal autonomous-vehicle experiences.

As an early-career Engineer on the Model Deployment & Inference Solutions team, you'll contribute across both sides of our mission: building the ML deployment platform and optimizing models for on-vehicle inference. You'll work with and learn from senior engineers on real production deployments, platform features, and model-optimization workflows that ship to GM's Super Cruise fleet at scale, with structured mentorship and a clear onboarding plan. You'll also collaborate closely with our sister teams (kernels, compiler, reduced precision, and parity) on the end-to-end path that takes trained models from research frameworks to ultra-efficient, safety-critical inference on the car.

This is an early-career / new-graduate role designed for candidates who have recently completed, or will complete, their degree by June 2026.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, ECE, or a related technical field, recently completed or completing by Spring 2026. (Degree must be completed before your start date.)
  • Strong computer science fundamentals (e.g., data structures, algorithms, operating systems, computer architecture) and solid coding skills in Python and/or C++, demonstrated through coursework, internships, or substantial projects.
  • Hands-on experience in AI/ML (e.g., machine learning, deep learning, computer vision, NLP, or ML systems) via classes, research, internships, or personal projects.
  • Depth in at least one of: computer architecture, operating systems, distributed systems, or compilers.
  • Demonstrated software-engineering experience (internships, coursework, open-source, research code, or competitions) showing good judgment around reliability, correctness, and clean abstractions.
  • Experience with—or strong interest in—using coding assistants/agents (e.g., Cursor, Claude Code, GitHub Copilot) as part of your workflow.
  • Ability to work effectively in collaborative, cross-functional teams and communicate clearly, both in writing and verbally, including explaining technical work to partners.

Nice To Haves

  • Internship, research, or advanced coursework in ML systems, ML compilers, GPU programming (CUDA, OpenAI Triton), inference optimization, or distributed training/serving infrastructure.
  • Familiarity with PyTorch and modern ML compiler/runtime stacks (e.g., torch.compile, TensorRT, ONNX, Triton Inference Server, vLLM, or equivalent).
  • Exposure to model optimization (quantization, pruning, distillation) or GPU profiling tools (Nsight Systems, Nsight Compute, PyTorch Profiler).
  • Familiarity with workflow/ML platforms such as Airflow, Temporal, Flyte, Ray, or Kubeflow.
  • Experience building agentic or LLM-powered tools or workflows.
  • Open-source contributions related to PyTorch, TensorRT, vLLM, OpenAI Triton, or similar projects.
  • Coursework, projects, or publications touching ML systems (e.g., MLSys, OSDI, ASPLOS, HPCA, NeurIPS systems track).
  • Familiarity with a systems language (e.g., C++) and development in a Linux environment.

Responsibilities

  • Contribute production code across the ML deployment platform, model-optimization workflows, and inference benchmarking/profiling infrastructure.
  • Pair with senior engineers on deployment workflows, performance investigations, model-optimization experiments (e.g., quantization, pruning, distillation), and platform tooling.
  • Build, test, and maintain platform tools (e.g., validators, performance probes, parity and sensitivity analyzers, agentic specialists) with technical guidance and code review support.
  • Investigate and help root-cause production deployment or performance issues; learn and apply the diagnostic playbook for compiler, kernel, runtime, and parity bugs.
  • Collaborate with cross-functional teams across the AV organization (kernels, compiler, reduced-precision, parity, and model-development groups) to plan and execute model deployments to the AV stack, working under the guidance of senior engineers.
  • Participate in code reviews, design discussions, and technical documentation to ensure reliability, correctness, and clear abstractions in a large-scale codebase.
  • Learn and follow secure coding, safety, and compliance practices required for on-vehicle autonomous driving software.

Benefits

  • Medical
  • Dental
  • Vision
  • Health Savings Account
  • Flexible Spending Accounts
  • Retirement savings plan
  • Sickness and accident benefits
  • Life insurance
  • Paid vacation and holidays
  • Tuition assistance programs
  • Employee assistance program
  • GM vehicle discounts