About The Position

About the Team: The AI Validation Platform team owns the cloud-agnostic, reliable, and cost-efficient platform that powers GM’s AV efforts. We’re proud to serve as the infrastructure platform for teams developing autonomous vehicles (L3/L4/L5). Our platform supports the simulated validation of state-of-the-art (SOTA) machine learning models, with a focus on performance, availability, concurrency, and scalability. We enable rapid innovation and development by prioritizing high-impact, ML-centric use cases.

About the Role: We are seeking a Senior ML Infrastructure Engineer to help build and scale robust Compute platforms for Simulation workflows. In this role, you will focus on scaling, driving efficiency, and achieving high utilization of cutting-edge GPUs, while also leveling up the platform’s reliability. The successful candidate will have experience building and running scalable distributed systems. They will rapidly test and promote ideas, have strong problem-solving skills, and demonstrate a bias for action. You will play a key role in shaping the architecture, roadmap, and user experience of a robust service supporting our AI Validation / Simulation needs. The ideal candidate brings experience in designing distributed systems, strong problem-solving skills, and a get-it-done attitude. This is a high-impact opportunity to influence the future of AI infrastructure at GM.

Requirements

  • 4+ years of industry experience, with a focus on high-performance backend services.
  • Strong expertise in Go or a similar programming language.
  • Experience working with cloud platforms such as GCP, Azure, or AWS.
  • Experience in delivering cross-functional initiatives.
  • Strong communication skills and a proven ability to drive cross-functional initiatives.
  • Ability to thrive in a dynamic, multi-tasking environment with ever-evolving priorities.

Nice To Haves

  • Hands-on experience with cloud VM services such as Google Compute Engine.
  • Experience with hardware-in-the-loop validation systems.
  • Experience with high-performance computing (HPC).
  • Experience working with or designing interfaces and clients for developer workflows.
  • Familiarity with telemetry and other feedback loops to inform product improvements.
  • Familiarity with hardware acceleration (GPUs) and optimizations.

Responsibilities

  • Design and implement core platform backend software components.
  • Collaborate with Simulation engineers, ML engineers, and researchers to understand critical workflows, translate them into platform requirements, and deliver incremental value.
  • Lead technical decision-making on Compute architecture, cloud capacity provisioning, caching, and auto-scaling mechanisms.
  • Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization.
  • Proactively research and integrate frameworks, hardware accelerators, and distributed computing techniques.
  • Lead large-scale technical initiatives across GM’s ML infrastructure.
  • Raise the engineering bar through technical leadership and by establishing best practices.

Benefits

  • From day one, we're looking out for your well-being–at work and at home–so you can focus on realizing your ambitions.
  • Learn how GM supports a rewarding career that rewards you personally by visiting Total Rewards resources.

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Education Level

No Education Listed

Number of Employees

5,001-10,000 employees
