Director, Compute Infrastructure

Wayve, Sunnyvale, CA

About The Position

At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.

Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems. Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.

In our fast-paced environment, big problems ignite us: we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.

At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact. Make Wayve the experience that defines your career!

As Director of Compute Infrastructure, you will lead Wayve’s global GPU platform and own the systems that power AI training and inference at scale, including scheduling, fleet management, reliability, and cost efficiency.

Requirements

  • Proven leadership of large-scale infrastructure/platform teams
  • Experience with GPU-based AI workloads at scale
  • Strong knowledge of scheduling/orchestration (e.g. Kubernetes, Slurm)
  • Distributed systems and reliability expertise
  • Experience defining SLAs/SLOs and operating production systems
  • Strong communication and stakeholder management

Nice To Haves

  • Multi-cloud or hybrid infrastructure experience
  • GPU performance, networking, or cluster topology knowledge
  • Background in AI, robotics, or HPC
  • Experience managing infrastructure cost and vendor strategy

Responsibilities

  • Define GPU compute strategy aligned with research and business goals
  • Own global GPU fleet capacity, utilisation, and cost
  • Build and scale scheduling/orchestration systems for training and inference
  • Deliver reliable, multi-region, data-aware execution
  • Establish SLAs, observability, and operational excellence
  • Drive cost transparency and platform efficiency
  • Partner with research, autonomy, and platform teams
  • Build and lead high-performing engineering teams