About The Position

Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way.

The Community You Will Join: At Airbnb, our mission is to create a world where anyone can belong anywhere. The AI Compute team is responsible for delivering and operating the GPU platform and infrastructure that powers AI at Airbnb. Airbnb is a member of the Cloud Native Computing Foundation’s end user community and regularly meets with peer companies to discuss cloud native engineering challenges at scale.

The Difference You Will Make: As a Senior Software Engineer on the AI Compute team, you will serve as a technical leader, overseeing the entire lifecycle of Airbnb’s Kubernetes-based GPU platform. You will be instrumental in boosting the efficiency and effectiveness of the Cloud Infrastructure organization by improving all facets of the Machine Learning (ML) engineering experience. This platform is the critical foundation that supports all AI Compute features related to security, networking, developer experience, and operational efficiency. Your team will provide the reliability, scalability, security, and developer experience necessary for ML teams to deliver first-class experiences to Airbnb Guests and Hosts around the world.

Requirements

  • BS, MS or Ph.D. in computer science or related field, or equivalent work experience
  • 5+ years of relevant work experience in infrastructure
  • 2+ years of experience with a public cloud provider (AWS, GCP, Azure) and its infrastructure-as-a-service offering (e.g. EC2).
  • Experience setting technical direction, planning, and successfully executing on large projects spanning multiple teams
  • Kubernetes experience is required.
  • Passion for efficiency, availability, quality, and developer experience.

Nice To Haves

  • ML infrastructure experience (LLM fundamentals, tuning, optimization) is preferred.

Responsibilities

  • Provide technical leadership on high-impact projects
  • Influence and coach a distributed team of engineers
  • Drive reliability, cost efficiency, and capability enhancements for the GPU fleet
  • Facilitate cross-team alignment on goals, outcomes, and timelines
  • Manage project priorities, deadlines, and deliverables
  • Contribute to and execute the multi-year strategy for Airbnb’s AI Compute Platform
  • Design, develop, test, deploy, maintain, and enhance the Airbnb AI Compute Platform

Benefits

  • This role may also be eligible for bonus, equity, benefits, and Employee Travel Credits.