SRE Manager, ML Operations - Apple Ads

Apple - New York City, NY

About The Position

At Apple, we believe in the power of technology to enrich people's lives. Everything we build is designed to empower people, including our advertising platform. We deliver ads in a way that benefits both customers and advertisers, helping people discover content, supporting creators, and protecting and respecting everyone’s privacy. Our technology makes advertising possible on the App Store, Apple News, Stocks, and Apple TV. We help developers and marketers of all sizes drive app discovery across the App Store. Our display ads on Apple News and Stocks let advertisers promote their products alongside trusted content in a brand-safe environment, while supporting publishers and journalists. Sponsorship integrations and experiences in live sports on Apple TV help advertisers connect with captivated audiences. Everything we do is with the unwavering commitment to privacy you expect from Apple. Because when advertising is done right, it benefits everyone.

We are seeking a senior engineering leader and experienced professional to lead our Site Reliability Engineering team. This team is responsible for the Ad Serving infrastructure that serves as the front door of Apple Ads. You will be an accomplished builder and leader of teams looking to take on your next challenge. You know SRE and you know what it will take to run services at Apple scale with a high degree of operational precision. This role will position you to help craft the future of how we build and run our services on a global scale. You will have the technical skills to go deep and retain the ability to focus on higher-level business and product goals. We hire high-quality leaders and engineers with a diverse set of experiences and abilities for positions at Apple.

Requirements

  • 10+ years of experience with large-scale distributed systems
  • Demonstrable success leading engineering teams - ideally SRE or Production Engineering
  • Knowledge of core operating system principles, networking fundamentals, and systems management
  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts
  • Strong leadership capabilities, with excellent problem-solving and decision-making skills
  • 5+ years of professional experience in an engineering leadership position
  • Bachelor's or Master’s degree in computer science or equivalent field with 10+ years of experience
  • Experience managing infrastructure in AWS
  • Experience building and operating large-scale distributed systems or ML systems in production
  • Experience partnering with Product, ML Platform, Ads Serving, Data Science, and cross-functional stakeholders to deliver complex initiatives
  • Experience managing and optimizing GPU-based clusters

Nice To Haves

  • Prior experience in the digital advertising industry is a huge plus

Responsibilities

  • Lead SRE teams responsible for the reliability and performance of ML Platforms and Services
  • Lead and grow the engineers on your team
  • Advocate best practices of reliability engineering
  • Create a team vision and define goals for delivering high-quality outcomes
  • Create a culture of engineering excellence, innovation, and continuous improvement
  • Collaborate with staff engineers and technical leadership on architecture and strategic technical decisions