Staff Site Reliability Engineer - Apple Ads

Apple
Cupertino, CA
$181,100 - $318,400

About The Position

As an SRE in Apple Ads, you will own the health, performance, and scalability of ad-serving infrastructure and associated platform tooling. Your focus will be on leading initiatives that eliminate manual processes, improve service resilience, and enable teams to move faster with confidence.

Requirements

  • 10+ years of experience supporting internet-facing production systems and distributed cloud infrastructure.
  • 3+ years of experience as a technical lead, guiding teams through complex design decisions and setting high benchmarks for reliability, performance, and scalability.
  • Strong programming skills in at least one of: Python, Go, or Java.
  • Proven expertise with AWS-managed infrastructure.
  • Hands-on experience with Linux systems and deep knowledge of Linux internals.
  • Demonstrated experience with Infrastructure as Code using tools such as CloudFormation, Terraform, and Crossplane.
  • Demonstrated ability to work cross-functionally and influence product reliability through a combination of technical leadership and user-centered thinking.
  • Passion for operational excellence, automation, and delivering scalable, developer-friendly infrastructure.
  • Strong foundation in SRE concepts: Monitoring, alerting, and observability; Incident response and root cause analysis; Error budgets, SLAs/SLOs, and system reliability.

Nice To Haves

  • Built tools or services that automate platform operations, reduce toil, or improve cost efficiency.
  • Experience managing large-scale systems that run diverse workloads, ranging from RPC services to data-related workloads.
  • Hands-on experience troubleshooting distributed systems under real-world load.
  • Clear communication skills and comfort collaborating across engineering, infrastructure, and product teams.
  • Experience using GenAI-based solutions to solve reliability challenges.

Responsibilities

  • Build and operate distributed systems using AWS managed services such as EKS, MSK, and ElastiCache.
  • Develop internal tooling and automation frameworks to improve infrastructure reliability, cost-efficiency, and operational visibility.
  • Collaborate with engineering teams to define infrastructure architecture, troubleshoot complex issues, and drive production excellence.
  • Design and manage Infrastructure as Code with Terraform, ensuring repeatable, secure, and scalable deployments.
  • Lead or participate in incident response, postmortems, and continuous improvement cycles to reduce future risk.

Benefits

  • Comprehensive medical and dental coverage.
  • Retirement benefits.
  • A range of discounted products and free services.
  • Reimbursement for certain educational expenses, including tuition.
  • Opportunity to participate in Apple's discretionary employee stock programs.
  • Eligibility for discretionary restricted stock unit awards.
  • Ability to purchase Apple stock at a discount through the Employee Stock Purchase Plan.
  • Potential for discretionary bonuses or commission payments.
  • Relocation assistance.