About The Position

As an SRE in Apple Ads, you will own the health, performance, and scalability of ad-serving infrastructure and associated platform tooling. Your focus will be on leading initiatives that eliminate manual processes, improve service resilience, and enable teams to move faster with confidence.

Requirements

  • 10+ years of experience supporting internet-facing production systems and distributed cloud infrastructure.
  • 3+ years of experience as a technical lead, guiding teams through complex design decisions and setting high benchmarks for reliability, performance, and scalability.
  • Strong programming skills in at least one of: Python, Go, or Java.
  • Proven expertise with AWS-managed infrastructure.
  • Hands-on experience with Linux systems and deep knowledge of Linux internals.
  • Demonstrated experience with Infrastructure as Code, using tools such as CloudFormation, Terraform, and Crossplane.
  • Demonstrated ability to work cross-functionally and influence product reliability through a combination of technical leadership and user-centered thinking.
  • Passion for operational excellence, automation, and delivering scalable, developer-friendly infrastructure.
  • Strong foundation in SRE concepts: Monitoring, alerting, and observability; Incident response and root cause analysis; Error budgets, SLAs/SLOs, and system reliability.

Nice To Haves

  • Built tools or services that automate platform operations, reduce toil, or improve cost efficiency.
  • Experience managing large-scale systems that run diverse workloads, ranging from RPC services to data-related workloads.
  • Hands-on experience troubleshooting distributed systems under real-world load.
  • Clear communication skills and comfort collaborating across engineering, infrastructure, and product teams.
  • Experience using GenAI-based solutions to solve reliability challenges.