SRE

Apple, Cupertino, CA

About The Position

This is a rare and exciting opportunity to work on some of the world's most impactful internet services, including the App Store, Apple Music, Books, Apple Podcasts, and Apple Fitness+, within Apple Services Engineering. These are revenue-critical, globally scaled services used by billions of devices worldwide. As an SRE, your mission is to ensure these services are always available, performant, and ready for growth. You will solve complex problems at scale, develop deep troubleshooting expertise, and keep a relentless focus on availability, latency, performance, and capacity. You will drive a culture of operational excellence by replacing toil with automation and by building systems that are resilient by design.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
  • 3+ years of experience designing, analyzing, and troubleshooting large-scale distributed systems
  • 2+ years of experience leading technical projects and providing engineering leadership
  • Strong programming or scripting skills (e.g., Python, Go, Java, or shell scripting)

Nice To Haves

  • Master's degree in Computer Science or Engineering
  • Experience with observability tooling, SLOs/SLIs, and capacity planning at scale
  • Familiarity with cloud infrastructure, container orchestration (e.g., Kubernetes), and CI/CD pipelines
  • Proven track record of driving automation to reduce operational toil in high-traffic production environments

Responsibilities

  • Own the full service lifecycle, from inception and architecture through deployment, operations, and continuous improvement
  • Partner with cross-functional teams to prepare services for production through system design consulting, deployment strategy, capacity planning, and production readiness reviews
  • Monitor, measure, and maintain service health across availability, latency, and performance dimensions
  • Drive sustainable scalability through automation and tooling, reducing manual operational overhead
  • Participate in on-call rotations, lead production incident response, and contribute to blameless postmortems
  • Establish and champion operational excellence practices across the organization
  • Collaborate closely with Software Engineering, Program Management, Security, and Infrastructure teams to embed reliability throughout the software development lifecycle