About The Position

The Service Reliability Engineer (SRE) role in Apple Services Engineering requires a mix of strategic engineering and design along with hands-on technical work. This SRE will configure, tune, and fix multi-tiered systems to achieve optimal application performance, stability, and availability. We manage jobs as well as applications on bare-metal and cloud computing platforms to deliver data processing for many of Apple’s global products. Our teams work with exabytes of data, petabytes of memory, and tens of thousands of jobs to deliver predictable, performant data analytics that power features in Apple Music, TV+, the App Store, and other world-class products. If you love designing and running systems that impact millions of users, then this is the place for you!

Requirements

  • 3+ years in a Site Reliability Engineering (SRE) or DevOps role
  • BS degree in computer science or an equivalent field with 5+ years of experience, or MS degree with 3+ years of experience, or equivalent
  • 3+ years running services in a large-scale *nix environment
  • Understanding of SRE principles and goals along with prior on-call experience
  • Extensive experience managing applications on AWS & Kubernetes
  • Deep understanding of and experience with one or more of the following: Hadoop, Spark, Flink, Kubernetes, AWS

Nice To Haves

  • Fast learner with excellent analytical, problem-solving, and interpersonal skills
  • Experience supporting Java applications
  • Experience with big data technologies
  • Experience working with geographically distributed teams and implementing high-level projects and migrations
  • Strong communication skills and the ability to deliver high-quality results on time

Responsibilities

  • Support Java-based applications & Spark/Flink jobs on bare metal, AWS & Kubernetes
  • Understand application requirements (performance, security, scalability, etc.) and determine the right services and topology on AWS, bare metal & Kubernetes
  • Build automation to enable self-healing systems
  • Build tools to monitor and alert on high-performance, low-latency applications
  • Troubleshoot application-specific, core network, system & performance issues
  • Take part in challenging, fast-paced projects that support Apple's business by delivering innovative solutions
  • Monitor production, staging, test and development environments for a myriad of applications in an agile and dynamic organization