Tools and Process Site Reliability Engineer

Apple • San Diego, CA

About The Position

Imagine what you could do here! At Apple, new insights have a way of becoming extraordinary products, services, and customer experiences very quickly. Intelligent people, inspiring vision, and innovative technologies are the norm. We have reinvented entire industries with our products and apply the same energy to ensure we leave the world better than we found it.

Apple is seeking a Site Reliability Engineer to join the Hardware Technologies Tools and Process team to design, deploy, and scale the web applications that serve the organization's operational needs. This role is responsible for expanding our infrastructure capabilities to introduce more automation, maintaining a high bar for reliability, and collaborating with our developers to ensure the security of the software they create. You will stay ahead of the curve by identifying future areas of growth for the platform.

This role also has broad exposure to multiple areas: the team uses Apple's internal software and services and develops custom solutions to address gaps and novel business needs. These tools and processes provide support across the organization, impacting teams involved in all product lines and transforming the way people work at Apple. We are seeking a proactive and driven new member of our Tools and Process team. Join us!

Description

As a Site Reliability Engineer, you will be responsible for the operational excellence of multiple cloud-based applications, emphasizing deployment, security, scalability, and reliability on AWS and Apple infrastructure.

Requirements

Bachelor’s Degree in Computer Science, Computer Engineering, or equivalent practical experience.
2+ years of experience supporting large-scale production applications in an SRE, DevOps, or Systems Engineering role.
Proficiency in Python, with a strong ability to automate complex tasks and build tooling.
Strong experience with container orchestration, specifically Kubernetes (EKS or self-hosted).
Experience implementing Infrastructure as Code (IaC) using Terraform, Pulumi, or similar frameworks.
Working knowledge of Linux systems, including kernel/system tuning and networking fundamentals.
Experience building and maintaining fully automated CI/CD pipelines.

Nice To Haves

Experience managing hybrid cloud infrastructure (on-premises data centers interacting with AWS, GCP, or Azure).
Familiarity with the Python (Django) and JavaScript (React) ecosystems, and how to support them in production.
Experience with "GitOps" workflows and tools such as ArgoCD or Jenkins.
Strong understanding of database reliability (Postgres, RDS, Redis) and monitoring (Prometheus, Grafana, ELK stack).
A collaborative approach to engineering, with experience mentoring team members and fostering a culture of security and reliability.
Experience with large-scale data and supporting Generative AI solutions.

Responsibilities

Own the operational excellence of multiple cloud-based applications, emphasizing deployment, security, scalability, and reliability on AWS and Apple infrastructure.
