Overview

At Chick-fil-A, Site Reliability Engineering is a technical function which mixes in influence. Across our 3000+ North American stores, cloud, and private data centers, SREs work with our DevOps teams to introduce and hone SRE principles, establish reliability goals, and develop tooling for operational observability. We are a small team working through many different patterns to bring observability to everyone. SREs at Chick-fil-A collaborate across teams and roles, feed learnings back into the organization, and learn all the ways technology is used along the way. The team is focused on tooling and enablement rather than traditional SRE roles.

Our Flexible Future model offers a healthy mix of working in person and virtually, strengthening key elements of the Chick-fil-A culture by fostering collaboration and community.

Responsibilities

Work independently with DevOps teams to refine running production systems
  Building on-call processes
  Creating Incident Management and Response procedures
  Instrumenting for observability and coaching on best practices
  Monitoring SLIs
Work to varying degrees with DevOps teams
  Provide consultation on SRE best practices
  Give guidance on specific topics
  Oversee groups of dedicated engineers
  Embed directly with teams
Work with teams to define SLOs and error budgets
Ensure services and systems meet availability needs of customers
Document learnings to share with the broader engineering teams
Ensure clear communication around SRE objectives
Collaborate broadly across the entire engineering organization
Oversee other SREs to bring best practices or learnings from across the organization to them
Build internal tooling around operational observability
Bring a strong mindset of continual improvement
  An aversion to toil and automatable tasks
Advocate for SRE as a part of engineering culture
Act as a conduit for Architecture, Security, Developer Experience, and Common Engineering
Keep abreast of industry changes and evaluate for implementation
Design and develop software solutions
Serve as a model developer in programming languages like Java, Go, and Python
Exercise skills in infrastructure and deployment services like AWS and Kubernetes as well as areas like application security, data analytics, and machine learning
Job Type
Full-time
Career Level
Mid Level
Number of Employees
5,001-10,000 employees