The Service Reliability Engineer (SRE) designs, builds, and operates reliability practices and technical capabilities that ensure critical engineering and enterprise services are available, performant, secure, and resilient. This is a hands-on, non-manager role focused on improving service reliability through observability, incident response, automation, and engineering excellence.

This role partners closely with Product Owners, development teams, infrastructure/platform engineering, Quality/Validation, Security, and Enterprise Architecture to define reliability targets, implement operational controls, and maintain documentation appropriate for regulated environments. The SRE helps standardize operational patterns across environments (dev/test/prod), including monitoring baselines, access controls, runbooks, change management, and deployment readiness.

Key outcomes include establishing and measuring Service Level Indicators/Objectives (SLIs/SLOs), improving alert quality and troubleshooting speed, reducing incident frequency and Mean Time to Recovery (MTTR), and enabling safe, repeatable releases through automation and operational readiness. The SRE identifies reliability risks and technical gaps, recommends scalable and resilient designs, implements reusable operational tooling, and participates in Agile ceremonies and on-call support aligned to the team’s ways of working.
Job Type: Full-time
Career Level: Senior