The Practice Technical Manager leads the Site Reliability Engineering (SRE) practice, covering daily operations, incident management, and technical oversight to keep systems resilient and reliable. Core responsibilities fall into three areas: Team Leadership & Operational Management, Technical Oversight, and Cross-Functional Leadership.

Team Leadership & Operational Management: running daily operations, maintaining a healthy on-call program, overseeing incident management, establishing operational KPIs, coaching SREs, and keeping documentation current.

Technical Oversight: providing architecture-level guidance on resilience and observability, validating SLIs/SLOs, reviewing reliability design work, participating in high-severity incidents, and ensuring engineering quality for IaC, CI/CD, and Kubernetes operations.

Cross-Functional Leadership: acting as a primary point of contact for internal stakeholders, translating business priorities into reliability roadmaps, aligning teams around shared reliability objectives, and supporting customer-facing conversations.
Job Type: Full-time
Career Level: Senior
Education Level: No Education Listed