We’re looking for a strategic and hands-on Senior Manager of Site Reliability Engineering to lead our SRE team in delivering resilient, scalable, and high-performing systems. This role is central to our mission of operational excellence and customer satisfaction. You’ll guide a team of talented engineers, champion automation, and collaborate across disciplines to ensure our infrastructure supports business growth and innovation.
A day in the life...
Lead & Inspire
Build and mentor a high-performing SRE team. Foster a culture of ownership, innovation, and continuous learning.
Drive Reliability
Ensure the availability and performance of critical services through proactive monitoring, incident response, and root cause analysis.
Automate Everything
Reduce manual toil by implementing automation across deployment, recovery, and scaling processes.
Monitor & Observe
Define and execute observability strategies using New Relic, Splunk, and other tools to detect and resolve issues before they impact users.
Collaborate & Align
Partner with engineering, product, and operations teams to align reliability goals with business priorities.
Plan for Scale
Lead capacity planning and performance tuning for services running on AWS EKS and other cloud-native platforms.
Measure & Improve
Establish and track SLOs, SLAs, and error budgets. Continuously refine processes to improve system reliability and team efficiency.
Job Type
Full-time
Career Level
Mid Level