This role leads the design, reliability, and scalability of the enterprise cloud platform on AWS, operating at the intersection of Site Reliability Engineering (SRE), platform engineering, and cloud architecture. It defines the SRE operating model and platform vision, establishing reliability north-star architectures and internal developer platform standards that provide paved roads for secure, observable, and scalable services. The position owns mission-critical production systems and drives organization-wide initiatives such as zero-touch operations, governance-by-default, and a stronger resilience posture. It sets and implements best practices across infrastructure, CI/CD, observability, and cost optimization, while integrating reliability with security and compliance requirements. Working closely with engineering teams, the role improves system performance and developer productivity, mentors Staff and Principal engineers, and partners with executive leadership on risk management, customer commitments, and regulatory readiness for critical systems.
Job Type: Full-time
Career Level: Senior
Education Level: No Education Listed