Architect Operational Processes: Design and implement scalable, automated operational processes for incident management, change execution, security operations, capacity planning, monitoring, and disaster recovery.
Drive Reliability Engineering: Collaborate with Operations and Development teams to ensure that operational workflows align with reliability and scalability goals.
Operational Excellence: Define and implement KPIs and SLAs for operational performance, and develop continuous improvement programs to meet and exceed them.
Automation First: Lead efforts to automate repetitive, manual operational tasks using tools, scripts, and platforms to improve efficiency and reduce risk.
Incident Management Leadership: Develop and refine incident management and response strategies, ensuring rapid resolution and root cause analysis for critical issues.
Capacity and Performance Management: Architect and implement systems to monitor, predict, and optimize infrastructure utilization at global scale.
Cross-Functional Collaboration: Partner with engineering and product teams to ensure operational readiness for new services and features.
Mentorship and Knowledge Sharing: Act as a thought leader and mentor within the operations team, sharing best practices and driving operational excellence across the organization.
Security Clearance: US citizenship and an active US Government TS/SCI security clearance with polygraph are required.
Job Type
Full-time
Career Level
Principal
Education Level
No Education Listed
Number of Employees
5,001-10,000 employees