Are you a customer-obsessed, engineering-minded program leader who thrives in high-stakes, regulated environments? Do you want to build a new function from the ground up, one that prevents customer outages before they happen and transforms how Microsoft supports its most sensitive cloud customers?

Join Advanced Cloud Engineering & Supportability (ACES), a global Azure engineering support organization within Azure Engineering Operations (EngOps). ACES delivers engineering-led, world-class support across Azure's Government and Sovereign cloud portfolio, including US Government (Fairfax) and National Partner Clouds in France (Bleu), Germany (Delos), and Singapore (Merlion). We are building a new Gov Customer Resiliency function within ACES that brings proactive reliability engineering in-house for Government customers. This is not reactive support. It is about reducing the probability, blast radius, and recovery time of customer outages through engineering-led detection, readiness, and prevention.

The role leads two interconnected workstreams under ACES Sovereign & Government: Gov Customer Resiliency (60%) and Sovereign Cloud Operations & Readiness (40%).

For Gov Customer Resiliency, you will build and operate a new function from scratch, starting with a named high-profile Government customer and scaling to a portfolio of 3-5 top Gov/Azure Engineering Direct customers. This function brings proactive resiliency capabilities in-house for Government customers under the Sovereign & Government business. You will own the full resiliency lifecycle: proactive detection and monitoring, incident and crisis management coordination, post-incident RCA and problem management, architecture and DR guidance, and parity closure between Government and Commercial cloud environments. This is a build + run role: you will first shadow and codify the operating model, then own and scale it.
For Sovereign Cloud Operations & Readiness, you will drive support readiness, operational maturity, and customer experience strategy across Microsoft's Sovereign Cloud portfolio (Bleu, Delos, Merlion). This includes readiness frameworks for new Sovereign cloud launches, escalation flow design, CRI playbooks, Sev handling standards, cross-cloud staffing models, and compliance-aligned operational processes and playbooks. You will partner closely with Sovereign delivery leadership, Azure engineering, and regional National Cloud Operating Entity (NCOE) partners to ensure Sovereign clouds are support-ready, compliant, and capable of delivering exceptional customer outcomes from Day 1.

This role is strategic because it sits at the intersection of two of ACES' most significant investments. Gov Customer Resiliency brings proactive reliability engineering in-house for Government customers, moving the organization from reactive support to engineered prevention. Sovereign Cloud Readiness ensures Microsoft's most compliance-sensitive cloud environments are support-ready from Day 1, protecting customer trust. The person in this role will build a new function, run it customer-facing, and scale it across the most critical cloud environments Microsoft operates, defining how Microsoft supports its highest-trust customers.
Job Type
Full-time
Career Level
Principal