The Azure Compute team designs and operates a fault‑tolerant, distributed system built on commodity datacenter hardware to provide infrastructure for hosting cloud applications in virtual machines (VMs). The team delivers an experience in which resources appear limitless, highly elastic, and consistently available. Within Azure Compute, the Availability Platform team is dedicated to helping every Azure virtual machine meet a service-level agreement (SLA) of 99.99 percent or higher. This work relies on data-informed decisions, automated repair workflows, and services that monitor the health of millions of Azure machines. The team also uses artificial intelligence (AI) and machine learning to develop predictive failure models that enable proactive live migration, reducing customer impact and enhancing platform resilience. The team is also exploring generative artificial intelligence to improve diagnostics, automate root cause analysis, and accelerate issue resolution. Collaboration with data scientists and researchers supports ongoing improvements to self-healing capabilities across the platform. As a Software Engineer II, you will contribute to long-term investments in both people and technology through comprehensive designs, iterative development, high-quality implementation, and rapid response to customer feedback. Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Stand Out From the Crowd
Upload your resume and get instant feedback on how well it matches this job.
Job Type
Full-time
Career Level
Mid Level
Number of Employees
5,001-10,000 employees