About The Position

Come build community, explore your passions, and do your best work at Microsoft. This opportunity allows you to bring your aspirations, talent, and excitement for the journey ahead while contributing to technology that operates at global scale. At Microsoft, we are learn-it-alls rather than know-it-alls. Our culture embraces a growth mindset, inspires excellence, and empowers teams to bring their best each day. If you’re excited to learn, collaborate, and make an impact, you’ll feel at home here.

As a Software Engineer within Azure Compute, you will help design, build, and operate the foundational cloud infrastructure that powers Microsoft Azure. Our teams develop mission-critical services that run on millions of machines worldwide, supporting technologies such as Virtual Machines, Compute Nodes, Serverless Containers, VM Scale Sets, Images, Agents and Extensions, and the Compute Control Plane. You’ll work on systems that directly impact availability, reliability, performance, security, compliance, and scalability across Azure. From capacity planning and availability zone resiliency to backend performance tuning and billing infrastructure, your work will help keep the cloud running smoothly for customers around the world.

In this role, you’ll design and implement extensible, maintainable backend services within large-scale distributed systems. You’ll collaborate with engineers and stakeholders across Azure to define requirements, incorporate feedback, and deliver high-quality solutions backed by strong testing, telemetry, and debugging practices. Whether your interests lie in systems programming, concurrency, distributed services, REST APIs, partitioned and replicated systems, or emerging areas like ML/AI-enabled infrastructure, you’ll have opportunities to explore new technologies and learn from experienced senior and principal engineers.
If you enjoy working on technically deep problems with real-world impact at global scale, Azure Compute is one of the best places to start and grow your engineering career.

Learn more about Azure Compute technologies:

  • https://azure.microsoft.com/en-us/products/virtual-machines
  • https://azure.microsoft.com/en-us/products/container-instances
  • https://azure.microsoft.com/en-us/products/image-builder
  • https://learn.microsoft.com/en-us/azure/virtual-machines/azure-compute-gallery
  • https://learn.microsoft.com/en-us/azure/quotas/quotas-overview
  • https://azure.microsoft.com/en-us/explore/global-infrastructure/availability-zones
  • https://learn.microsoft.com/en-us/python/api/overview/azure/compute?view=azure-python
  • https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/overview

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. We come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals, building a culture of inclusion where everyone can thrive at work and beyond.

Requirements

  • Bachelor's Degree in Computer Science, or related technical discipline with proven experience coding in C++, Rust, C#, Python, and/or Java, OR equivalent experience.
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Nice To Haves

  • Bachelor's Degree in Computer Science, or related technical field AND 1+ year(s) technical engineering experience with coding in C++, Rust, C#, Python, and/or Java, OR Master's Degree in Computer Science or related technical field with proven experience coding in C++, Rust, C#, Python, and/or Java, OR equivalent experience.
  • Professional and/or academic experience with cloud computing or cloud infrastructure, including platforms such as Azure, AWS, or GCP, and understanding of cloud fundamentals such as compute, networking, and storage.
  • Professional and/or academic experience in distributed systems, cloud computing, and/or cloud infrastructure.
  • Academic and/or professional experience with Linux and/or Windows.
  • Professional and/or academic experience with VS Code, Git, network monitoring tools, and/or debuggers.
  • Experience with one or more of the following: PowerShell scripting; RPC frameworks (gRPC); kernel concepts; secure coding; REST APIs; partitioned and replicated services; ML/AI development (ML models, fine tuning, MCP servers); backend debugging and performance tuning.

Responsibilities

  • Collaborates with stakeholders to break down work items into actionable tasks, provide estimates, and escalate risks or delays as appropriate. Supports feature deployments across Azure Compute services, considering customer and service impact while following safe deployment and operational best practices.
  • Works with partners and teammates to define feature requirements and incorporate feedback into design iterations. Establishes feedback loops using customer metrics and telemetry to drive continuous improvement across large-scale distributed systems.
  • Learns and applies coding standards and engineering best practices through code reviews, developing maintainable and extensible backend code with guidance from senior engineers. Contributes code for products, services, or features, reusing existing components where appropriate.
  • Uses debugging tools, logs, and telemetry to proactively and reactively diagnose issues in backend and systems-level services. Contributes to performance tuning, reliability improvements, and quality-focused engineering efforts across partitioned and replicated systems.
  • Supports identification and documentation of dependencies for features, gaining exposure to service interactions and backend architecture. Contributes to architectural discussions, design documentation, and technical validation efforts, including testing hypotheses and integrating automation.
  • Participates in quality assurance activities, including augmenting test plans, expanding coverage, and supporting automation. Learns how security, compliance, and reliability considerations influence design and operational decisions at scale.
  • Participates in live service operations and acts as a Designated Responsible Individual (DRI) for monitoring services and responding to degradation or incidents for scoped scenarios. Follows established playbooks to help restore service health within SLA expectations.
  • Develops and applies best practices for building scalable and secure systems. Learns about global and local regulatory requirements, customer scaling needs, and cross-team collaboration required to operate cloud infrastructure at global scale.
  • Ensures solutions meet Microsoft standards for security, privacy, safety, and accessibility. Leverages developer tools and automation in build, test, and deployment workflows, and proactively seeks opportunities to improve availability, reliability, efficiency, observability, and performance across Azure Compute services.