About The Position

The Compute team, central to Azure, is experiencing rapid growth, focusing on building and managing fault-tolerant distributed systems atop commodity datacenter hardware. This infrastructure is designed to host customer applications, providing millions of virtual machines for workloads in the cloud. The team cultivates a collaborative environment, encouraging the development of ideas and empowering engineers to create innovative solutions. This role presents a significant opportunity to contribute to a highly strategic initiative for Microsoft, emphasizing the delivery of customer value in mission-critical environments within a growth-oriented culture. Microsoft's overarching mission is to empower individuals and organizations globally to achieve more, fostering a workplace culture built on respect, integrity, accountability, and psychological safety.

Requirements

  • Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.
  • Candidates must be able to meet Microsoft, customer and/or government security screening requirements.
  • Active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph.
  • Ability to meet Microsoft, customer and/or government security screening requirements is required pre-offer and post-hire for this role.
  • Successful verification of the stated security clearance to meet federal government customer requirements.
  • Pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
  • Verification of U.S. citizenship due to citizenship-based legal restrictions.

Nice To Haves

  • Doctorate Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration OR Master's Degree in Computer Science, Information Technology, or related field AND 6+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 8+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.
  • 3+ years technical experience working with large-scale cloud or distributed systems.
  • Experience writing scripts and functional programming code to automate tasks, using languages such as Python, JavaScript, or Shell scripting.
  • Experience developing end-to-end technical expertise in the architecture, code, features, and operations of specific products as required to implement improvements in product availability, security, quality, observability, reliability, efficiency, and/or performance.
  • Experience driving code/design reviews with the engineering teams that develop and/or manage those products, and sharing learnings and recommendations across engineering teams working on related products within their organization and other organizations as relevant.
  • Knowledge of distributed systems.
  • Highly effective written and oral communication skills.

Responsibilities

  • Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore system/product/service for simple and complex problems when appropriate.
  • Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of service fabric services while also driving consistency in monitoring and operations at scale.
  • Drives development of design documents for a product, application, service, or platform.
  • Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance and maintainability, effectiveness, and return on investment (ROI).
  • Leverages subject-matter expertise of product features and partners with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items.
  • Take full ownership of assigned services, actively contributing to their enhancement across all cloud environments.
  • Ensure the service maintains parity with the commercial cloud, delivering high support standards for customers.
  • Participate in the service lifecycle, including design, development, deployment, and maintenance.
  • Collaborate with cross-functional teams to uphold the highest standards of quality and performance.
  • Engage in continuous improvement initiatives to enhance the service's capabilities and user experience.
  • Identify opportunities for automation and optimization within the cloud to better support customers.
  • Design and implement automation solutions to streamline operations, reduce manual effort, and improve overall service delivery.
  • Focus on optimizing existing systems and processes to boost performance and customer satisfaction.
  • Embody our culture and values.