Senior Software Engineer - Live Site Reliability

Microsoft - Vancouver, BC
CA$114,400 - CA$203,900

About The Position

Microsoft is a place where passionate innovators collaborate to solve complex challenges and create technology that empowers people and organizations around the world. The Azure Data engineering team is responsible for building and operating Microsoft’s data platform, including services such as Microsoft Fabric, Azure SQL Database, Azure Cosmos DB, Azure Data Factory, Azure Synapse Analytics, Azure Event Grid, and Power BI. These services enable customers to ingest, process, and analyze data at scale. The Data Integration team builds capabilities that enable customers to move, transform, and prepare data efficiently across systems.

This role focuses on live site reliability, operational excellence, and automation within the Customer Data Integration (CDI) organization. We are hiring a Senior Software Engineer to design and build systems that improve incident response, automate operational workflows, and enhance the reliability of data integration services used by millions of customers.

Requirements

  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • Hands-on experience managing live site operations, including log analysis, incident response, and telemetry-based diagnostics

Nice To Haves

  • Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • Experience working with live site operations, including incident triage, monitoring, alerting systems, and production support for large-scale services
  • Experience building automation systems using AI/ML techniques or large language models (LLMs)
  • Experience with incident management, telemetry analysis, and operational monitoring systems
  • Experience with Microsoft Azure, Power BI, Microsoft Fabric, or related cloud services
  • Experience authoring troubleshooting guides (TSGs) or analyzing incident patterns
  • Understanding of service-level agreements (SLAs), escalation processes, and customer communication practices
  • Experience collaborating with globally distributed engineering teams
  • Experience participating in an on-call rotation supporting production services

Responsibilities

  • Provide on-call support for customer-facing services, including monitoring, investigation, severity assessment, and coordination with engineering teams for incident resolution
  • Design and develop automation systems to support incident triage, including correlation of alerts with deployments, feature changes, and known issues
  • Build tools for incident lifecycle management, including summarization, reporting, and documentation to improve operational efficiency
  • Develop and maintain classification and routing systems for incoming incidents using telemetry, service metadata, and historical patterns
  • Analyze operational metrics such as time-to-triage and incident resolution effectiveness; identify trends and drive improvements through automation and process enhancements
  • Partner with cross-functional engineering teams to improve reliability, reduce operational overhead, and enhance service quality
  • Contribute to the design, development, and improvement of distributed systems and cloud services as part of the broader CDI engineering scope
  • Demonstrate Microsoft’s culture and values in day-to-day work and collaboration

Benefits

  • The typical base pay range for this role across Canada is CAD $114,400.00 - CAD $203,900.00 per year.