About The Position

The Staff Site Reliability Engineer (Azure) is responsible for designing, building, and evolving the cloud-native, containerized infrastructure on Microsoft Azure that powers our data products and services. This role plays a critical part in advancing our platform maturity by supporting cross-functional squads, leading complex technical initiatives, and ensuring the availability, security, scalability, and reliability of our data ecosystem.

As a Staff Engineer, you will bring deep expertise in Azure cloud architecture and infrastructure implementation, systems design, networking, databases, and modern data technologies. You will also bring hands-on experience with complex technology adoption, infrastructure automation, and high-scale distributed systems, with a strong emphasis on building and operating secure, resilient, and scalable solutions in Microsoft Azure environments.

This role requires demonstrated experience architecting, implementing, and optimizing Azure-based platforms and services, including cloud networking, compute, storage, identity and access management, observability, and container orchestration. The ideal candidate can lead the design and delivery of enterprise-grade cloud solutions using Azure-native and hybrid-cloud patterns and drive best practices for reliability, security, and operational excellence across the data platform.

Requirements

  • 5 or more years of relevant work experience with a Bachelor’s degree, or at least 2 years of work experience with an advanced degree (e.g., Master’s, MBA, JD, MD), or 0 years of work experience with a PhD.
  • Advanced experience designing and operating large‑scale, cloud‑native infrastructure (AWS preferred).
  • Strong hands-on proficiency with Infrastructure as Code (Terraform), including building reusable modules and platform-level components.
  • Deep understanding of Kubernetes and container orchestration for data platforms and distributed systems.
  • Knowledge of CI/CD systems, pipeline design, automation, and secure deployment practices.
  • Strong competencies in systems design, networking, distributed systems, and reliability engineering principles (SLOs, error budgets, incident response).
  • Understanding of database technologies including SQL, NoSQL, and data storage patterns.
  • Experience with observability stacks (Prometheus, Grafana, OpenTelemetry, ELK/EFK, Datadog, or similar).
  • Proficiency in automation using Bash, Python, or Ansible-like tools.
  • Working knowledge of software engineering practices (version control, testing, code reviews, design patterns).
  • Proven ability to lead complex, cross-functional technical initiatives from design to production rollout.
  • Demonstrated experience driving technology adoption and platform modernization across multiple teams.
  • Comfortable navigating ambiguity, making high-quality architectural trade-offs, and advocating for long-term technical investments.
  • Excellent problem-solving skills with a track record of reducing toil, eliminating technical debt, and improving system reliability.
  • Excellent written and verbal English communication skills.
  • Ability to collaborate with data engineers, platform engineers, SREs, security teams, and product teams in a fast-paced environment.

Nice To Haves

  • 6 or more years of work experience with a Bachelor’s degree, or 4 or more years of relevant experience with an advanced degree (e.g., Master’s, MBA, JD, MD), or up to 3 years of relevant experience with a PhD.
  • Bachelor’s degree in Computer Science, Engineering, or a related field (desirable but not mandatory).

Responsibilities

  • Designing, building, and evolving cloud-native, containerized infrastructure on Microsoft Azure.
  • Supporting cross-functional squads.
  • Leading complex technical initiatives.
  • Ensuring the availability, security, scalability, and reliability of our data ecosystem.
  • Contributing hands-on experience with complex technology adoption, infrastructure automation, and high-scale distributed systems.
  • Building and operating secure, resilient, and scalable solutions in Microsoft Azure environments.
  • Architecting, implementing, and optimizing Azure-based platforms and services, including cloud networking, compute, storage, identity and access management, observability, and container orchestration.
  • Leading the design and delivery of enterprise-grade cloud solutions using Azure-native and hybrid-cloud patterns.
  • Driving best practices for reliability, security, and operational excellence across the data platform.
  • Defining and implementing infrastructure standards, best practices, and architectural guidelines.
  • Participating in and improving on-call processes, incident management, and post-incident reviews.
  • Producing clear technical documentation, architectural proposals, and decision records.

Benefits

  • Medical
  • Dental
  • Vision
  • 401(k)
  • FSA/HSA
  • Life Insurance
  • Paid Time Off
  • Wellness Program