Director, Site Reliability Engineering (SRE)

IonQ | Pleasanton, CA
4d | $192,979 - $252,659 | Onsite

About The Position

We are looking for a Director of SRE to join a cross-functional team whose mission is to lead IonQ on its journey to build the world's best quantum computers and solve the world's most complex problems. In this role, you will build and lead SRE/DevOps organizations operating multi-tenant SaaS at scale on AWS, Azure, and GCP. You will own production availability, latency, incident response, and capacity management while implementing an SRE operating model based on SLOs/SLIs and error budgets. Your leadership will connect cloud infrastructure architecture with AI-ready operations to ensure a secure-by-default platform for our product teams.

Requirements

  • At least 15 years of experience building and leading SRE/DevOps organizations operating multi-tenant SaaS at scale on AWS, Azure, or GCP.
  • Deep technical knowledge of cloud infrastructure architecture, networking, containers, and secure-by-default platform guardrails.
  • Proven ability to run production for global enterprise/federal customer bases, including tenant isolation and data residency considerations.

Nice To Haves

  • AI-ready operations experience for networking SaaS, including streaming telemetry pipelines and closed-loop automation.
  • Experience with Juniper Mist AI or similar large-scale networking SaaS platforms is strongly preferred.
  • Knowledge of AI-native networking concepts such as service-level expectations (SLEs) and proactive anomaly detection.
  • Security and resilience mindset aligned to Zero Trust designs and continuous telemetry policy enforcement.
  • Hands-on experience operating SaaS products for networking/security domains where customer impact is tied to network behavior.
  • Executive communication strength, with the ability to present SLO posture, incident learnings, and risk to leadership.

Responsibilities

  • Build and lead SRE/DevOps organizations operating multi-tenant SaaS at scale on AWS/Azure/GCP, including production ownership for availability, latency, incident response, DR, and capacity management.
  • Architect cloud infrastructure focusing on networking (VPC/VNet, routing, private connectivity), compute, containers/orchestration, and data platforms.
  • Implement SRE operating models using SLOs/SLIs and error budgets to balance reliability and delivery velocity.
  • Drive CI/CD and release engineering leadership, ensuring safe progressive delivery (canary/blue-green), automated rollbacks, and measurable deployment health.
  • Scale Infrastructure-as-Code (IaC) and platform automation through "golden pipelines," standardized modules, and secure-by-default guardrails.
  • Lead cross-functional execution across Product, Engineering, Security, Support, and Customer Success while setting clear ownership boundaries.
  • Own organizational planning, including hiring, team topology, on-call models, budget, and vendor strategy.
  • Establish a culture of operational excellence through blameless postmortems, corrective-action tracking, and toil reduction.

Benefits

  • comprehensive medical, dental, and vision plans
  • matching 401(k)
  • unlimited PTO and paid holidays
  • parental/adoption leave
  • legal insurance
  • home technology stipend