About The Position

In this role, you will define and execute the reliability engineering roadmap and manage the team responsible for system stability across infrastructure and AI-native architectures. Your work will connect engineering efficiency with operational excellence, supporting scalable growth and improved service delivery. The position calls for a leader with a track record of transforming reliability in innovative technology environments, who can draw on that experience to set a forward-looking vision that meets organizational goals while maintaining compliance and security.

Requirements

  • 15+ years of engineering experience, including 7+ years leading reliability or infrastructure teams.
  • Proven track record managing organizations of 40+ engineers across multiple teams.
  • Demonstrated experience evolving reliability operating models for scalable businesses.
  • Expertise in regulated sectors where compliance and data sensitivity are critical.
  • Strong understanding of SRE principles, including SLOs and incident management.
  • Technical command of AWS, Terraform (IaC), and modern observability stacks.
  • Experience owning cloud infrastructure budgets and cost management.
  • Familiarity with AI/ML workloads and their reliability requirements.
  • Executive presence for engaging with the C-suite on risk management.

Responsibilities

  • Define and execute the reliability engineering roadmap, aligning with enterprise growth.
  • Balance centralized platform capabilities with distributed ownership for scalability.
  • Establish SLO, SLI, and error budget frameworks that balance feature velocity with system stability.
  • Lead infrastructure cost management and capacity planning to meet enterprise commitments.
  • Develop and scale a multi-disciplinary team while fostering a culture of ownership.
  • Drive continuous improvement through DORA metrics and incident trend analysis.
  • Empower developers with self-service tooling and clear documentation.
  • Act as the primary engineering interface for compliance and security requirements.
  • Collaborate with executives to position reliability as a key enabler for success.

Benefits

  • A dynamic, rapidly growing organization focused on helping businesses thrive.
  • Comprehensive Medical, Dental, & Vision Insurance for full-time employees.
  • Competitive and fair pay commensurate with experience.
  • Maternity and paternity leave policies for full-time employees.
  • Short- and long-term disability coverage.
  • Opportunities to learn from a dedicated leadership team.
  • Top-of-the-line company swag for team members.
© 2024 Teal Labs, Inc