Director, Site Reliability & Operations

Switchfly
Colorado Springs, CO
$165,000 - $185,000
Remote

About The Position

Switchfly is hiring a Director of Site Reliability & Operations to own the operational backbone of a platform that processes travel and loyalty transactions for some of the world’s largest airlines and financial institutions. This is not a role for someone who manages by dashboard and delegates by email. We need a technically engaged leader with a full tool belt: someone who understands our platform deeply enough to participate in the hard conversations, and who owns outcomes rather than activities.

This role enables 50+ developers to ship secure, PCI-compliant releases at least weekly, supporting the DevOps culture that makes that pace sustainable. We need a leader who supports delivery velocity, spends the reliability and change budget strategically, and drives toward faster, safer delivery. Security and compliance set the floor; velocity is the ambition. And as AI reshapes how software is built and operated, we expect this leader to help us embrace it thoughtfully, not lock it out.

You will lead a team of SRE, DevOps, DBA, network, security, and corporate IT professionals, working in close partnership with engineering leadership across a PCI Level 1-compliant, 24x7 enterprise platform.

Requirements

  • You have a full tool belt — technically engaged, platform-curious, and willing to log into systems, participate in firefights, and develop genuine understanding of what you’re operating
  • 7+ years in SRE, DevOps, cloud infrastructure, or security engineering, with 4+ years leading technical teams
  • Deep AWS experience across compute, networking, storage, and managed services in a production enterprise environment
  • Hands-on familiarity with the security and compliance discipline — you’ve operated in PCI, SOC 2, or equivalent regulated environments and understand what compliance actually requires versus what it looks like on paper
  • You operate as a business stakeholder, not just a technical function — comfortable working within SAFe or similar delivery frameworks to get security and reliability work into the roadmap alongside feature development
  • You manage through credibility and technical engagement, not just title — your team respects you because you understand their work
  • You are inquisitive, direct, and outcome-oriented — you form opinions, communicate them clearly, and own what happens next
  • Your colleagues are inspired to follow your lead

Responsibilities

  • Own site reliability and availability for our cloud-hosted platform — 24x7 uptime, monitoring, alerting, anomaly detection, and incident response programs
  • Drive security outcomes across the platform — tracking findings from SonarQube, penetration tests, and vulnerability tools, and acting as a business stakeholder to get remediation work scoped, prioritized, and into the engineering delivery pipeline
  • Own PCI Level 1 compliance currency — when standards evolve, you understand the requirement, translate it into engineering terms, and drive adoption; you don’t just surface the factoid
  • Participate as a business stakeholder in SAFe planning — bringing security, compliance, and reliability work into the engineering delivery pipeline alongside feature development
  • Lead infrastructure patching and maintenance — OS, database, and system-level currency within our AWS environment, coordinating monthly maintenance windows and CI-driven image refresh cadence
  • Manage and develop an internationally distributed team across SRE, DevOps, DBA, network, security, and corporate IT functions
  • Own the AWS cost and capacity budget — monitoring spend, optimizing resource utilization, and making strategic tradeoff decisions in partnership with engineering leadership
  • Partner with engineering directors to define the boundary between infrastructure and application-layer security, and ensure nothing falls between the cracks
  • Own vendor outcomes across our cloud and tooling ecosystem — holding partners accountable and ensuring contracts reflect our operational needs
  • Guide personal and career development of your people
  • Foster a culture where reliability and security are shared team values, not external mandates

Benefits

  • Discretionary Time Off (DTO) – Take time off when you need it. We trust our employees to manage their time responsibly while meeting business needs.
  • 15 Company-Paid Holidays – Including a company-wide break from Christmas Eve through New Year’s Day.
  • Comprehensive Benefits Package – Switchfly offers a full suite of health benefits, with the company covering an average of 87% of employee premiums.
  • 401(k) with Company Match – We support long-term financial wellness with a competitive retirement plan.