Director, Engineering Operations

TKO
Austin, TX
Hybrid

About The Position

On Location is a global leader in premium experiential hospitality, offering ticketing, curated guest experiences, live event production and travel management across sports, entertainment, fashion and culture. On Location provides unrivaled access for corporate clients and fans looking for official, immersive experiences at marquee events, including the Olympic and Paralympic Games, FIFA World Cup 2026, Super Bowl, NCAA Final Four, and more. An official partner and/or service provider to over 150 iconic rights holders, such as the IOC (the Milano Cortina 2026 and Los Angeles 2028 Olympic Games), FIFA, NFL, NCAA, UFC, WWE, and PGA of America, the company also owns and operates a number of its own unique experiences. On Location is a subsidiary of TKO Group Holdings, Inc. (NYSE: TKO), a premium sports and entertainment company.

TKO owns iconic properties including UFC, the world’s premier mixed martial arts organization; WWE, the global leader in sports entertainment; and PBR, the world’s premier bull riding organization. Together, these properties reach 1 billion households across 210 countries and territories and organize more than 500 live events year-round, attracting more than three million fans. TKO also services and partners with major sports rights holders through IMG, an industry-leading global sports marketing agency; and On Location, a global leader in premium experiential hospitality.

Requirements

  • 10+ years in software engineering operations, site reliability engineering, platform or DevOps leadership supporting 24x7 systems
  • Experience leading teams and improving their performance as measured against DORA metrics
  • Proven track record leading incident response and postmortems, with measurable reductions in MTTD, MTTI, and MTTR and increases in MTBF
  • Hands-on experience implementing observability and SLO/SLI frameworks
  • Strong background with CI/CD, trunk-based development, automated testing strategies, and release orchestration
  • Security-by-design mindset, with experience in IRP/SIRP operations and DevSecOps practices
  • Excellent stakeholder management; effective and concise communication skills with both technical and non-technical audiences
  • Ability to lead and execute through ambiguity and high-demand, high-stakes events

Nice To Haves

  • Experience with Datadog, feature flagging platforms, and progressive delivery
  • History of operating platforms supporting large-scale events, ticketing, payments, or high-traffic commerce
  • Experience applying AI in technical operations and/or software delivery
  • Background in capacity planning, performance engineering, and chaos testing
  • Familiarity with regulated environments and audit processes; comfortable publishing operational evidence and controls
  • Experience shaping org-level SDLC, STLC, and SSDLC standards across internal and partner teams

Responsibilities

  • Own end-to-end engineering operations across RTB: intake, triage, prioritization, change/release governance, incident response, and post-mortems
  • Drive AI-enabled operational efficiency and automation across the SDLC, STLC, and SSDLC
  • Establish comprehensive observability with golden signals, SLIs/SLOs, anomaly detection, auto-remediation, and cost/capacity insights
  • Define and uphold SLOs for critical domains and guest journeys (checkout, inventory sync, fulfillment, payments)
  • Standardize Datadog logs, metrics, traces, and RUM/synthetics to accelerate detection and root-cause analysis
  • Continuously measure and improve delivery performance through DORA metrics
  • Enforce release discipline: balanced planned vs. unplanned releases, readiness criteria, rollback playbooks, and event blackout windows
  • Support major events with elevated operational rigor: dry runs, performance testing, strict change controls, enhanced monitoring, and clear comms protocols
  • Partner with Business Operations, Technical Product, and Solutions Architecture to maintain a single, aligned view of priorities, dependencies, and SLAs
  • Lead post-event and incident post-mortems to drive continuous improvement of SOPs, runbooks, response protocols, and reliability
  • Mature incident and security response in close partnership with TechOps and Security & Compliance (IRP/SIRP)
  • Continuously reduce technical debt across performance, security, and maintainability
  • Foster a blameless learning culture with KPI/OKR-driven improvements and transparent communication
  • Publish clear weekly and monthly operational health and stability reporting

Benefits

  • health care
  • retirement
  • vacation
  • other paid time off
  • additional offerings