About The Position

The Senior Manager, DevOps Platform Engineering – Observability & Developer Experience, leads a global DevOps platform engineering team responsible for defining and executing enterprise observability strategy and advancing developer platform capabilities. This role establishes enterprise-wide visibility into system health and customer-impacting services, enabling proactive detection, rapid resolution, and prevention of service disruptions. By advancing automation-first platforms and modern observability practices, this leader improves availability, reduces incident impact, strengthens customer trust, and elevates enterprise reliability maturity. The position requires strong executive presence and the ability to communicate platform health, risk posture, reliability trends, and strategic investment priorities to senior and executive leadership.

Requirements

  • 5+ years of progressive people leadership experience
  • 8+ years in DevOps, Platform Engineering, Reliability Engineering, or enterprise platform administration
  • Experience operating in a continuous support environment
  • Experience with enterprise application performance monitoring and observability solutions
  • Experience with enterprise developer platforms
  • Executive-level communication and presentation skills
  • Experience with cloud platforms and modern software delivery practices

Responsibilities

  • Lead a global DevOps platform engineering team with enterprise accountability
  • Define and execute multi-year strategy and roadmaps for observability and developer platforms
  • Establish platform-as-a-product operating models, governance standards, and adoption frameworks
  • Drive enterprise adoption of reliability and observability standards across engineering teams
  • Champion automation, self-service capabilities, and engineering enablement at scale
  • Present strategy, reliability performance, and business-impact outcomes to executive leadership
  • Define and institutionalize enterprise standards for logging, metrics, distributed tracing, alerting, and event correlation
  • Improve visibility into customer-impacting services and reduce time to detection and resolution
  • Own the strategy and optimization of enterprise application performance monitoring and observability platforms
  • Partner with engineering teams to ensure monitoring and reliability are embedded into application architecture and delivery practices
  • Translate observability insights into actionable improvements that prevent service disruptions and reduce operational risk
  • Oversee governance and administration of enterprise developer platforms
  • Improve automation, integration, and workflow efficiency to increase engineering productivity and reliability
  • Ensure compliance, security alignment, vendor management, and cost optimization
  • Maintain continuous operational support for critical enterprise platforms
  • Govern major incident response, escalation processes, and on-call standards
  • Establish and track service level objectives, reliability indicators, and customer-impact metrics
  • Drive ongoing resilience, automation, and operational maturity improvements
  • Develop and mentor a high-performing global team
  • Provide coaching, performance management, and career development
  • Foster a culture of ownership, accountability, reliability, and continuous improvement