About The Position

We are seeking a hands-on Platform Reliability Manager to lead the reliability, operations, and continuous improvement of several business-critical platforms used across Marketing, HR, and Business Operations. This role sits within Adobe Technology Services and is responsible for ensuring these platforms are dependable, observable, and well-operated at scale. You will lead a small, globally distributed team, setting operational standards, guiding incident response, and partnering closely with both internal partners and external SaaS providers. The ideal candidate brings a strong foundation in operating reliable platforms, along with the ability to communicate clearly above the technical layer and hold internal and external vendors accountable to the consistency and service levels Adobe expects.

Requirements

  • Experience managing and developing a global, distributed reliability team
  • Strong understanding of observability, incident management, and operational standard methodologies
  • Experience crafting or enforcing change and deployment processes that balance speed with stability
  • Demonstrated ability to manage vendor relationships, including setting expectations, reviewing performance, and driving accountability during incidents or service degradation
  • Familiarity with using AI effectively, through context curation and documentation, to achieve high velocity and quality in execution
  • Bachelor's degree in engineering or information systems
  • 10+ years of experience in a similar

Responsibilities

  • Own the reliability and operational health of enterprise operational efficiency platforms, with a mix of end-to-end ownership and shared operational responsibility
  • Lead and develop a geographically dispersed team across North America and Europe, including managing an on-call rotation
  • Establish and evolve a standard operational model for change management and incident response across platforms
  • Drive operational rigor through strong observability practices, including metrics, alerting, and insight into platform health
  • Lead response to major incidents, ensuring clear communication, effective coordination, root cause identification, and durable remediation
  • Act as the primary operational point of contact for SaaS platform vendors, holding providers accountable for reliability, incident response, and service commitments
  • Communicate platform health, risks, and tradeoffs in business-relevant terms to functional partners and leadership
  • Document operational standards as context for AI, and leverage AI to improve reliability practices

What This Job Offers

Job Type

Full-time

Career Level

Manager

Number of Employees

5,001-10,000 employees
