About The Position

At Docker, we make app development easier so developers can focus on what matters. Our remote-first team spans the globe, united by a passion for innovation and great developer experiences. With over 20 million monthly users and 20 billion image pulls, Docker is the #1 tool for building, sharing, and running apps, trusted by startups and Fortune 100s alike. We're growing fast and just getting started. Come join us for a whale of a ride!

Docker supports customers using Docker Hub, the largest and most popular container registry service in the world today. Millions of users (community developers, open source projects, and independent software vendors) push and pull Docker container images billions of times through Docker Hub.

We are seeking a Senior Escalation & Incident Manager to own the end-to-end experience for our most complex and critical customer issues. In this role, you sit at the junction of customer support, engineering, and product, ensuring that escalated issues and service incidents receive the urgency, consistency, and executive-level communication they demand. You will help build and improve the frameworks and standards that govern how escalations and incidents are handled, and serve as the voice of the customer when critical issues threaten to erode trust or impact retention.

Requirements

  • 6+ years of experience in escalation and incident management, SRE, or production operations in a cloud/SaaS environment
  • Proven experience leading high-severity incident response in complex distributed systems
  • Experience working in 24/7 on-call or escalation environments
  • Familiarity with compliance or security incident response
  • Experience building or scaling incident management programs
  • Strong understanding of cloud platforms (AWS, GCP, Azure) and observability tools (logs, metrics, tracing)
  • Exceptional communication skills with the ability to remain calm under pressure
  • Experience influencing cross-functional teams without direct authority
  • Ability to communicate effectively with both technical teams and executive stakeholders
  • Strong focus on process improvement and operational rigor
  • Data-driven approach to identifying trends and driving improvements

Responsibilities

  • Escalation/Incident Management & Resolution: Own the escalation lifecycle from intake to resolution, ensuring cases are triaged accurately, prioritized by business impact, assigned to the right resource, and driven to closure without falling through the cracks. Maintain hands-on involvement in the most critical escalations, providing guidance, coordinating engineering resources, and managing stakeholder communication in real time.
  • Team Mentorship & Development: Mentor, grow, and support a global team of Support Leaders and Engineers. Partner to set clear expectations for case quality, handling, and customer communication standards. Coordinate and train cross-functional teams to triage, mitigate, and resolve escalations and incidents quickly.
  • Customer & Executive Communication: Serve as a primary point of contact for enterprise customers and internal stakeholders during high-severity escalations and incidents. Craft and deliver clear, confident written and verbal updates. Manage expectations with precision: know when to reassure, when to escalate urgency internally, and when to bring in executive sponsorship.
  • Engineering & Product Partnership: Build strong working relationships with Engineering and Product to ensure escalated issues and incidents receive timely attention and appropriate prioritization. Advocate for customer-impacting bugs and systemic issues in roadmap and sprint planning conversations. Establish feedback loops that translate escalation patterns into actionable product and reliability improvements.
  • Process Design & Standards: Help define and maintain the escalation and incident criteria, process flow, SLA/SLO commitments, and communication protocols that govern how issues and incidents are handled. Ensure playbooks are current, consistently followed, and refined after major incidents or escalations. Partner with Product and Engineering to produce and deliver post-incident root cause analysis documentation.
  • Metrics & Operational Health: Own the KPIs that reflect escalation and incident team performance. Report regularly to Support and Engineering leadership with trend analysis and actionable recommendations. Use data to make the case for tooling improvements, staffing adjustments, or process changes.
  • Voice of the Customer: Synthesize escalation data and direct customer feedback into structured insights for Product, Engineering, and Customer Success. Identify recurring themes that indicate deeper systemic issues (whether in the product, documentation, onboarding, or support process) and champion resolution at the organizational level.

Benefits

  • Freedom & flexibility; fit your work around your life
  • Designated quarterly Whaleness Days plus an end-of-year Whaleness break
  • Home office setup; we want you comfortable while you work
  • 16 weeks of paid parental leave
  • Technology stipend equivalent to $100 net/month
  • PTO plan that encourages you to take time to do the things you enjoy
  • Training stipend for conferences, courses and classes
  • Equity; we are a growing start-up and want all employees to have a share in the success of the company
  • Docker Swag
  • Medical benefits, retirement and holidays vary by country
  • Remote-first culture, with offices in Seattle and Paris

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Education Level

No Education Listed

Number of Employees

251-500 employees
