Senior Incident Manager

Zendesk | Washington, DC
Posted 1d ago | $145,000 - $217,000 | Hybrid

About The Position

As a Senior Incident Manager at Zendesk, you will respond to production incidents and coordinate engineering response efforts as they occur within your region. This includes coordinating all activities associated with the engineering Incident Management process. You will support teams across the organization, primarily Engineering, Product Development and Customer Advocacy, in responding to, investigating, managing and resolving product incidents.

Requirements

  • BS/BA in a relevant field or equivalent experience, and 5+ years of directly related experience in a SaaS or hosted application service provider environment
  • Expert knowledge of Incident and Problem Management ITIL terms and practice
  • Familiarity with overall ITIL terminology and practice
  • Experience facilitating review of technical incidents, documenting actions, and encouraging cooperative problem solving
  • Strong communication skills and business acumen, and the ability to ensure a consistently high level of customer satisfaction
  • Understanding of IT operational processes, software development paradigms, and common SaaS provider architecture
  • Enthusiasm for working in a fast-paced environment while remaining analytical and detail-oriented
  • Proactive, with excellent decision-making skills and the ability to identify, prioritize, and articulate the highest-impact tasks
  • Ability to own deliverables independently and drive your work from start to finish within specified timelines
  • Exceptional written and verbal communication skills, with strong attention to detail
  • Collaborative, upbeat work ethic where you can take ownership and have fun

Responsibilities

  • Be the Incident Commander driving response and resolution of Severity 1 - 4 incidents
  • Support the response for Severity 0 incidents
  • Participate in the on-call rotation
  • Drive down Median Time to Respond by analyzing response and driving improvements
  • Drive data analysis to identify areas for improvement and underlying problems to support reliability improvements
  • Drive the identification of underlying problems and shepherd them through the Proactive Problem Management process
  • Contribute to Incident Management reporting to ensure transparency to audiences across Zendesk
  • Make sure our documentation and training remain up to date
  • Provide Incident Management and Proactive Problem Management training across Zendesk
  • Support Engineering teams' root cause analysis
  • Be a mentor to all Incident Managers
  • Ensure the Incident Management process keeps up with current trends to remain best in class in our industry
  • Run and facilitate the incident response as the Incident Commander
  • Respond to incidents as they occur within your region when on call or as needed
  • Create and maintain incident documentation and data
  • Coordinate incident response logistics
  • Work with customer-facing counterparts to ensure communications are detailed and timely
  • Draft the incident report and assign it to the appropriate parties
  • Facilitate post mortem incident review forums that include a global audience
  • Manage Remediation Item process including identification, ticket creation, tracking, follow up and reporting
  • Facilitate prioritization of incident remediation items to ensure that identified open issues are resolved
  • Consistently work to keep related remediation projects moving forward
  • Drive more detailed root cause analysis for high severity incidents that cross engineering teams
  • Support immediate ongoing investigations and escalate on high risk discoveries
  • Manage Root Cause Analysis process and provide details for reporting
  • Ensure all incident details are accurate and fully documented across the incident report, tickets and system status dashboards
  • Contribute to weekly reporting to ensure transparency to various audiences across Zendesk
  • Refine and improve incident processes to drive the team toward SLO goals and ensure a repeatable customer experience
  • Assist in Risk Assessment Activities
  • Support and backup other Incident Managers