FLEX Senior Service Availability Manager

Marriott International | Bethesda, MD
120d | $52 - $80 | Hybrid

About The Position

The Sr. SRE Service Availability Manager plays a key role in ensuring the peak performance and availability of our Enterprise IT infrastructure and services. This position combines proactive site reliability engineering with adept incident command to lead our efforts in minimizing service disruptions and enhancing our technology landscape. With a focus on automation, cloud technologies, and continuous process improvement, the ideal candidate combines technical expertise with leadership skills to deliver exceptional service reliability. This role demands a proactive problem-solver with extensive experience in IT operations and a passion for innovation, ready to tackle challenges in a dynamic, 24x7x365 environment.

Requirements

  • 7+ years of experience in an information technology environment.
  • 5 years of experience in IT operations, including troubleshooting complex network, server, storage, and/or application issues.
  • A minimum of 3 years of operations experience in incident, problem, change, and release management, including leading calls and documenting outcomes.
  • Undergraduate degree or equivalent experience/certification.
  • Ability to cover shifts and take on-call responsibilities in a 24x7x365 environment.
  • Proficiency in scripting languages (Python, Shell) and familiarity with automation tools (such as Ansible, Jenkins).
  • In-depth knowledge of cloud platforms (AWS, Azure, GCP), infrastructure as code, and containerization technologies.
  • Strong leadership qualities, including decisiveness and the ability to motivate teams, along with the ability to manage stressful situations calmly and effectively.
  • Strong experience in incident command or incident management in a technology environment.
  • Excellent problem-solving, organizational, and analytical skills.

Nice To Haves

  • ITIL Foundations v3 Certification.
  • Demonstrated experience with ITSM suites, e.g., ServiceNow.
  • Demonstrated experience with various monitoring, performance, or capacity tools.
  • Experience with continuous integration/continuous deployment (CI/CD) pipelines and DevOps practices.
  • Ability to create constructive relationships, influence, and communicate with varying levels of associates and management.
  • Ability to solve complex, cross-functional issues.
  • Strong knowledge of Server, Storage, Network, Middleware, Application and Cloud technologies.
  • A high degree of curiosity and a drive to seek more efficient ways of delivering service.

Responsibilities

  • Serve as Incident Commander during major incidents, leading response efforts to restore services and minimize impact on business and consumer operations.
  • Design and implement automation tools to reduce manual intervention, improve system performance, and prevent incidents.
  • Develop and maintain comprehensive monitoring and alerting frameworks to detect and address anomalies before they escalate to incidents.
  • Collaborate closely with development, operations, and support teams for continuous improvement of service reliability and incident response processes.
  • Conduct thorough post-mortems to analyze incidents, identify root causes, and implement preventative measures to avoid recurrence.
  • Effectively communicate incident status, impact, and post-incident reports to stakeholders at all levels of the organization.
  • Stay informed on the latest industry trends, technologies, and practices in site reliability engineering and incident management.

Benefits

  • Medical, dental, vision coverage
  • Health care flexible spending account
  • Dependent care flexible spending account
  • Life insurance
  • Disability insurance
  • Accident insurance
  • Adoption expense reimbursements
  • Paid parental leave
  • 401(k) plan
  • Stock purchase plan
  • Discounts at Marriott properties
  • Commuter benefits
  • Employee assistance plan
  • Childcare discounts