Alibaba Cloud
Posted about 2 months ago
$104,400 - $171,000/Yr
Full-time • Mid Level
Sunnyvale, CA

About the position

The mission of the Cloud Intelligence Group SRE (Site Reliability Engineering) Team is to ensure the stability of production environments, enterprise-grade cloud data reliability, and service continuity for the Cloud Intelligence Group. Our greatest challenge is guaranteeing uninterrupted business operations for cloud customers and achieving availability above 99.99%. Our goal is to establish a systematic stability assurance framework that integrates technology and management. This includes, but is not limited to, developing stability standards and metrics, driving major stability governance campaigns, building a stability-focused technical platform, managing production incidents, ensuring stability during large-scale customer events, and handling on-call responsibilities.

Responsibilities

  • Daily operations and maintenance of applications, databases, and middleware, as well as troubleshooting and answering customer inquiries.
  • Collaborating with R&D to develop critical support plans for peak periods based on customer business requirements, including preparation during the standby period, on-duty support during critical periods, and post-standby review.

Requirements

  • Bachelor's degree in a relevant field.
  • 5 years of work experience in Site Reliability Engineering or a related field.