Site Reliability Engineer

RingCentral, Denver, CO

About The Position

Say hello to opportunities. If you’re looking to be part of what’s next in communication, you’re in the right place. At RingCentral, we believe the best customer experiences happen when humans and AI work together. Our agentic voice AI portfolio (AIR, AVA, and ACE) brings together automation, assistance, and insights across the entire conversation lifecycle. The result? More seamless, intelligent experiences for businesses everywhere. With $2.5B+ in ARR and $250M invested in R&D annually, we’re building the future of AI-powered business communications.

This is where you and your skills come in. We’re currently looking for an experienced Site Reliability Engineer (SRE) to join the RingCentral Collaboration team. As an SRE, you will be responsible for maintaining and improving uptime and availability across several of our services. You will play a crucial role in ensuring the reliability, performance, and availability of our services by identifying potential issues and proactively resolving them. The ideal candidate has a background in various service observability platforms, as well as experience with containerization using Kubernetes, message queuing systems like Kafka, and SQL/NoSQL databases. Programming experience is desired for the role.

Requirements

  • 6+ years of proven experience as an SRE or in a similar role.
  • Problem-solving and troubleshooting skills.
  • In-depth knowledge of Linux.
  • Knowledge of at least one programming language (see Preferable technology stack).
  • Experience with cloud platforms.
  • Knowledge of one or more configuration management tools.
  • Ability to work in a diverse multicultural environment, communicating with globally distributed teams.
  • Team player and self-starter with a strong drive to dig deep and solve problems.
  • Fluent in spoken and written English.

Nice To Haves

  • B.S. in Computer Engineering, Computer Science, or equivalent experience with 4+ years of related experience
  • Proven experience influencing the software engineering of cloud/SaaS services
  • Familiarity with AI, LLM, and various related technologies
  • Deep understanding of the DevOps Lifecycle and application of it within organizations
  • Deep understanding of SRE principles and fundamentals

Responsibilities

  • Collaborate with development and operations teams to integrate monitoring solutions into the software development lifecycle and operational processes.
  • Define, propose, and drive efforts to continually improve monitoring, troubleshooting, and self-healing for our services.
  • Design and implement redundancy, failover mechanisms, and load-balancing strategies to ensure system reliability.
  • Conduct risk assessments, identify potential points of failure in the infrastructure, and propose solutions to fix them.
  • Respond to incidents and outages while on call, and take action to mitigate them.
  • Stay on top of capacity requirements in a growing environment.
  • Actively work with various teams’ codebases to extend observability and improve uptime.
  • Represent the team in global incident resolution, and participate in the on-call rotation.

Benefits

  • Comprehensive medical, dental, vision, disability, and life insurance
  • Health Savings Account (HSA), Flexible Spending Accounts (FSAs), and commuter benefits
  • 401(k) match and ESPP
  • Paid time off and paid sick leave
  • Paid parental and pregnancy leave and new parent gift boxes
  • Family-forming benefits (IVF, preservation, adoption, etc.)
  • Emergency backup care (Child/Adult/Pets)
  • Employee Assistance Program (EAP) with counseling sessions available 24/7
  • Free legal services that provide legal advice, document creation and estate planning
  • Employee bonus referral program
  • Student loan refinancing assistance
  • Employee perks and discounts program