Staff Site Reliability Engineer

Zscaler, San Jose, CA
Hybrid

About The Position

About Zscaler

Zscaler accelerates digital transformation so our customers can be more agile, efficient, resilient, and secure. Our cloud native Zero Trust Exchange platform protects thousands of customers from cyberattacks and data loss by securely connecting users, devices, and applications in any location.

Here, impact in your role matters more than title, and trust is built on results. We believe in transparency and value constructive, honest debate; we’re focused on getting to the best ideas, faster. We build high-performing teams that can make an impact quickly and with high quality. To do this, we are building a culture of execution centered on customer obsession, collaboration, ownership, and accountability. We value high impact and high accountability, paired with a sense of urgency, in an environment where you’re enabled to do your best work and embrace your potential. If you’re driven by purpose, thrive on solving complex challenges, and want to make a positive difference on a global scale, we invite you to bring your talents to Zscaler and help shape the future of cybersecurity.

Our Engineering team built the world’s largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today’s cloud security leader, with more than 15 million users in 185 countries. Bring your vision and passion to our team of cloud architects, software engineers, security experts, and more who are enabling organizations worldwide to harness speed and agility with a cloud-first strategy.

We’re seeking a highly skilled and experienced SRE Platform Engineer to join our SRE Cloud Platform Engineering Team. Reporting to the Director of Cloud Engineering, you will be responsible for the areas listed under Responsibilities below.

Requirements

  • Bachelor’s degree in Computer Science or equivalent practical experience
  • 5+ years of experience in Cloud-SRE, DevOps, or Systems Engineering with a focus on software development
  • Proficiency in Linux systems and cloud platforms (GCP, AWS, Azure)
  • Experience with automation/tools: Kubernetes, Terraform, Ansible, Docker, Jenkins
  • Strong problem-solving and collaboration skills, with a proactive approach to teamwork

Nice To Haves

  • Familiarity with observability tools like OpenTelemetry, Prometheus, and Grafana Stack
  • Experience with PostgreSQL and time-series analytics databases (e.g., ClickHouse, Redis)
  • Knowledge of MLOps and Generative AI applications within SRE environments

Responsibilities

  • Designing and maintaining scalable infrastructure solutions to support Zscaler’s global cloud services
  • Enhancing observability practices across infrastructure and applications through monitoring, logging, tracing, and automated incident responses
  • Developing automation tools for deployment, patching, scaling, and infrastructure management
  • Building portals for SRE dashboards, service level indicators, objectives, and agreements (SLI/SLO/SLA), and metrics that support data-driven decision-making
  • Partnering with product, operations, and security teams to integrate features, tools, and updates across the platform

Benefits

  • Various health plans
  • Time off plans for vacation and sick time
  • Parental leave options
  • Retirement options
  • Education reimbursement
  • In-office perks, and more!