Senior Lead Site Reliability Engineer

Zoom
San Jose, CA
$146,700 - $339,300
Hybrid

About The Position

As a Senior Lead Site Reliability Engineer, you will work on our hybrid systems across the globe. You will install, configure, and monitor new systems within a network of global data centers, and you will patch and maintain thousands of physical and cloud systems worldwide. To streamline operations, you will develop automation that reduces repetitive tasks, and you will analyze and address performance bottlenecks. You will also update and troubleshoot user access permissions, resolve network connectivity issues, and maintain system firewalls.

Requirements

  • 10+ years in SRE, production engineering, or large-scale systems administration.
  • Have experience with Linux system administration (systemd, cgroups, networking, filesystems, performance analysis).
  • Demonstrate coding ability in at least one programming language, e.g., Python.
  • Have experience with configuration management (Ansible), IaC (Terraform, Packer), CI/CD pipelines (Jenkins, GitLab), container orchestration (k8s, Docker) and observability platforms.
  • Have experience with incident response for mission-critical environments.
  • Possess a security-first mindset (TPM, secure boot, identity, secrets management).
  • Demonstrate networking expertise: BGP, load balancing, DNS, TLS, traffic engineering.
  • Have experience with chaos engineering and resilience testing.
  • Have experience with distributed storage systems such as Ceph.

Nice To Haves

  • Occasional weekend work may be required
  • Ability to work across the globe or multiple time zones

Responsibilities

  • Provide technical direction for cross-team initiatives and major incidents.
  • Mentor SREs and developers; define best practices and design patterns.
  • Partner with Security, Networking, and Platform teams on architecture roadmaps.
  • Influence vendor and hardware strategy for on-prem and cloud workloads.
  • Design self-healing platforms using automation, chaos engineering, and fault-tolerant patterns.
  • Optimize Linux systems at scale: performance tuning, kernel parameters, networking, storage, and security hardening.
  • Define best practices and advocate for them across the company.
  • Excellent communication skills and experience driving cross-team projects as a technical lead.
  • Able to participate in on-call shifts and incident management and work after hours/weekends for application releases/deployments.

Benefits

  • Award-winning workplace culture
  • A variety of perks, benefits, and options to help employees maintain their physical, mental, emotional, and financial health
  • Support for work-life balance
  • Opportunities for employees to contribute to their community in meaningful ways