Principal DevOps Engineer

Zoom•San Jose, CA

62d•Hybrid

About The Position

We are seeking a Principal Meeting DevOps Engineer who combines deep technical expertise with broad system understanding. This engineer should be capable of diving into a wide range of services and identifying systemic issues across architecture, CI/CD flow, and containerization environments. This role requires technical leadership, analytical skill, and cross-team collaboration to drive reliability, scalability, and modernization.

About the Team

At Zoom, we’re building the next generation of Cloud and Colocation (Colo) infrastructure that powers seamless communication and collaboration for millions of users worldwide.

Requirements

15+ years of experience in DevOps/SRE for large-scale production systems.
Strong hands-on background in Linux systems, networking, and distributed systems.
Experience operating and designing low-latency, high-throughput backend services at global scale.
Knowledge of media or real-time communication systems (e.g., MMR, WebRTC).
Knowledge of TCP/IP, routing, DNS, load balancing, and packet capture tools.
Familiarity with colocation data center operations, including hardware provisioning and troubleshooting.
Demonstrated experience with Terraform, Ansible, Kubernetes, Docker, and modern CI/CD pipelines.
Strong problem-solving, debugging, and systems-level design skills.

Responsibilities

Leading deep-dive investigations across diverse services and environments.
Working across systems ranging from real-time media to web, team chat, and AI to uncover architectural or operational bottlenecks.
Designing and implementing improvements in deployment pipelines, orchestration frameworks, and CI/CD automation to increase reliability and release velocity.
Working closely with product and service owners to enhance containerization strategy, improve resource efficiency, and reduce operational friction.
Partnering with the Meeting DevOps and Cloud Infra teams to modernize hybrid infrastructure spanning colocation data centers, AWS, OCI, and other cloud providers.
Driving system observability, fault isolation, and resilience engineering, ensuring services meet strict availability and latency SLAs.
Providing technical mentorship to DevOps engineers and influencing best practices in automation, monitoring, and release engineering.
Championing a culture of data-driven reliability through postmortems, SLIs/SLOs, and continuous performance optimization.
