Principal Engineer

Zoom, San Jose, CA
Hybrid

About The Position

Zoom is seeking a Principal Engineer to help advance the reliability and operational maturity of our DevOps teams. This role will also support broader engineering organizations where their work intersects with production operations. This is a senior leadership role focused on improving how our teams operate systems at scale. It emphasizes shifting away from reactive patterns toward more proactive and sustainable approaches informed by SRE principles. The role emphasizes execution, follow-through, and practical change, rather than planning or architecture ownership. It reports to the Head of DevOps and works closely with senior and executive leadership.

About the Team

With eight specialized departments, the engineering team functions as a highly collaborative, diverse powerhouse. Each department's mission is to deliver seamless and innovative communication solutions. These range from software development and machine learning to quality assurance teams that work to create and maintain Zoom's user-friendly interfaces and robust infrastructure. The team continues to push the boundaries of communication technology, bringing people together regardless of their physical distance.

Requirements

  • 15+ years of experience in engineering, DevOps, or infrastructure-adjacent roles
  • Motivate teams responsible for operating large-scale production systems
  • Have a track record of helping teams improve how they operate, not just what they deliver
  • Prioritize building and implementing low-touch, automation-first operating models for services at scale
  • Influence teams without direct authority
  • Remain pragmatic, steady, and effective in ambiguous situations
  • Communicate operational issues, tradeoffs, and improvement opportunities clearly to both engineers and leaders
  • Work effectively in regulated or compliance-sensitive environments

Responsibilities

  • Working closely with DevOps teams to understand current operating practices and identify areas where reactive behaviors increase operational risk or operational toil
  • Helping teams adopt automation-first, low-touch operating models that reduce manual intervention and human error
  • Supporting teams in applying systematic learning from failures, translating incident root causes into durable improvements to systems, tooling, and operating practices
  • Promoting convergence of infrastructure and operational practices across DevOps teams where shared approaches improve reliability and reduce fragmentation
  • Supporting teams operating in shared and regulated environments, including Zoom for Government, by helping establish clear, understandable operational expectations
  • Contributing to improved capacity planning and cost-aware reliability tradeoffs, adding structure and follow-through to existing efficiency efforts
  • Engaging with external vendors and partners where their systems or services materially affect Zoom’s reliability, cost, or compliance posture
  • Providing clear, thoughtful updates to senior leadership that summarize operational health, trends, and areas for improvement, including human-driven failure modes