About The Position

You will contribute to Zoom’s Big Data Infrastructure platform, designing and operating open‑source compute engines on Kubernetes. You will help build reliable and scalable systems that power analytics, machine learning, telemetry, and product insights across Zoom. The Big Data Infrastructure team runs open‑source data engines, such as Spark, Flink, and Trino, on Kubernetes. The team owns engine runtimes, automation, observability, multi‑tenant operations, and data lake integrations that support Zoom’s global analytics needs.

Requirements

  • Have experience running workloads on Kubernetes in production
  • Possess hands‑on expertise with Spark, Flink, or Trino
  • Build infrastructure with Terraform, Helm, and GitOps practices
  • Operate cloud environments (ideally AWS EKS)
  • Demonstrate understanding of distributed systems performance and architecture
  • Show solid debugging and root‑cause analysis skills
  • Drive improvements in platform reliability and scalability
  • Collaborate effectively across cross‑functional teams
  • Learn new open‑source engines and tools quickly
  • Communicate clearly in technical discussions and design reviews

Responsibilities

  • Designing Kubernetes infrastructure to run distributed compute engines
  • Building automation and IaC modules using Terraform and Helm
  • Implementing multi‑tenant resource isolation, RBAC, and secure access patterns
  • Operating engine runtimes and control plane components on EKS
  • Integrating data ingestion systems and object‑storage data lakes
  • Managing table operations such as compaction, retention, and partition management
  • Monitoring engine and cluster performance using modern observability tools
  • Debugging distributed jobs across Spark, Flink, and Trino runtimes
  • Automating CI/CD workflows and engine upgrade processes
  • Collaborating with data, ML, and SRE teams on platform improvements