Tubi • Posted 2 months ago
Senior
Hybrid • San Francisco, CA
501-1,000 employees
Broadcasting and Content Providers

The Infrastructure team at Tubi builds and operates the core platforms that power our services at scale. We provide reliable, scalable, and developer-friendly systems for compute, networking, observability, and deployment.

As a Senior Software Engineer on the Infrastructure team, you will be responsible for reliable service delivery and efficient traffic management across large-scale Kubernetes environments. You will design and implement traffic strategies, build and optimize release pipelines, and use Infrastructure as Code (IaC) to manage cloud resources with consistency and traceability. You will collaborate with cross-functional teams to deliver scalable, high-performing cloud solutions, working closely with application developers and gaining exposure to a wide range of technologies, including live streaming, customer customization, and large-scale video transcoding pipelines.

Responsibilities:

  • Manage and scale multi-cluster Kubernetes deployments, ensuring high availability, performance, and reliability.
  • Design and implement traffic strategies (e.g., canary releases, blue/green deployments, A/B testing, gradual rollouts) using Istio/Envoy or similar service mesh technologies.
  • Build and maintain CI/CD pipelines, automate deployments and rollbacks, and improve release efficiency and reliability.
  • Use Terraform and other IaC tools to provision and manage cloud infrastructure, ensuring consistency and auditability.
  • Establish monitoring, logging, and tracing solutions; troubleshoot and resolve production issues quickly to maintain system stability.
  • Write and maintain clear technical documentation (system architecture, release processes, traffic policies, runbooks, best practices) to enable effective onboarding and collaboration.
  • Partner with developers, SREs, and platform teams to design scalable release and traffic strategies, and drive adoption of engineering best practices.

Requirements:

  • 5+ years of experience with IaC on a cloud provider (AWS).
  • 3+ years of experience with production Kubernetes clusters.
  • Hands-on experience managing Kubernetes in production environments.
  • Strong understanding of service mesh technologies (Istio, Envoy, or similar).
  • Expertise in CI/CD workflows and tools such as ArgoCD, FluxCD, GitHub Actions, or Jenkins.
  • Solid foundation in Linux, networking, and containerization.
  • Strong technical writing skills, with the ability to produce clear, structured documentation for both technical and non-technical audiences.
  • Strong problem-solving skills, with proven experience in high-pressure incident response.
  • Excellent communication and collaboration skills, with a mindset for driving engineering efficiency and quality.

Nice to have:

  • Experience operating large-scale, multi-cluster Kubernetes environments.
  • Deep understanding of release strategies and traffic routing algorithms.
  • Previous experience as an SRE or Release Engineer in high-availability systems.
  • Programming skills in Go or Python for automation tooling.