DevOps/SRE Team Lead

Telestream
Remote

About The Position

Telestream is a leading provider of digital media tools and software solutions for the broadcast, streaming, and media industries. We empower content creators and distributors to produce and deliver high-quality video content while optimizing operations and maximizing revenue.

Our teams work diligently to innovate and support world-class services, and we are seeking a DevOps/SRE Team Lead with proven, hands-on Kubernetes expertise to drive the reliability and scalability of our video processing infrastructure and oversee a small team of SREs and DevOps Engineers. This is a deeply technical lead role that requires real-world experience administering production Kubernetes clusters, not theoretical familiarity. You will own CI/CD pipelines, infrastructure automation, and cloud platform operations in a fully remote environment where independent execution is essential. If you have built, broken, and fixed things in Kubernetes at scale while managing and mentoring a team, we want to hear from you.

Location: US Remote. Candidates must be legally authorized to work in the United States. This role is not eligible for employer-sponsored work authorization or visa sponsorship of any kind, now or in the future.

Our process includes a live, hands-on technical interview conducted via shared terminal and screen share. You will be asked to work through real Kubernetes and infrastructure scenarios in real time, with no take-home exercises and no slides. Candidates who are comfortable with the skills listed in the requirements will do well. Candidates who are not will find this stage difficult to navigate. We value people who are direct about what they know and what they’re still learning.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or equivalent
  • 5-8+ years of experience in DevOps/SRE, with 2-3+ years in a leadership role
  • Hands-on experience building and maintaining CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or equivalent) with direct integration into Kubernetes deployment workflows
  • Production-level experience with infrastructure as code (Terraform required; CloudFormation or Pulumi a plus), including managing cloud-hosted Kubernetes clusters (EKS, GKE, or AKS)
  • Experience with monitoring, logging, and observability tooling in Kubernetes environments (Prometheus, Grafana, Datadog, ELK/EFK stack, or equivalent); ability to build dashboards and alerts from scratch, not just consume existing ones
  • Demonstrated, hands-on Kubernetes experience in production environments: cluster administration, Helm chart authoring and management, RBAC configuration, persistent storage, horizontal/vertical pod autoscaling, and diagnosing and resolving real production failures (CrashLoopBackOff, OOMKilled, networking issues, etc.)
  • Strong troubleshooting skills with the ability to diagnose infrastructure and application issues live, under pressure, without reference materials—this is evaluated directly in our interview process
  • Proficiency in scripting languages (Python, Go, Bash, or PowerShell); ability to write and own automation scripts, not just modify existing ones
  • Strong communication, collaboration, and conflict resolution skills, and the ability to influence without authority

Responsibilities

  • Design, deploy, and administer production Kubernetes clusters, including workload scheduling, namespace management, RBAC, network policies, and cluster upgrades
  • Design and maintain continuous integration/deployment pipelines to automate testing and deployment, including Kubernetes-native delivery workflows using Helm and ArgoCD or equivalent
  • Track software performance, fix errors, troubleshoot systems, and implement preventative measures to ensure smooth workflows
  • Implement and manage infrastructure as code using Terraform or CloudFormation
  • Optimize cloud resources by implementing cost-effective solutions
  • Collaborate with various teams to ensure smooth deployment
  • Monitor performance and create new processes based on the resulting analysis
  • Implement security best practices, including automated compliance checks and secure code deployment
  • Manage the technical roadmap and architecture while mentoring SRE and DevOps Engineers (player/coach)
  • Hire, coach, and manage a team of DevOps Engineers and Site Reliability Engineers
  • Define the DevOps/Platform roadmap aligned with business goals (e.g., cloud cost optimization, automation maturity)

Benefits

  • Day-one medical, dental & vision coverage
  • 100% company-paid life + disability insurance
  • 401(k) with a sweet company match (up to 8%)
  • Quarterly HSA boosts & flexible spending accounts
  • Flexible time off (salaried) or PTO (hourly) + generous paid holidays
  • Pet insurance (yes, your dog gets benefits too)
  • Legal plan + extras like accident & critical illness coverage