Staff DevOps Engineer

Cast & Crew
$190,000 - $235,000

About The Position

We are looking for a Staff DevOps Engineer to serve as a technical anchor for our platform engineering practice. In this role you will own the design and evolution of our CI/CD pipelines, Kubernetes infrastructure on AWS EKS, and the developer experience tooling that hundreds of engineers depend on daily. Staff-level engineers at this organization are expected to operate with significant autonomy, identify and resolve systemic problems before they become incidents, and raise the technical bar across the teams they partner with.

Requirements

  • 8+ years of DevOps or platform engineering experience, with at least 2 years operating at a Staff or Principal level in an organization of 100+ engineers.
  • Deep, hands-on expertise with Kubernetes — EKS specifically preferred — including troubleshooting workloads, networking, storage, and cluster operations at scale.
  • Strong command of Azure DevOps Pipelines, including YAML pipeline authoring, library management, service connections, and environment promotion gates.
  • Proven track record designing and maintaining CI/CD systems for microservice architectures with multiple independent teams as consumers.
  • Experience operating observability platforms (New Relic, Datadog, or similar) to drive proactive reliability improvements, not just reactive alerting.
  • Proficiency in at least one scripting language (Python, Bash, or Go) and Infrastructure-as-Code tooling (Terraform, Pulumi, or CDK).
  • Familiarity with feature flag patterns and operational considerations around progressive delivery (Unleash or equivalent is a plus).
  • Excellent written communication skills — you default to documentation and can translate complex infrastructure decisions into guidance engineers actually read.

Nice To Haves

  • Experience with data engineering or ML infrastructure workloads on Kubernetes (Spark on EKS, Argo Workflows, Airflow).
  • Background contributing to or maintaining internal developer portals (Backstage or similar).
  • Familiarity with FinOps practices and tooling for AWS cost attribution and optimization across shared Kubernetes clusters.
  • Experience in SRE-adjacent roles; comfort with SLO/SLI definition and error budget policy.

Responsibilities

  • Architect and continuously improve CI/CD pipelines in Azure DevOps, including pipeline-as-code standards, templating strategies, and artifact promotion workflows across environments.
  • Own the health and evolution of our AWS EKS clusters — node lifecycle, autoscaling, networking (VPC/CNI), RBAC, and cluster upgrades with minimal service disruption.
  • Design and enforce Infrastructure-as-Code practices using Terraform or equivalent tooling; champion GitOps patterns across engineering teams.
  • Drive platform reliability improvements informed by observability data from New Relic, working closely with SRE to translate dashboards and alerts into actionable platform changes.
  • Define and maintain golden-path templates for containerized workloads — Dockerfile standards, Helm chart libraries, and local development parity with production.
  • Partner with engineering teams to accelerate onboarding of new services onto the platform and reduce toil through automation.
  • Act as an escalation point for complex infrastructure incidents coordinated through PagerDuty; participate in on-call rotation and lead post-incident reviews for platform-layer failures.
  • Identify recurring failure modes and drive systemic fixes that reduce page volume and MTTR across the platform.
  • Maintain and improve runbooks and platform documentation in Confluence, ensuring knowledge is accessible and current.
  • Define and socialize DevOps standards — pipeline design, container hygiene, secret management, and deployment safety — across a multi-team engineering organization.
  • Conduct architecture reviews and provide technical guidance on infrastructure-impacting decisions made by product engineering teams.
  • Mentor senior and mid-level engineers; grow internal platform capability through pairing, code review, and structured knowledge sharing.
  • Identify tooling gaps and build the business case for platform investments, working with engineering leadership to prioritize roadmap items.

Benefits

  • Medical, dental, and vision coverage, PTO, health and wellness programs, employee discounts, and more.

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed
Number of Employees: 11-50 employees

© 2024 Teal Labs, Inc