Senior Site Reliability Engineer

Supio
Seattle, WA
$168,000 - $220,000

About The Position

We’re looking for a hands-on, high-agency Site Reliability Engineer to help shape and scale the reliability layer of our stack. You'll own the release pipeline end-to-end — managing daily releases, weekly deploys, and hotfixes — while also automating infrastructure, monitoring systems, and GitHub workflows. This is a software engineering role, deeply embedded in DevOps culture, with significant autonomy and direct impact on the pace and safety of our shipping process. You’ll work closely with engineers, product leads, and company leadership to ensure uptime, speed, and confidence in every deploy.

Requirements

  • Have 3–6+ years in SRE, DevOps, or infrastructure roles with production ownership.
  • Started your career in software development — and still enjoy writing code.
  • Are fluent in, or at least familiar with, Bash, Python, TypeScript, and PostgreSQL.
  • Are a confident AWS operator and know your way around EC2, Lambda, RDS, IAM, and VPCs.
  • Have strong experience with GitHub workflows, including GitHub Actions and release automation.
  • Are comfortable using AI tools (Claude, ChatGPT, etc.) to generate code — and have the skill to audit and adapt that code to meet production standards.
  • Are familiar with CI/CD principles and enjoy owning the full deployment lifecycle.
  • Are comfortable being on-call and understand how to design systems for both speed and safety.
  • Can operate with a high level of autonomy in fast-moving, ambiguous environments.

Responsibilities

  • Own Deployments: Lead our release and deployment process — from daily rollouts to weekly deploys and hotfix coordination. Build safe, repeatable, and observable workflows.
  • GitHub Operations: Manage GitHub branching strategies, pull request flows, merge policies, and GitHub Actions. Set and enforce collaboration standards for the engineering team.
  • Infrastructure & Monitoring: Build and maintain resilient AWS-based infrastructure. Set up and manage observability tools (logs, metrics, traces), configure alarms, and be the first responder for incidents. Triage, escalate, or resolve based on impact.
  • Automation & Internal Tooling: Write scripts, services, and automations that reduce friction and improve deployment confidence. Using AI tools to generate code is encouraged and expected — you'll be comfortable guiding, adapting, and integrating AI-assisted outputs into production workflows.
  • Software Development: You’ll contribute code when needed — whether that’s building internal tools, improving system reliability, or unblocking a deploy. This is not a sprint-based role, but strong software fundamentals are key to success.
  • Support Global Teams: Work off-hours as needed to unblock offshore teams and maintain deployment velocity across time zones.