Site Reliability Engineer ID60188

AgileEngine
Downey, CA
Hybrid

About The Position

We are looking for an SRE Operations Engineer to keep production and staging environments running reliably across a cloud-based SaaS platform. You’ll respond to live incidents, reduce operational toil through automation, and improve observability using Kubernetes, Terraform, Grafana, and AWS. This is a hands-on role with real ownership across CI/CD pipelines, GitOps workflows, and on-call rotations.

Requirements

  • 2+ years of experience in Site Reliability Engineering, DevOps, or Production Operations
  • Experience with AWS supporting production environments
  • Experience supporting production SaaS applications
  • Strong understanding of CI/CD systems such as GitHub Actions, Jenkins, or CircleCI
  • Experience with GitOps and strong Git fundamentals
  • Experience using GitHub, Jira, and Confluence in collaborative environments
  • Experience with Kubernetes platforms or tooling such as EKS or kOps
  • Experience with Docker and containerization
  • Experience with observability and alerting tools such as Grafana, Prometheus, Loki, or PagerDuty
  • Experience with scripting languages such as Bash, Python, or Go
  • Experience with Infrastructure as Code tools such as Terraform or Helm
  • Ability to work within structured operational processes and SLAs
  • Strong written and verbal English communication skills
  • Self-driven with a growth mindset

Nice To Haves

  • AWS certifications such as Solutions Architect, DevOps Engineer, or SysOps Administrator
  • Experience in multi-tenant SaaS environments
  • Experience working in globally distributed teams
  • Familiarity with ChatOps practices
  • Experience improving monitoring quality and reducing alert fatigue

Responsibilities

  • Monitor and support production and staging environments in real time, ensuring high availability, performance, and stability
  • Respond to incidents, perform triage and root cause analysis, and contribute to post-incident reviews and remediation efforts
  • Participate in an on-call rotation with defined SLAs
  • Handle ad-hoc and unplanned operational requests from Product, Support, and internal teams
  • Maintain and enhance monitoring, alerting, dashboards, logs, and metrics, and improve observability practices
  • Support CI/CD pipelines, production releases, and GitOps workflows
  • Contribute to automation efforts to reduce operational toil
  • Maintain and improve Kubernetes-based infrastructure and containerized workloads
  • Support Infrastructure as Code practices and ongoing environment improvements

Benefits

  • Professional growth: Mentorship, TechTalks, and personalized growth roadmaps.
  • Competitive compensation: USD-based pay with education, fitness, and team activity budgets.
  • Exciting projects: Modern solutions with Fortune 500 and top product companies.
  • Flextime: Flexible schedule with remote and office options.