Senior Site Reliability Engineer - Remote

UnitedHealth Group - Eden Prairie, MN
Remote

About The Position

The Sr. Site Reliability Engineer will architect, develop, and maintain Optum Serve’s cloud environment in both the commercial and government clouds. The role works closely with software engineers, architects, and DevOps engineers to design and maintain a secure, resilient, and high-performance cloud infrastructure. You’ll enjoy the flexibility to work remotely from anywhere within the U.S. as you take on some tough challenges. Hires in the Minneapolis or Washington, D.C. area will be required to work in the office a minimum of four days per week.

Requirements

  • 6+ years of experience working in a Site Reliability Engineering, Cloud Engineering, or DevOps role
  • Experience with infrastructure as code (IaC) tools like Terraform, Pulumi
  • Experience with Kubernetes deployment tools like Helm, ArgoCD, Flux
  • Experience supporting infrastructure in production cloud environments
  • Experience working with RESTful services
  • Some experience with monitoring tools (Azure Monitor, Splunk, Dynatrace, Grafana, Prometheus)
  • Knowledge of Encryption, Public Key Infrastructure (PKI), understanding of OWASP
  • Expert knowledge of at least one major cloud service provider (Azure preferred, AWS acceptable)
  • Expert knowledge of and hands-on production experience with Kubernetes (bare metal or managed) cluster setup and management
  • Understanding of identity and access management (IAM)
  • Familiarity with IDEs and source control tools such as Visual Studio Code, GitHub, GitLab
  • Solid awareness of networking and internet protocols
  • Ability to participate in a 24/7 on-call rotation following documented procedures and escalation paths
  • United States Citizenship
  • If you are offered this position, you will be required to provide extensive personal information to obtain and maintain a suitability or determination of eligibility for a Confidential/Secret or Top Secret security clearance as a condition of your employment
  • All employees working remotely will be required to adhere to UnitedHealth Group’s Telecommuter Policy

Responsibilities

  • Build, operate and support IaaS and PaaS infrastructure in Azure and AWS commercial and government clouds
  • Work closely with dev teams to identify and measure SLOs, SLAs and SLIs
  • Act as a solid contributor to the development of platform services, including architecture, provisioning, configuration, deployment, and support
  • Perform integrations with central logging, metrics dashboards, instrumentation, incident monitoring and management
  • Build/integrate/administer systems and tools that enable engineering teams to observe their applications in production with autonomy (Dashboards, APMs)
  • Support software and/or cloud infrastructure as part of an on-call rotation
  • Assist with identifying and remediating technical problems at the root cause by continuously implementing automation, self-healing, and real-time monitoring for production systems
  • Maintain and improve operational tooling and frameworks
  • Build frameworks that test the performance and resiliency of our platform services/tools
  • Automate alerts for metrics on performance, cost, vulnerabilities, risk, compliance violations
  • Improve processes and champion automation of any manual items around support

Benefits

  • a comprehensive benefits package
  • incentive and recognition programs
  • equity stock purchase
  • 401k contribution