Software Engineer (SRE) - Remote

UnitedHealth Group
Eden Prairie, MN
Remote

About The Position

Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data, and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits, and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.

We are seeking a Software Engineer with strong Site Reliability Engineering (SRE) capabilities to support a critical modernization initiative within a call center applications team. This role will focus on improving application stability, supporting cloud-based deployments, and enabling the transformation of legacy call center platforms into digital self-service chat and voice bot solutions. The engineer will work collaboratively across multiple teams, supporting cloud environments, deployments, and incident response activities to ensure highly available and resilient systems.

If you reside in Minnesota or the District of Columbia, you will enjoy the flexibility of a hybrid-remote role as you take on some tough challenges.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 5+ years of hands-on experience deploying and supporting applications on cloud platforms (GCP, AWS, Azure)
  • 4+ years of experience working in a Site Reliability Engineering (SRE), DevOps, or production support-focused software engineering role
  • 3+ years of experience with cloud-based deployments and CI/CD pipelines, including troubleshooting deployment and region-related issues
  • 3+ years of experience with application performance monitoring and logging tools (Splunk, Dynatrace, Grafana)
  • 3+ years of experience writing scripts or tools in Python, React, or similar languages
  • Experience with Terraform and provisioning environments
  • Experience querying logs, tracing transactions, and identifying root causes of application issues

Nice To Haves

  • Experience supporting large-scale or legacy system modernization efforts
  • Familiarity with call center platforms or digital self-service / bot-enabled applications
  • Exposure to global delivery models with teams based in multiple regions
  • Strong cross-team collaboration skills, with a proven ability to work across distributed teams and time zones

Responsibilities

  • Partner with cross-functional engineering and operations teams to ensure application reliability, stability, and performance
  • Support cloud environment provisioning, readiness, and ongoing operations
  • Assist with and monitor pipeline setup and cloud deployments, including daily and nightly deployments
  • Participate in production support, war rooms, and incident response efforts, helping to diagnose and resolve issues quickly
  • Debug issues across regions by tracing logs and analyzing system behavior in cloud environments
  • Leverage application performance monitoring tools to identify, troubleshoot, and prevent system issues
  • Support the reliability, availability, and performance of distributed systems across cloud, edge, and device environments
  • Help define, measure, and monitor SLIs and SLOs for services
  • Identify reliability risks and collaborate with senior engineers on mitigation plans
  • Participate in on-call rotations and assist with incident response and post-incident reviews
  • Contribute improvements to runbooks, automation, and tooling that reduce alert noise and operational toil
  • Help enhance detection, alerting, and response workflows
  • Implement and improve telemetry using OpenTelemetry, Grafana, Splunk, and related tools
  • Build dashboards and tools that improve visibility into system health and AI service behavior
  • Ensure observability data is complete, accurate, and actionable
  • Support safe, reliable deployment workflows including canaries, staged rollouts, and automated rollbacks
  • Assist in improving CI/CD systems and deployment tooling
  • Work closely with senior SREs, DevOps engineers, AI/ML teams, and platform engineers
  • Contribute to reliability reviews, operational readiness checks, and cross-team projects
  • Advocate for modern SRE and DevOps practices within the organization

Benefits

  • Comprehensive benefits package
  • Incentive and recognition programs
  • Equity stock purchase
  • 401(k) contribution