Software Reliability Engineer

T-Mobile
Atlanta, GA
$83,900 - $151,200
Onsite

About The Position

This role improves and protects the software and systems that support IT services by managing scalability, availability, latency, performance, security, and capacity. It supports the Subscription Product Engineering organization, including in-house subscription and customer lifecycle platforms that underpin critical business operations and customer-facing services across production and non-production environments.

The role primarily involves designing and maintaining continuous integration and continuous delivery (CI/CD) pipelines and building applications on cloud-native platforms. It stands apart by enabling continuous improvement of operational support through automation, monitoring, and reliability-focused practices. Success is measured by improved software delivery speed, reliability, operational efficiency, platform stability, and a consistent customer experience.

We are a team that encourages innovation and advocates an agile and open approach, truly working and playing in the Un-carrier way!

Requirements

  • Bachelor's Degree plus 2 years of related work experience OR combination of education and experience deemed equivalent (Required)
  • DevOps (Required)
  • Integration (Required)
  • Monitoring, observability, or operational support practices (Required)
  • Knowledge of Site Reliability Engineering principles, infrastructure automation, and operational support practices.
  • Knowledge of cloud-native architectures, containerized workloads, API integrations, and distributed systems.
  • Ability to analyze system performance, identify reliability risks, and support operational improvements.
  • Ability to create operational documentation, dashboards, alerts, and support runbooks.
  • Ability to collaborate with software engineering, DevOps, and platform teams in Agile delivery environments.
  • At least 18 years of age
  • Legally authorized to work in the United States

Nice To Haves

  • 2-4 years of relevant experience. (Preferred)
  • Experience working in an Agile and DevOps environment. (Preferred)
  • Experience in one or more of C, C#, Java, Perl, Python, or Go, or scripting experience in Shell or Perl. (Preferred)
  • Experience with Continuous Integration/Continuous Delivery tools such as Jenkins or CloudBees, and other automation tools. (Preferred)
  • Experience with DevOps tools such as Ansible, Chef, or Puppet. (Preferred)
  • Experience with Docker, Kubernetes, or similar container tools. (Preferred)
  • Experience with APM tools such as AppDynamics and logging tools such as Splunk. (Preferred)
  • Experience working in a cloud environment (public/private). (Preferred)
  • Experience in migrating to cloud or cloud native environments. (Preferred)
  • Experience supporting APIs, microservices, distributed applications, or enterprise production platforms. (Preferred)
  • Experience with infrastructure automation tools such as Terraform, Ansible, Chef, or Puppet. (Preferred)
  • Experience with observability or monitoring tools such as Splunk, AppDynamics, Dynatrace, Grafana, Prometheus, or similar platforms. (Preferred)
  • CI/CD pipeline support and deployment automation (Desired)
  • Production and non-production environment support (Desired)

Responsibilities

  • Apply DevOps automation tools to manage CI/CD pipelines and configuration for production and non-production environments.
  • Perform environment management and automated server provisioning to support scalable infrastructure.
  • Deliver software changes that improve availability, scalability, latency, and efficiency of IT services.
  • Create and manage dashboards, alerts, logging standards, and health checks to improve service quality, supportability, and visibility across services.
  • Contribute to software delivery process improvements including cloud enablement, containerization, and deployment automation.
  • Support cloud-native applications, APIs, microservices, and platform operations across production and non-production environments.
  • Troubleshoot production incidents, participate in root cause analysis, and support implementation of long-term reliability improvements with assistance from leadership and senior technical team members.
  • Partner with Software Engineering, DevOps, and platform teams to improve application resiliency, scalability, and deployment automation under established technical direction.
  • Contribute to operational readiness activities, including release validation, capacity planning, disaster recovery support, and environment support, under the guidance of senior leadership.
  • Participate in Agile ceremonies, production support activities, and continuous improvement initiatives.
  • Perform other duties and projects as assigned by business management.

Benefits

  • competitive base salary and compensation package
  • annual stock grant
  • employee stock purchase plan
  • 401(k)
  • access to free, year-round money coaches
  • medical insurance
  • dental insurance
  • vision insurance
  • flexible spending account
  • paid time off
  • up to 12 paid holidays
  • paid parental and family leave
  • family building benefits
  • back-up care
  • enhanced family support
  • childcare subsidy
  • tuition assistance
  • college coaching
  • short- and long-term disability
  • voluntary AD&D coverage
  • voluntary accident coverage
  • voluntary life insurance
  • voluntary disability insurance
  • voluntary long-term care insurance
  • mobile service & home internet discounts
  • pet insurance
  • access to commuter and transit programs