Site Reliability Engineer

BT Group, Iuka, IL
Posted 55 days ago

About The Position

As a Site Reliability Engineer (SRE), you will play a critical role in ensuring BT delivers exceptional service performance, reliability, and availability across its digital platforms. In a fast-paced, cloud-driven, AI-enabled environment where customers expect seamless experiences, you will deliver scalable, fault-tolerant, and cost-effective solutions through cross-team collaboration, automation, monitoring, and resilience strategies. By minimising downtime, reducing operational risk, and accelerating innovation, you will safeguard BT’s reputation for reliability while helping the business adapt quickly to emerging technologies and deliver consistent value to customers worldwide.

Requirements

  • A deep understanding of full-stack monitoring solutions, such as Dynatrace, to track the end-to-end performance and trends of owned CDO applications.
  • Strong proficiency in one or more programming languages (e.g. Java, Python).
  • Experience with cloud platforms (AWS, Azure, or GCP).
  • Solid understanding of software architecture, design patterns, and microservices.
  • Familiarity with CI/CD tools and DevOps practices.

Nice To Haves

  • AIOps fundamentals (cross-domain telemetry ingestion, event correlation, topology/context building, and remediation augmentation).
  • Agentic/autonomous observability skills (using intelligent agents to detect anomalies, correlate signals, and trigger guarded remediations to cut MTTR).
  • AI-assisted alerting and noise reduction (designing contextual, business-impact-aware alerts; prioritisation via ML).

Responsibilities

  • Implement and optimise CI/CD pipelines, automation frameworks, and infrastructure-as-code solutions using AWS, GitOps, and container technologies.
  • Design, develop, and troubleshoot large-scale distributed systems across on-prem and cloud environments, ensuring reliability and scalability.
  • Lead performance and scale testing, monitoring, and analysis to improve system stability, security, and efficiency.
  • Drive automation initiatives to eliminate manual toil, reduce detection and resolution times, and enhance operational resilience.
  • Proactively identify and mitigate risks, perform root cause analysis, and implement preventive measures following incidents.
  • Champion best practices in Site Reliability Engineering, mentor team members, and share knowledge on emerging trends and technologies.
  • Collaborate across organisational boundaries to deliver improvements aligned with broader SRE initiatives.

Benefits

  • An annual on-target bonus of 10% (subject to personal and company multipliers).
  • BT Pension scheme: minimum 5% employee contribution, BT contribution 10%.
  • Exclusive colleague discounts on our latest and greatest BT broadband packages.
  • 50% off EE mobile pay monthly or SIM only plans, and 50% discount for friends and family on EE SIM only plans.
  • Discounted EE TV, including TNT Sports and the NOW Entertainment membership.
  • Support for working parents, including pay during maternity, adoptive, and paternity leave.
  • 25 days annual leave (not including bank holidays), increasing with service.
  • Volunteering days, so you can give back to your local community.
  • Brand new electric vehicle salary sacrifice arrangement, known as ‘My EV’.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000 employees
