Site Reliability Engineer

T-Mobile, Frisco, TX

About The Position

The Site Reliability Engineer at T-Mobile is instrumental in enhancing system reliability and resilience, ensuring our digital infrastructure operates seamlessly. By automating processes and reducing manual efforts, they minimize operational incidents and streamline software development and deployment. Their proficiency in programming, scripting languages, and incident response management fortifies our systems against disruptions. Through continuous learning and adaptation to new technologies, they drive innovation and maintain system robustness. Their contributions are vital to the stability and performance of T-Mobile's digital operations, directly impacting our service quality and operational efficiency. We are a team that encourages innovation and advocates an agile and open approach, truly working and playing in the Un-carrier way!

Requirements

  • Bachelor's Degree plus 3 years of related work experience OR advanced degree with 1 year of related work experience OR combination of education and experience deemed equivalent (Required)
  • Acceptable areas of study include Computer Science or Engineering
  • 2-4+ years of experience developing and maintaining CI/CD pipelines for software deployment.
  • 2-4+ years of experience implementing and managing cloud-native platforms and solutions.
  • 2-4+ years of experience guiding and mentoring teams in reliability engineering practices.
  • Problem Solving: Ability to identify, analyze, and resolve system reliability issues. (Required)
  • Scripting Languages: Proficiency in scripting languages such as Python or Bash to automate tasks and processes. (Required)
  • Incident Response Management: Skilled in managing and responding to system incidents to minimize downtime and impact. (Required)

Nice To Haves

  • Master's/Advanced Degree in Computer Science or Data Science (Preferred)
  • Technology Management: Experience in managing technology stacks and continuous integration/continuous deployment (CI/CD) pipelines. (Preferred)
  • Agile Methodologies: Understanding of Agile methodologies to improve and streamline processes. (Preferred)
  • Data Analysis: Ability to analyze system performance data to identify trends and improvement opportunities. (Preferred)
  • Innovation: Capability to drive innovation in system management and operations through new technologies and approaches. (Preferred)
  • Adaptability: Ability to adapt to new technologies and changes in the digital landscape to maintain system robustness. (Preferred)
  • AWS Certified DevOps Engineer: Certification demonstrates an individual's expertise in provisioning, operating, and managing distributed application systems on the AWS platform. (Preferred)
  • Certified Kubernetes Administrator (CKA): Certification validates the ability to use Kubernetes, which is crucial for automating deployment, scaling, and operations of application containers across clusters of hosts. (Preferred)
  • Site Reliability Engineering (SRE) Foundation Certification: Certification provides a foundational understanding of the SRE philosophy, practices, and tools to enhance the reliability and performance of systems. (Preferred)

Responsibilities

  • Automates processes to enhance system reliability and resilience.
  • Minimizes operational incidents through proactive monitoring and maintenance.
  • Streamlines software development and deployment processes.
  • Develops scripts and tools to reduce manual efforts in operational tasks.
  • Manages incident response to ensure rapid recovery and minimal disruption.
  • Adapts to new technologies to maintain and enhance system robustness.
  • Performs other duties and projects as assigned by business management.

Benefits

  • Employees enjoy multiple wealth-building opportunities through our annual stock grant, employee stock purchase plan, 401(k), and access to free, year-round money coaches.
  • We cover all of the bases, offering medical, dental and vision insurance, a flexible spending account, 401(k), employee stock grants, employee stock purchase plan, paid time off and up to 12 paid holidays (which total about 4 weeks for new full-time employees and about 2.5 weeks for new part-time employees annually), paid parental and family leave, family building benefits, back-up care, enhanced family support, childcare subsidy, tuition assistance, college coaching, short- and long-term disability, voluntary AD&D coverage, voluntary accident coverage, voluntary life insurance, voluntary disability insurance, and voluntary long-term care insurance.
  • Eligible employees can also receive mobile service and home internet discounts, pet insurance, and access to commuter and transit programs!