Sr. Technology Reliability Engineer

American Transmission Company
Cottage Grove, WI
57d
$116,500 - $135,900

About The Position

Join a Great Place to Work! We're looking for a Technology Reliability Engineer to support cross-functional groups focused on the continuous monitoring of technology environments and on the rapid, effective response to events and incidents that may impact system performance, security, compliance or availability. We're looking for someone who has a passion for problem-solving, a strong understanding of modern infrastructures and a commitment to delivering exceptional reliability and operational excellence.

In this role, you'll use your three or more years of experience in reliability engineering, systems engineering or a related technology operations role, along with strong knowledge of enterprise technology systems (including networks, servers, cloud platforms and application stacks), to develop, implement and optimize monitoring solutions that deliver real-time visibility into the health and performance of critical technology services and infrastructure across ATC. You'll collaborate with key stakeholders to detect, triage and respond to technology events and incidents, minimizing downtime and business impact. You'll participate in post-incident reviews, analyzing and documenting root causes of system failures and recommending corrective actions to mitigate future risks. You'll also advocate for and implement reliability engineering best practices such as failover strategies, automated recovery and robust alerting mechanisms.

ATC embraces flexibility in our work and our workplace, depending on your schedule for the day and the needs of the business. If you enjoy being a technical resource for teams accountable for monitoring an enterprise network, responding to alerts and alarms, and improving network and cyber asset reliability, we want you to bring your positive energy to ATC!

Requirements

  • Three or more years of experience in reliability engineering, systems engineering or a related technology operations role
  • Strong knowledge of enterprise technology systems, including networks, servers, cloud platforms and application stacks

Responsibilities

  • Develop, implement and optimize monitoring solutions that deliver real-time visibility into the health and performance of critical technology services and infrastructure across ATC
  • Collaborate with key stakeholders to detect, triage and respond to technology events and incidents, minimizing downtime and business impact
  • Participate in post-incident reviews, analyzing and documenting root causes of system failures and recommending corrective actions to mitigate future risks
  • Advocate for and implement reliability engineering best practices such as failover strategies, automated recovery and robust alerting mechanisms

Benefits

  • Annual incentive bonus
  • Employer-sponsored pension plan
  • 401(k) match
  • HSA contribution
  • Life & disability insurance
  • Health care benefits
  • Generous time off plans
  • Flexible work arrangements

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Industry

Utilities

Education Level

No Education Listed

Number of Employees

501-1,000 employees
