The Home Depot • Posted 2 months ago
Full-time • Manager
5,001-10,000 employees

The Senior Manager, SRE leads a team of Site Reliability Engineers responsible for the reliability, performance, and operational support of our supply chain systems. The ideal candidate has a strong background in reliability reviews, performance engineering practices, destructive testing, production engineering, and operational support. The role requires a hands-on approach to driving continuous improvement in our systems and processes. You will also work with our most strategic vendors to ensure that third-party tools and applications are readily available to every product team that wants to use them. The Sr. Manager must be able to lead managers and their teams and to drive change management and process improvement.

  • Lead and mentor a team of Site Reliability Engineers, fostering a culture of continuous improvement and innovation.
  • Collaborate with cross-functional teams to ensure alignment on reliability and performance goals.
  • Conduct reliability reviews to identify areas for improvement and implement solutions to enhance system reliability.
  • Implement and promote performance engineering practices to ensure optimal system performance.
  • Develop and execute strategies for destructive testing to identify potential points of failure and improve system resilience.
  • Oversee production engineering efforts to ensure systems are designed for operational excellence and reliability.
  • Provide leadership in incident management and root cause analysis to resolve production issues and prevent recurrence.
  • Establish and maintain operational support practices, including monitoring, alerting, and incident response.
  • Drive continuous improvement initiatives in reliability, performance, and operational support.
  • Stay current with industry trends and best practices to ensure our systems and processes remain cutting-edge.
  • Must be eighteen years of age or older.
  • Must be legally permitted to work in the United States.
  • Mastery of an object-oriented programming language (preferably Java).
  • 6-8 years of relevant work experience.
  • Proven experience in reliability reviews, performance engineering, and destructive testing.
  • Strong understanding of production engineering and operational support practices.
  • Experience with supply chain systems and retail environments is a plus.
  • Excellent leadership and team management skills.
  • Strong problem-solving and analytical abilities.
  • Excellent communication and collaboration skills.
  • Mastery of a modern scripting language (preferably Python).
  • Mastery of a modern web application framework such as Ruby on Rails, Spring MVC, or Node.js.
  • Mastery of writing SQL queries against a relational database.
  • Proficiency in troubleshooting and issue resolution techniques.
  • Proficiency in system monitoring and log analysis techniques.
  • Experience managing vendor relationships.
  • Health care benefits
  • 401(k)
  • ESPP
  • Paid time off
  • Success sharing bonus