Reliability Engineer

CyrusOne
Midland, MI

About The Position

The Reliability Engineer is accountable for facility infrastructure reliability across mission-critical data center systems (power, cooling, controls). You will design, implement, and continuously improve asset strategies and work management processes to achieve uptime, safety, and cost objectives. Core work includes reliability analytics, PM optimization, MOP/SOP governance, change management, root cause analysis (RCA), and program execution for critical spares, condition monitoring, and lifecycle asset management.

  • Reliability Strategy & Asset Care: Develop and maintain equipment strategies (criticality, failure modes, maintenance prescriptions) for power and cooling systems. Own PM quality and audit activities; eliminate ineffective tasks and deploy optimized prescriptions.
  • Work Management Excellence: Author, review, and govern SOPs/MOPs/EOPs and change packages; ensure adherence through training and approvals. Partner with site teams to maintain CMMS schedules and O&M plans; lead reliability investigations and corrective actions.
  • Condition Monitoring & Analytics: Implement oil/coolant analysis, thermography, vibration, and battery monitoring; trend data to preempt failures.
  • Critical Spares & Lifecycle Management: Establish and maintain critical spares lists and stocking strategies; track gaps and remedial actions. Support lifecycle asset management processes to guide replacements and capital planning.
  • RCA & Continuous Improvement: Lead post-incident RCAs and FMEA; publish learnings and update procedures.
  • People & Certification: Collaborate with CE leaders to uphold operator certification and training standards; mentor technicians on reliability methods.

Requirements

  • 7 years of experience in reliability, maintenance engineering, or facilities engineering within mission-critical environments.
  • Expertise with RCM, FMEA, RCA, and maintenance optimization.
  • Familiarity with UPS, generators, switchgear, chillers, cooling towers, CRAH/CRAC, and BMS/EPMS.
  • Experience governing SOP/MOP/EOP, CMMS scheduling, and change management.
  • Ability to analyze condition monitoring data and turn findings into actions.
  • Proficiency in data analysis and visualization tools (Excel, Power BI, or similar) to mine CMMS, condition-monitoring, and operational data for trends, failure patterns, and predictive insights.
  • Ability to apply statistical methods or reliability modeling to support decision-making.
  • Strong communication skills; able to lead investigations and drive consensus.

Nice To Haves

  • Experience with critical spares programs and lifecycle asset management.
  • Experience with scripting or data science tools (Python, R) for reliability analytics, predictive modeling, or failure trend analysis.
  • Familiarity with SQL or data query languages for extracting and cleaning large operational datasets.
  • Knowledge of battery monitoring and generator fluid analysis programs.
  • Familiarity with NFPA and other regulatory and standards bodies.
  • Preferred: CMRP, CRE, or similar reliability certification.
