About The Position

IREN is seeking a highly analytical and operationally focused Change & Problem Management Analyst to help build and mature a modern Service Management and Reliability Operations function supporting its HPC Data Center Operations. This role operates at the intersection of operational risk management, service reliability, incident reduction, and delivery enablement. The individual will play a critical role in improving operational stability by driving disciplined Change Enablement practices, leading high-quality Problem Management investigations, reducing recurring incidents, and improving deployment reliability across HPC infrastructure and customer-facing production systems. The individual will function as an operational reliability analyst and governance leader capable of partnering with Infrastructure, Cloud, Platform Engineering, SRE, Network, and Application teams to improve operational outcomes through data-driven analysis, risk-based governance, observability insights, and continual improvement initiatives. The role requires strong operational judgment, analytical thinking, communication skills, and the ability to influence technical teams without direct authority.

Requirements

  • Bachelor's degree in Computer Science, Data Science, Statistics, or equivalent hands-on experience.
  • 5+ years of experience in IT Operations, IT Service Management, Reliability Operations, Production Support, or related operational environments.
  • Hands-on experience with Change Management / Change Enablement and/or Problem Management practices.
  • Experience operating in enterprise-scale, high-availability environments supporting critical production systems.
  • Strong understanding of ITIL concepts and modern operational governance practices.
  • Experience facilitating or supporting Root Cause Analysis (RCA) activities.
  • Strong analytical and operational problem-solving skills.
  • Familiarity with CI/CD deployment practices and DevOps operating models.
  • Experience with Service Configuration Management / CMDB concepts and service dependency mapping.
  • Experience supporting operational audit or compliance initiatives.
  • Experience with Jira Service Management or similar platforms.

Nice To Haves

  • Pre-employment screening, including a background check and substance testing, may be required according to company policies.

Responsibilities

  • Govern and continuously improve Change Enablement practices focused on reducing operational risk while enabling speed of delivery.
  • Review and assess medium and high-risk changes for completeness, technical risk, service impact, rollback readiness, dependency awareness, and operational preparedness.
  • Facilitate risk-based change governance and Change Authority / Change Advisory Board (CAB) activities.
  • Partner with engineering, infrastructure, cloud, platform, and operations teams to improve deployment quality and change success rates.
  • Identify recurring patterns in failed or problematic changes and drive corrective actions to improve operational reliability.
  • Develop and refine policy-driven governance models, standard change frameworks, and automated approval workflows.
  • Leverage observability and telemetry data to support risk assessment and post-change validation activities.
  • Drive continuous improvement initiatives focused on reducing change-related incidents and operational disruption.
  • Support development and optimization of operational KPIs, dashboards, and executive reporting.
  • Assist with operational audit readiness, governance reporting, and documentation integrity.
  • Lead and coordinate Problem Management activities for recurring, high-impact, or systemic operational issues.
  • Facilitate structured root cause investigations and ensure high-quality Root Cause Analysis (RCA) deliverables.
  • Drive accountability for corrective and preventive actions across technical teams.
  • Analyze incident, event, telemetry, and operational trend data to identify systemic reliability risks.
  • Identify recurring service degradation patterns and partner with engineering teams to drive permanent remediation.
  • Develop operational insights and trend analysis supporting service reliability improvements.
  • Track and govern problem backlog health, corrective action completion, and recurring incident reduction.
  • Promote blameless post-incident review practices focused on operational learning and resilience improvement.
  • Participate in operational governance forums including change reviews, incident reviews, problem reviews, and operational readiness assessments.
  • Promote automation-first operational practices that reduce manual overhead and improve scalability.

Benefits

  • Competitive wages with robust per diem and project allowance, when applicable
  • Overtime compensation for non-exempt workers for hours worked over 40 per week
  • Relocation and living-out allowance (as applicable and based on successful candidate circumstances)
  • 100% company-paid health insurance premiums (medical, dental, and vision) for employees; 75% company-paid coverage for dependents
  • Company-paid short-term and long-term disability insurance
  • Voluntary life, critical illness, and accident coverage available
  • Health Savings Accounts (HSA) – when combined with the High-Deductible Health Plan
  • Employee Assistance Program and wellness resources
  • 401(k) retirement plan with company match
  • Paid professional development and access to financial planning and legal services
  • Paid Time Off (PTO) and paid holidays
  • Professional development to support certifications, continuing education, or role-related training
  • Company events and team-building activities