HPC Vendor Service Manager

IREN
Mackenzie, BC
CA$100,000 - CA$140,000
Onsite

About The Position

Podtech Data Centers Inc., a member of the IREN Group, is seeking an HPC Vendor Service Manager. This role is responsible for overseeing all Original Equipment Manufacturer (OEM) and third-party vendor activities related to the maintenance, repair, replacement, and support of High-Performance Computing (HPC) infrastructure in a mission-critical data center. The position acts as the main point of contact between the data center operations team and hardware vendors, ensuring timely execution of break-fix activities, warranty repairs, RMA management, parts logistics, and technical escalations. The HPC Vendor Service Manager is accountable for vendor performance, service-level compliance, operational coordination, and maintaining high system availability for HPC clusters and supporting infrastructure. This role collaborates with Technical Operations, Technology, Asset Management, Logistics, Security, and Customer teams to ensure vendor activities are performed safely, efficiently, and according to site operational standards.

Requirements

  • Degree, diploma, or equivalent experience in Information Technology, Computer Engineering, Electronics, Data Center Operations, or a related discipline.
  • 3+ years of experience in data center operations, hardware support, vendor management, or mission-critical infrastructure environments.
  • Experience supporting HPC, cloud, enterprise server, or large-scale compute environments.
  • Experience managing OEM service providers and third-party contractors.
  • Experience with warranty programs, hardware lifecycle management, and RMA processes.

Responsibilities

  • Act as the primary point of contact for all OEM and third-party service providers supporting HPC infrastructure.
  • Manage daily activities of vendor technicians performing maintenance, diagnostics, hardware replacements, and warranty repairs.
  • Ensure all vendor personnel comply with site safety, security, and operational requirements.
  • Coordinate vendor access, scheduling, and work execution activities.
  • Develop and maintain strong relationships with vendor account teams and field service organizations.
  • Conduct regular vendor performance reviews and service-level assessments.
  • Oversee all hardware repair activities affecting HPC servers, storage systems, networking equipment, and supporting technologies.
  • Prioritize repair activities based on operational risk and customer impact.
  • Coordinate maintenance windows and repair schedules with operations teams.
  • Ensure timely resolution of hardware failures and service interruptions.
  • Monitor repair progress and escalate delays as required.
  • Verify quality of completed repairs and restoration of service.
  • Manage the complete Return Material Authorization (RMA) lifecycle.
  • Coordinate diagnosis, part replacement, return shipments, and warranty claims.
  • Track all RMAs from initiation through closure.
  • Maintain visibility of open cases, parts status, and expected delivery timelines.
  • Escalate delayed shipments, parts shortages, or vendor response issues.
  • Ensure accurate documentation of all hardware replacements and warranty transactions.
  • Oversee inventory of critical spare parts and replacement components.
  • Coordinate inbound and outbound shipments with logistics providers and vendors.
  • Monitor spare parts consumption and recommend inventory adjustments.
  • Ensure proper handling, storage, and tracking of replacement hardware.
  • Establish and monitor key vendor performance indicators.
  • Track response times, repair times, first-time fix rates, and SLA compliance.
  • Identify recurring vendor performance issues and drive corrective actions.
  • Facilitate quarterly business reviews with key vendors.
  • Maintain vendor scorecards and performance reports.
  • Ensure accurate asset tracking and configuration records.
  • Verify serial number changes and hardware replacement documentation.
  • Support asset lifecycle planning and refresh programs.
  • Maintain records of installed hardware and warranty status.
  • Serve as the primary escalation point for vendor-related operational issues.
  • Coordinate OEM support during critical incidents and major outages.
  • Facilitate rapid mobilization of vendor resources during emergencies.
  • Support root cause analysis and corrective action development following hardware failures.
  • Ensure vendor compliance with site safety requirements, security policies, access control procedures, permit-to-work processes, lockout/tagout requirements, and change management processes.
  • Participate in vendor safety audits and operational reviews.
  • Ensure all work is performed in accordance with established operational standards.

Benefits

  • Medical, dental, and vision insurance coverage – 100% company paid for employees and dependents
  • Company-paid life and disability insurance
  • Voluntary life and critical illness coverage available
  • Employee Assistance Program and virtual health care platform
  • RRSP with company match
  • Voluntary TFSA
  • 3 weeks of vacation annually, plus paid holidays
  • Opportunities for advancement and internal mobility
  • Training and personal development opportunities
  • Company events and team-building activities
  • Relocation assistance