Manager, Data Center Operations

Crusoe
Springfield, OH
$135,000 - $175,000
Onsite

About The Position

Crusoe is seeking a Manager of Data Center Operations to lead our OH5C site in Springfield, Ohio. This is a hands-on leadership role overseeing the day-to-day health of a high-density, GPU-heavy compute environment. You will lead the on-site technician team, drive hardware reliability and break-fix performance, manage colocation relationships, and ensure the site meets fleet-wide operational standards. The ideal candidate is a technically strong, highly accountable leader who can move comfortably between the data center floor and senior-level operational reviews.

Requirements

  • 5+ years of data center operations leadership experience in a production environment.
  • Experience managing and developing technical teams.
  • Hands-on experience troubleshooting enterprise server hardware, including GPU nodes, DIMMs, drives, cabling, and rack-level infrastructure.
  • Strong familiarity with Supermicro hardware, diagnostics, event logs, and RMA processes.
  • Experience working in colocation environments and managing provider SLAs.
  • Working knowledge of data center electrical and mechanical systems.
  • Experience with Jira, ServiceNow, or a similar ticketing platform.
  • Strong understanding of incident management, root-cause analysis, and operational risk.
  • Clear written and verbal communication skills, including the ability to present technical and operational information to senior leaders.
  • Ability to work on-site in Springfield, Ohio, and support critical incidents as needed.

Nice To Haves

  • Experience supporting AMD GPU clusters, including MI300X or equivalent platforms.
  • Familiarity with NVIDIA GPU platforms such as H100, H200, or B200.
  • Understanding of RoCE fabric topology and common failure modes.
  • Experience with DCIM or asset-management tools such as NetBox.
  • Multi-site or regional data center operations experience.
  • Experience in rapidly scaling cloud, hyperscale, or AI infrastructure environments.

Responsibilities

  • Own the daily operation, health, and availability of the OH5C data center.
  • Lead troubleshooting and repair of GPU compute hardware, including GPU trays, DIMMs, drives, cabling, and server nodes.
  • Drive rapid triage and repair while maintaining MTTR and uptime targets.
  • Coordinate RMAs and hardware support with OEM vendors, primarily Supermicro.
  • Maintain spare-parts inventory and ensure critical hardware is available when needed.
  • Partner with Fleet Operations, SRE, networking, and infrastructure teams on escalations.
  • Lead, coach, and develop the on-site data center technician team.
  • Set clear expectations for safety, quality, responsiveness, and accountability.
  • Conduct regular one-on-ones, performance reviews, and development planning.
  • Support technician hiring, onboarding, training, and workforce planning.
  • Build a culture of technical precision, ownership, and continuous improvement.
  • Track and report site KPIs, including uptime, MTTR, SLA compliance, deployment velocity, and ticket aging.
  • Use operational data to identify recurring issues and improve reliability.
  • Maintain accurate break-fix workflows in Jira or a comparable ticketing system.
  • Provide clear operational updates, incident summaries, and corrective-action plans to senior leadership.
  • Serve as the primary on-site liaison with the colocation provider.
  • Hold facility partners accountable to SLAs related to power, cooling, security, and availability.
  • Maintain working knowledge of UPS systems, PDUs, generators, CRAC and CRAH systems, and supporting infrastructure.
  • Escalate and track facility issues through resolution.
  • Coordinate planned maintenance to minimize risk to production systems.
  • Maintain site runbooks, SOPs, emergency procedures, and hardware documentation.
  • Ensure work is completed in accordance with safety, security, and change-management standards.
  • Contribute to fleet-wide operating standards and knowledge sharing.
  • Maintain accurate asset, inventory, and configuration records.

Benefits

  • Competitive compensation and equity
  • Restricted Stock Units
  • Paid time off, holidays, and leave programs
  • Medical, dental, and vision insurance
  • Employer HSA contributions
  • Paid parental leave
  • Life, short-term disability, and long-term disability insurance
  • Professional development and tuition reimbursement
  • Mental health and wellness support
  • Commuter benefits
  • Cell phone stipend
  • 401(k) with company match up to 4%
  • Volunteer time off
  • Global travel insurance and emergency assistance
  • Daily meal allowance
  • Additional location-specific benefits