Senior Manager, High Performance Computing Lab & Data Center

ARM • Austin, TX

162d • $241,100 - $326,100 • Hybrid

About The Position

Arm technology is becoming the platform of choice for compute and AI. The Arm System Engineering team's mission is to architect, design, and develop server and rack-level infrastructure for at-scale data center deployments. The team's capabilities span system hardware, software, system interconnect, system management, storage, data center infrastructure, and performance engineering. Its responsibilities include customer engagements, technology selection, system design, network architecture, performance, and data center deployment and operations. The Arm System Engineering team is developing industry-leading technology to deliver innovative and high-performing solutions to power the data centers of the future.

Requirements

8+ years of data center or lab operations experience, with at least 3 years in a leadership or management role.
Proven success managing on-site teams in a high-uptime, mission-critical environment.
Hands-on experience with high-performance computing (HPC), AI clusters, or large-scale infrastructure deployments.
Strong background with break-fix, hardware installation, and repair of servers, networking, and power/cooling systems.
Familiarity with direct liquid cooling systems and other advanced cooling technologies.
Knowledge of incident management, problem management, and ITIL practices.
Excellent communication, leadership, and problem-solving skills.

Nice To Haves

Certifications such as CDCMP, ITIL, CCNA, or equivalent.
Experience with infrastructure monitoring & observability platforms.
Exposure to automation tools for deployment and operations.
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent hands-on experience.

Responsibilities

Lead and develop on-site operational teams (technicians and engineers) responsible for maintaining lab and data center infrastructure.
Act as the escalation point for all incident response, troubleshooting, and issue resolution across HPC servers, networking, and liquid-cooled systems.
Oversee physical and logical infrastructure, including rack/stack, cabling, network design, power distribution, and advanced cooling systems (air and direct liquid cooling).
Ensure maximum system uptime by implementing monitoring, observability, and preventative maintenance practices.
Define and enforce operational standards, troubleshooting playbooks, and safety/compliance procedures for high-voltage and liquid-cooled environments.
Drive efficiency through automation, tooling, and process optimization across lab and data center operations.
Partner closely with engineering, facilities, IT, and leadership teams to align operations with business goals.
Oversee hardware lifecycle, including installation, inventory, and decommissioning.

Benefits

The chance to lead operations for cutting-edge AI and HPC systems.
A collaborative environment where your expertise makes an immediate impact.
Growth opportunities in one of the most advanced computing labs in the world.
Professional growth through involvement in complex projects and multidisciplinary collaboration.
A company committed to diversity and inclusion, where your work matters and drives global progress.

What This Job Offers

Job Type

Full-time

Career Level

Manager

Education Level

Bachelor's degree
