HPC AI Systems Administrator

HPE
San Jose, Minnesota
$105,500 - $243,000
Onsite

About The Position

This position will support government accounts. Therefore, due to federal export-control regulations, the selected candidate must hold U.S. citizenship, U.S. lawful permanent resident/Green Card status or otherwise have a category of refugee/asylee status enabling them to perform the role without requiring a license under the International Traffic in Arms Regulations (ITAR) or Export Administration Regulations (EAR).

The Data Center Administration team is seeking a Senior System Administrator to provide advanced system administration and lab operations support for hardware, network, and software environments used by HPE HPC & AI Performance Engineering teams. These environments support internal product development, performance engineering, ISV validation, and customer-facing sales and benchmarking activities.

This role serves as a senior technical contributor and lab expert, providing design guidance, operational leadership, and escalation-level troubleshooting across complex HPC and AI lab environments. The position partners closely with engineering teams, infrastructure support groups, and external partners to ensure lab stability, availability, and effective use of resources. The Senior System Administrator contributes to continuous improvement of lab processes, policies, and standards, prioritizes lab requests, mentors junior staff, and supports future lab expansion and facility transitions.

Requirements

  • Must hold U.S. citizenship, U.S. lawful permanent resident/Green Card status, or a category of refugee/asylee status that enables the candidate to perform the role without requiring a license under the International Traffic in Arms Regulations (ITAR) or Export Administration Regulations (EAR).
  • Bachelor’s degree in Computer Science, MIS, or a related technical field required, with a focus mainly on system administration.
  • Minimum of 8–10 years of Linux system administration experience required, preferably in HPC, AI, or lab-based environments.
  • Strong expertise in Linux system administration with working knowledge of networking, storage, virtualization, and hardware platforms.
  • Candidates with strong Linux or network administration backgrounds and demonstrated interest in advanced lab system administration will also be considered.

Nice To Haves

  • HPC cluster provisioning, validation, and testing workflows
  • Virtualized lab infrastructure
  • High-performance storage systems (e.g., Lustre)
  • Advanced hardware diagnostics
  • Job scheduling and resource management tools

Responsibilities

  • Image, configure, and upgrade servers with Linux operating systems, including firmware updates and switch configuration to support lab environments.
  • Configure and manage multiple root slots hosting varied operating system images in support of HPC cluster provisioning, validation, and testing workflows.
  • Provide design guidance and operational support for virtualized lab infrastructure, including virtual server administration and the design of highly available, fault-tolerant environments.
  • Provide design guidance for lab storage solutions, including installation, configuration, and performance management of high-performance storage systems (e.g., Lustre) to support sales, benchmarking, and partner activities.
  • Provide guidance for hardware and software installation and configuration, including advanced hardware diagnostics and coordination with infrastructure support teams to resolve power, CPU, and GPU issues.
  • Collaborate with AI benchmarking, R&D, and performance engineering teams to design and operate lab environments that meet internal, partner, and customer requirements.
  • Design lab layouts, networks, and operational policies that meet functional needs while adhering to cybersecurity and asset protection standards.
  • Prioritize and coordinate lab work activities to ensure timely delivery of high-impact requests and effective utilization of lab resources.
  • Make recommendations on lab resource usage, capacity planning, and future expansion to support evolving business and engineering needs.
  • Oversee and support lab transitions, including facility moves and infrastructure refresh activities.
  • Install, configure, and support job scheduling and resource management tools to maximize lab utilization.
  • Serve as a technical mentor to junior system administrators and lab staff, providing guidance on best practices, troubleshooting, and operational standards.
  • Communicate lab successes, risks, failures, and issues to management in a timely and professional manner.
  • Work effectively with remote administrators, vendors, and partners when specialized expertise or additional support is required.

Benefits

  • Health & Wellbeing: a comprehensive suite of benefits that supports your physical, financial, and emotional wellbeing.
  • Personal & Professional Development: programs tailored to helping you reach your career goals.
  • Unconditional Inclusion
  • Flexibility to manage your work and personal needs.