HPC AI Systems Administrator

Hewlett Packard Enterprise, Bloomington, MN
1d, Onsite

About The Position

This role has been designed as ‘Onsite’ with an expectation that you will primarily work from an HPE office.

Who We Are: Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.

Job Description: This position will support government accounts. Therefore, due to federal export-control regulations, the selected candidate must hold U.S. citizenship, U.S. lawful permanent resident/Green Card status or otherwise have a category of refugee/asylee status enabling them to perform the role without requiring a license under the International Traffic in Arms Regulations (ITAR) or Export Administration Regulations (EAR).

The Data Center Administration team is seeking a Senior System Administrator to provide advanced system administration and lab operations support for hardware, network, and software environments used by HPE HPC & AI Performance Engineering teams. These environments support internal product development, performance engineering, ISV validation, and customer-facing sales and benchmarking activities. This role serves as a senior technical contributor and lab expert, providing design guidance, operational leadership, and escalation-level troubleshooting across complex HPC and AI lab environments. The position partners closely with engineering teams, infrastructure support groups, and external partners to ensure lab stability, availability, and effective use of resources. The Senior System Administrator contributes to continuous improvement of lab processes, policies, and standards, prioritizes lab requests, mentors junior staff, and supports future lab expansion and facility transitions.

Requirements

  • This position will support government accounts. Therefore, due to federal export-control regulations, the selected candidate must hold U.S. citizenship, U.S. lawful permanent resident/Green Card status or otherwise have a category of refugee/asylee status enabling them to perform the role without requiring a license under the International Traffic in Arms Regulations (ITAR) or Export Administration Regulations (EAR).
  • Communication – Communicates clearly and effectively in both written and verbal forms; collaborates well with diverse technical teams.
  • Creativity / Innovation – Applies creative problem-solving approaches and contributes to continuous improvement of lab processes and capabilities.
  • Customer Service – Demonstrates a service-oriented mindset when supporting internal teams, partners, and stakeholders.
  • Job Knowledge – Maintains deep technical knowledge of Linux systems, lab operations, and HPC/AI infrastructure.
  • Problem Solving / Analysis – Breaks down complex technical issues, identifies root causes, and develops effective solutions.
  • Quality – Demonstrates attention to detail, accuracy, and reliability.
  • Technical Skills – Strong expertise in Linux system administration with working knowledge of networking, storage, virtualization, and hardware platforms.
  • Bachelor’s degree in Computer Science, MIS, or a related technical field required, with a focus mainly on system administration.
  • Minimum of 8–10 years of Linux system administration experience required, preferably in HPC, AI, or lab-based environments.
  • Candidates with strong Linux or network administration backgrounds and demonstrated interest in advanced lab system administration will also be considered.

Responsibilities

  • Image, configure, and upgrade servers with Linux operating systems, including firmware updates and switch configuration to support lab environments.
  • Configure and manage multiple root slots hosting varied operating system images in support of HPC cluster provisioning, validation, and testing workflows.
  • Provide design guidance and operational support for virtualized lab infrastructure, including virtual server administration and the design of highly available, fault-tolerant environments.
  • Provide design guidance for lab storage solutions, including installation, configuration, and performance management of high-performance storage systems (e.g., Lustre) to support sales, benchmarking, and partner activities.
  • Provide guidance for hardware and software installation and configuration, including advanced hardware diagnostics and coordination with infrastructure support teams to resolve power, CPU, and GPU issues.
  • Collaborate with AI benchmarking, R&D, and performance engineering teams to design and operate lab environments that meet internal, partner, and customer requirements.
  • Design lab layouts, networks, and operational policies that meet functional needs while adhering to cybersecurity and asset protection standards.
  • Prioritize and coordinate lab work activities to ensure timely delivery of high-impact requests and effective utilization of lab resources.
  • Make recommendations on lab resource usage, capacity planning, and future expansion to support evolving business and engineering needs.
  • Oversee and support lab transitions, including facility moves and infrastructure refresh activities.
  • Install, configure, and support job scheduling and resource management tools to maximize lab utilization.
  • Serve as a technical mentor to junior system administrators and lab staff, providing guidance on best practices, troubleshooting, and operational standards.
  • Communicate lab successes, risks, failures, and issues to management in a timely and professional manner.
  • Work effectively with remote administrators, vendors, and partners when specialized expertise or additional support is required.

Benefits

  • Health & Wellbeing – We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.
  • Personal & Professional Development – We also invest in your career because the better you are, the better we all are. We have specific programs tailored to helping you reach any career goals you have, whether you want to become a knowledge expert in your field or apply your skills to another division.
  • Unconditional Inclusion – We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.