HPC Systems Engineer - TS/SCI Required

Phoenix Operations Group
Onsite

About The Position

Phoenix is seeking a High Performance Computing (HPC) Systems Engineer to support the build, configuration, and sustainment of advanced Linux-based HPC cluster environments. This role is critical to enabling distributed compute workloads, scientific simulations, and GPU-accelerated processing within a secure research environment. You will work in a cluster-scale computing environment where performance optimization, scheduler configuration, and distributed workload execution are key to mission success.

Requirements

  • Active TS/SCI clearance
  • Ability to work onsite in Charlottesville, VA
  • 6+ years of Linux systems administration experience
  • Hands-on experience with HPC clusters or distributed compute environments
  • Experience with workload schedulers such as Slurm, PBS / PBS Pro, Torque, or similar
  • Strong command-line Linux administration skills (RHEL preferred)
  • Experience with scripting or automation (Bash, Python, or similar)
  • Ability to obtain DoD 8140 (8570) IAT Level II certification

Nice To Haves

  • Experience administering multi-node HPC cluster environments
  • Familiarity with parallel/distributed file systems (Lustre, BeeGFS, GPFS)
  • Experience with MPI, OpenMP, or other parallel computing frameworks
  • Experience supporting GPU compute environments (CUDA)
  • Familiarity with container technologies: Docker, Podman, Singularity/Apptainer
  • Experience with configuration management tools (Ansible, Puppet)
  • Background supporting research labs, university HPC, or defense environments

Responsibilities

  • Configure, deploy, and maintain multi-node Linux HPC clusters
  • Administer and optimize workload schedulers (e.g., Slurm, PBS)
  • Troubleshoot distributed compute workloads across cluster environments
  • Perform performance analysis across compute, storage, and network layers
  • Support GPU-enabled workloads and CUDA-based processing
  • Develop and maintain automation scripts and operational tooling
  • Assist in cluster provisioning and node deployment (e.g., xCAT, Warewulf)
  • Support containerized workloads within HPC environments

Benefits

  • Medical, Dental, Vision Insurance - 100% Company Paid Premiums
  • STD, LTD, and Life Insurance - 100% Company Paid
  • 401K – Automatic 10% company contribution; no matching required
  • PTO - 4 weeks/year
  • Holidays - 11 paid/year
  • Birthdays off with pay
  • Referral Bonuses – Upfront AND Annually Recurring
  • Open Source Bonuses – Contribute to our GitHub projects
  • Professional Development – Paid training, certifications, and enrichment

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 1-10 employees
