HPC Systems Engineer - TS/SCI Required

Phoenix Operations Group
Charlottesville, VA
Onsite

About The Position

Phoenix is seeking a High Performance Computing (HPC) Systems Engineer to support the build, configuration, and sustainment of advanced Linux-based HPC cluster environments. This role is critical to enabling distributed compute workloads, scientific simulations, and GPU-accelerated processing within a secure research environment. You will work in a cluster-scale computing environment where performance optimization, scheduler configuration, and distributed workload execution are key to mission success.

Requirements

  • Active TS/SCI clearance
  • Ability to work onsite in Charlottesville, VA
  • 6+ years of Linux systems administration experience
  • Hands-on experience with HPC clusters or distributed compute environments
  • Experience with workload schedulers such as Slurm, PBS / PBS Pro, Torque, or similar
  • Strong command-line Linux administration skills (RHEL preferred)
  • Experience with scripting or automation (Bash, Python, or similar)
  • Ability to obtain DoD 8140 (8570) IAT Level II certification

Nice To Haves

  • Experience administering multi-node HPC cluster environments
  • Familiarity with parallel/distributed file systems (Lustre, BeeGFS, GPFS)
  • Experience with MPI, OpenMP, or other parallel computing frameworks
  • Experience supporting GPU compute environments (CUDA)
  • Familiarity with container technologies (Docker, Podman, Singularity/Apptainer)
  • Experience with configuration management tools (Ansible, Puppet)
  • Background supporting research labs, university HPC, or defense environments

Responsibilities

  • Configure, deploy, and maintain multi-node Linux HPC clusters
  • Administer and optimize workload schedulers (e.g., Slurm, PBS)
  • Troubleshoot distributed compute workloads across cluster environments
  • Analyze performance across compute, storage, and network layers
  • Support GPU-enabled workloads and CUDA-based processing
  • Develop and maintain automation scripts and operational tooling
  • Assist in cluster provisioning and node deployment (e.g., xCAT, Warewulf)
  • Support containerized workloads within HPC environments

Benefits

  • Medical, Dental, Vision Insurance - 100% Company Paid Premiums
  • STD, LTD, and Life Insurance - 100% Company Paid
  • 401K – Automatic 10% company contribution; no matching required
  • PTO - 4 weeks/year
  • Holidays - 11 paid/year
  • Birthdays off with pay
  • Referral Bonuses – Upfront AND Annually Recurring
  • Open Source Bonuses – Contribute to our GitHub projects
  • Professional Development – Paid training, Certifications, and Enrichment