HPC System Administrator (Top Secret) (Remote)

Maryland, MD
Hybrid

About The Position

RedLine Performance Solutions (RedLine) is seeking an HPC System Administrator to provide operational support for HPC clusters located in Dayton, OH. This role involves engaging with the customer, participating in the evolution and maintenance of the technical infrastructure, and operationally supporting the on-site HPC environment. The administrator will be responsible for relaying insights to the RedLine Program Manager and collaborating to translate customer needs into actionable project tasks. The position functions as the lead point of contact for day-to-day operations and real-time problem resolution. While the ideal candidate will be at the customer site in Dayton, OH, remote work may be viable, and relocation may be considered. Operations run 24x7, requiring a rotational on-call commitment. This is a full-time (W-2) position offering a full benefits package.

Requirements

  • Active DoD Top Secret security clearance
  • Relevant technical certifications (e.g., Linux+, Security+)
  • 7 or more years of Linux systems administration, preferably in a Red Hat and/or Rocky environment
  • Strong knowledge of TCP/IP networking
  • 5 or more years of HPC cluster system administration experience, preferably with Dell clusters
  • Strong experience in Bash, Perl, and Python scripting in a version-controlled environment using Git
  • Experience with job scheduling software (e.g., Slurm, PBS)
  • Experience with cluster automation tools (e.g., xCAT, HPCM, Bright Cluster Manager)
  • Experience with parallel filesystems (e.g., Lustre)
  • Experience with high-speed interconnects (e.g., InfiniBand)
  • Strong verbal and written communication skills, with the ability to coordinate among multiple team members in remote locations across several disparate projects
  • Strong organizational skills

Nice To Haves

  • Experience with systems engineering in addition to system administration
  • Red Hat Certification (e.g., RHCSA, RHCE)
  • Server automation experience (e.g., Puppet, Foreman, Ansible)
  • Experience with MPI technologies
  • Experience with Warewulf cluster management and provisioning
  • Experience with Weka parallel file systems
  • Optimization experience with GPU-based HPC clusters

Responsibilities

  • Provide operational support for HPC clusters located in Dayton, OH
  • Engage with the customer and participate in the evolution and maintenance of the technical infrastructure
  • Operationally support the on-site HPC environment
  • Relay insights to the RedLine Program Manager
  • Work with the RedLine Program Manager to translate customer needs into actionable project tasks
  • Function as the lead point of contact for day-to-day operations and real-time problem resolution

Benefits

  • Paid time off
  • 401k match
  • Health care benefits

What This Job Offers

Job Type

Full-time

Career Level

Senior

Education Level

No Education Listed

Number of Employees

1-10 employees
