HPC Linux System Administrator

Mclaurin Aerospace, Houston, TX

About The Position

Are you passionate about human space exploration, understanding the origins of the universe, and working with a passionate and diverse team to make a difference? If you are, we need you! We need your talent, teamwork, and energy to help us achieve great things that inspire people all over the globe. We need you to bring creative ideas and diverse backgrounds to help us envision, shape, and deliver systems that will enable the exploration of space while benefiting people here on Earth. We are excited about what we do, and we need you on our team as we take on exciting challenges for NASA's pursuits in deep space exploration.

We have an exciting opportunity for an HPC Linux System Administrator to join our JETS II contract team at NASA Johnson Space Center in Houston, TX.

If selected you will:

  • Work with a team of System Administrators to build and maintain all FSL services, including High Performance Compute (HPC) administration, high-end Linux workstation administration, high-speed networking, high-speed parallel filesystem administration, and more.
  • Oversee high-speed parallel filesystem administration and job scheduler administration. This calls for strong skills in administering parallel filesystems (Lustre, GPFS) and the SLURM job scheduling system.
  • Investigate system problems.
  • Proactively monitor system health.
  • Work with FSL users to make sure they can support the NASA human spaceflight mission.

Qualifications:

This position has been posted at multiple levels. Depending on the candidate's experience, requirements, and business needs, we reserve the right to consider candidates at any level for which this position has been advertised. Typically requires a bachelor's degree or equivalent certification in a related area and a minimum of 5 years of experience in the field or in a related area.

Requirements

  • Linux system administration
  • HPC job scheduler administration
  • System configuration management
  • High-speed parallel file storage administration
  • Monitoring and alerting
  • Demonstrated problem solving, planning, and communication skills
  • Ability to work in a team environment
  • US citizenship and the ability to pass a comprehensive security background investigation are required.

Nice To Haves

  • Strong skills in administering parallel filesystems such as Lustre or GPFS, and in administering the SLURM job scheduling system
  • Red Hat-based systems
  • Lustre high-speed parallel filesystems
  • InfiniBand
  • Provisioners (xCAT, Warewulf)
  • Ansible / Foreman
  • SLURM resource manager
  • Spack package manager
  • Log consolidation and monitoring
  • Git/GitLab and software development (CI/CD)
  • Johnson Space Center campus network
  • NASA security mechanisms (security plans, POAMs, ATOs, Risk Assessments)

Responsibilities

  • Work with a team of System Administrators to build and maintain all FSL services, including High Performance Compute (HPC) administration, high-end Linux workstation administration, high-speed networking, and high-speed parallel filesystem administration.
  • Oversee high-speed parallel filesystem administration and job scheduler administration.
  • Administer the SLURM job scheduling system.
  • Investigate system problems.
  • Proactively monitor system health.
  • Work with FSL users to make sure they can support the NASA human spaceflight mission.

Benefits

  • Competitive pay
  • Positive work-life balance
  • 100% Employer paid insurance including: Medical, dental, vision, life insurance, accidental death and dismemberment, short-term disability, and long-term disability
  • 11 paid holidays annually
  • Generous PTO
  • 401(k) with employer match after 12 months of service
  • Education assistance
  • Relocation assistance (if applicable)