Lead High Performance Computing Engineer

Brown University
Manchester, NH
Hybrid

About The Position

Brown University’s Office of Information Technology seeks a Lead High Performance Computing Engineer to provide technical leadership to HPC engineering staff, coordinate the team, and guide architectural and operational decisions for the HPC environment. The engineer will design, deploy, and maintain the university’s high-performance computing cluster; configure and maintain the workload scheduler and architect quality-of-service policies; administer Linux systems across infrastructure projects and the deployment of new GPUs for research and teaching; and troubleshoot complex system and application issues involving compute, GPU, storage, and scheduling systems. The engineer will also deploy and support AI workloads, including TensorFlow, PyTorch, and JAX; conduct vendor evaluations and proofs of concept to adopt new HPC, GPU, and storage technologies; and provide advanced technical support to faculty and researchers.

Requirements

Must have a bachelor’s degree in Computer Science and 3 years of experience in Linux systems administration, including background in the following:
  • Research computing environment
  • RHEL/CentOS Linux operating system management
  • Systems architectures, security, networking, storage systems, parallel computing, batch/scheduling systems
  • Programming languages such as C, C++, Bash, and Perl
  • Source control systems such as Git
  • Log correlation software such as Sumo Logic
  • Large-scale research computing platforms such as Globus, HPC environments, Slurm, and GPFS
  • Machine learning frameworks, including TensorFlow

Responsibilities

  • Provide technical leadership to HPC engineering staff
  • Coordinate the team
  • Guide architectural and operational decisions for the HPC environment
  • Design, deploy, and maintain the university’s high-performance computing cluster
  • Configure and maintain the workload scheduler and architect quality-of-service policies
  • Administer Linux systems across infrastructure projects and the deployment of new GPUs for research and teaching
  • Troubleshoot complex system and application issues involving compute, GPU, storage, and scheduling systems
  • Deploy and support AI workloads, including TensorFlow, PyTorch, and JAX
  • Conduct vendor evaluations and proofs of concept to adopt new HPC, GPU, and storage technologies
  • Provide advanced technical support to faculty and researchers

Benefits

  • Information on the Benefits of Working at Brown can be found here.