Research Systems Administrator-HPC III

Weill Cornell Medical College, New York, NY
Onsite

About The Position

Resolves critical technical problems and takes the lead in mentoring junior staff. Designs complex architectural solutions for a growing and adaptive computational environment. Regularly works with peer technical and research groups to support their projects.

Requirements

  • Bachelor of Science degree, preferably in a technical field (e.g., computer science, physics, math, chemistry, or engineering).
  • Approximately 4 years of relevant work experience.
  • Experience working in a scientific computing environment, particularly in an academic setting, is required.
  • Experience with InfiniBand networking is required.
  • Experience with GPU computing, particularly using the NVIDIA CUDA Toolkit, is required.
  • Experience architecting scientific computing clusters and their associated scheduling systems, such as SGE, PBS, or SLURM, is required.
  • Experience architecting database-backed web applications, particularly managing Shiny services, is required.
  • High proficiency in the Linux operating system, including scripting languages such as Bash, Ruby, and Perl.
  • Proficiency in building, installing, and configuring a variety of open-source Linux software packages, especially with complex dependencies.
  • High proficiency in at least two programming languages, such as C++, Java, Perl, or Python.
  • Detailed knowledge of computer hardware, specifically petascale storage platforms.
  • Proficiency in HPC filesystems such as Lustre and GPFS.
  • Proficiency in networking concepts and in the use of tools and protocols such as SSH, DNS, DHCP, and LDAP.
  • Ability to manage complex vendor relationships.
  • Demonstrated leadership and mentoring skills to effectively manage and develop staff members.

Responsibilities

  • Designs the architecture of the department’s petascale storage infrastructure.
  • Architects data lifecycle management and storage tiers, including off-site replication to cloud providers.
  • Develops highly reliable batch scheduling infrastructure.
  • Designs the architecture of the department’s high-performance computing (HPC) infrastructure.
  • Trains and mentors junior and operational staff in monitoring and troubleshooting cluster performance issues.
  • Designs and oversees the installation of scientific computing software stacks and research workflows.
  • Helps develop, coordinates, and leads training events for research users on the efficient use of cluster and storage resources.
  • Provides and oversees complex technical documentation for the research community.
  • Directly interfaces with investigators to assess requirements and implement tailored solutions for their problem sets.
  • Manages vendor relations and negotiates commercial solutions.
  • Performs other related duties as co-developed with management.