Columbia University - posted about 1 year ago
$85,000 - $105,000/Yr
Full-time
New York, NY
Educational Services

The Research Systems Engineer at Columbia University plays a crucial role in the design, development, implementation, and operation of high-performance computing (HPC) services. This position collaborates with technical teams and researchers to ensure that computing resources meet user requirements and to plan for ongoing improvements to these systems and services.

  • Participates in the execution of research computing services designs and plans.
  • Investigates new and emerging technologies and evaluates their usefulness to Columbia researchers.
  • Interacts with Columbia researchers on various topics, including the use of existing services, service policies, and research requirements.
  • Takes a role in HPC system troubleshooting, including coordinating with users, vendors, and other CUIT departments to resolve system problems.
  • Manages storage systems.
  • Resolves incidents and service requests.
  • Provides administration of systems in the research computing infrastructure, including the installation and management of configuration, monitoring and notification tools, as well as basic network administration.
  • Creates and maintains user documentation.
  • All other duties as assigned.
  • Bachelor's degree and/or its equivalent required.
  • Minimum 3-5 years related experience.
  • 2+ years Linux/Unix experience.
  • Prior experience in programming, software development, or system administration.
  • Excellent written and verbal communication skills.
  • Demonstrated ability to work in a fast-paced, deadline-driven environment.
  • Demonstrated excellence in teamwork/collaboration, analytical thinking, communication and influencing skills, and technical expertise.
  • Ability to work with changing priorities and with multiple projects.
  • Ability to be precise and attentive to detail is essential.
  • Ability to work with minimal supervision.
  • Ability to work weekends and off-hours on occasion.
  • Experience with Linux system administration, particularly Red Hat (7, 8).
  • Experience with SLURM or other workload management services.
  • Experience with Bright (Base Command Manager), OpenHPC, SGE, Confluent or other clusterware.
  • Knowledge of GPFS, Lustre, ZFS, NFS or other network or parallel file systems.
  • Familiarity with Ansible, Puppet or other Linux configuration management tools.
  • Familiarity with other HPC components, such as InfiniBand networking and GPUs.
  • Experience with Shell scripting and Python.
  • Experience with version control systems, such as Git, and monitoring tools, such as Grafana or Nagios.
  • Familiarity with HPC programming technologies (such as MPI, OpenMP, or CUDA).
  • Familiarity with other HPC technologies (such as InfiniBand, GPUs, or DDN appliances).
  • Familiarity with JupyterHub.
  • Familiarity with standard programming languages (such as C, C++, Fortran, or Java).
  • Knowledge of TCP/IP.
  • Familiarity with statistical tools (such as R) or mathematical tools (such as MATLAB).
  • Knowledge of technology, applications and interfaces designed to support research, such as Globus.