HPC Systems Engineer - Associate

Numerical Algorithms Group
Onsite

About The Position

As an HPC Systems Engineer, you will join a small, highly skilled team responsible for delivering a performant, reliable, and secure high‑performance computing (HPC) environment. In this role, you will primarily support the day-to-day operations of a complex HPC platform, with opportunities to contribute to automation, performance improvements, and infrastructure engineering initiatives. This role is ideal for an engineer with strong Linux fundamentals who wants to grow into HPC platform engineering while gaining hands-on experience across compute, storage, networking, and scheduling systems that support data-intensive workloads.

Requirements

  • A degree in Computer Science, Computer Engineering, Information Systems, or equivalent practical experience
  • Minimum 3 years of hands-on Linux experience (e.g., RHEL, CentOS)
  • Strong Linux command-line proficiency
  • Experience working within a high-performance computing (HPC) environment
  • Strong troubleshooting and analytical problem-solving skills
  • Excellent written and verbal communication skills, including technical documentation
  • Collaborative mindset and strong interpersonal skills
  • Self‑motivated, proactive, and eager to learn in a fast-paced environment

Nice To Haves

  • Broad IT knowledge across infrastructure and applications
  • Exposure to HPC schedulers (e.g., Slurm)
  • Experience supporting large-scale production environments
  • Understanding of data centre fundamentals (networking, cooling, power)
  • Experience installing and compiling vendor and open-source software
  • Scripting or automation experience (e.g., Bash, Python)

Responsibilities

  • Provide technical support to a globally distributed HPC user community across multiple time zones
  • Monitor and maintain HPC system performance across compute, storage, networking, and job scheduling resources
  • Provision and manage project and user storage environments
  • Install, configure, and upgrade system software, including OS updates, in a production HPC environment
  • Perform routine maintenance and contribute to reliability, performance, and capacity improvements
  • Troubleshoot infrastructure and application issues in collaboration with internal teams and external vendors
  • Evaluate new tools and technologies to improve operational efficiency and user experience
  • Build and maintain strong cross‑functional partnerships to support effective delivery and execution
  • Contribute to a collaborative team culture through clear, open communication and shared ownership

Benefits

  • We provide a comprehensive benefits package, including a competitive salary (dependent on your experience), a 401(k) plan with a company match of up to 5%, and health, dental, life, short-term disability, and long-term disability insurance.
  • We also offer 20 vacation days, plus 4 additional days that must be taken between the Christmas and New Year’s holidays, as well as paid sick days and maternity and paternity leave.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Entry Level
  • Number of Employees: 101-250 employees
