GPU Systems Engineer

Hudson River Trading•Seattle, WA

137d•$200,000 - $300,000

About The Position

Hudson River Trading (HRT) is looking for GPU Systems Engineers to help scale and evolve our exceptionally sophisticated HPC/AI research environment. Joining our Research and Development team, you will collaborate with experts responsible for the compute, storage, operating systems, and automation tools that enable our trading and research to run 24/7 across the globe. We design, grow, and operate infrastructure at large scale, including triple-digit petabyte-scale storage and massive CPU and GPU clusters in globally distributed data centers. As such, this is a high-impact role with broad scope, from HPC/AI cluster design and performance tuning to troubleshooting and automation for thousands of nodes.

Requirements

5+ years of experience in large-scale Linux systems engineering in HPC, AI, or distributed infrastructure roles
Extensive experience in Linux system installation, performance tuning, and troubleshooting
Expertise in troubleshooting distributed GPU workloads
Deep knowledge of GPU optimization and performance
Proficiency in Python scripting and automation frameworks
CUDA or C/C++ experience is a plus
Experience with NVIDIA technologies beyond CUDA, such as NCCL, GPUDirect RDMA, and NVLink
Familiarity with configuration management tools (e.g. Salt, Ansible, Puppet, Chef)
Comfortable diagnosing complex system issues at the hardware, OS, and network levels
Strong communication and organizational skills; able to collaborate across diverse technical teams
Thrive in fast-paced environments and are excited by high-impact work

Responsibilities

Design, build, and optimize large-scale distributed GPU compute clusters
Identify and resolve performance bottlenecks in GPU workloads across compute, storage, and networking layers
Collaborate with research and development teams to profile, benchmark, and fine-tune GPU-based workloads
Automate system deployment, monitoring, and troubleshooting across thousands of nodes
Collaborate with research and engineering teams to support evolving workloads
Own critical infrastructure projects from concept through implementation and support
Test and deploy new hardware and software, and partner with vendors to resolve complex issues

Benefits

Medical insurance
Dental insurance
Vision insurance
Basic life insurance
Enrollment in the company’s 401(k) plan
20 vacation days annually
10 paid holidays annually
Sick leave
Parental leave
Discretionary performance-based bonuses
