High Performance Computing Platform Engineer

PDT Partners

Posted:

February 7, 2023

Job Commitment

Full-time

Experience Level

Mid Level

Workplace Type

Onsite

Job Function

Dev & Engineering

This job is closed

About the position

The High Performance Computing Platform Engineer designs, implements, and supports scalable and performant HPC systems. The engineer works closely with other platform teams and collaborates with engineers and researchers to build high-quality, reliable systems. The role also involves implementing automation, managing capacity, optimizing benchmarks, and contributing to the day-to-day running of the platform systems. The ideal candidate has experience in systems programming and/or software engineering, as well as practical experience supporting and improving production systems.

Responsibilities

Design, implement, and deliver scalable and performant systems
Implement automation for the platform infrastructure, CI/CD pipelines, and production metrics
Collaborate closely with peer engineers and/or researchers to build high-quality, efficient, and reliable systems
Manage capacity and optimize benchmarks for critical workloads
Run and support platform systems day-to-day through automation and quality work

Requirements

Experience with systems programming and/or software engineering
Practical experience supporting, debugging, and improving production systems and services
Experience using Linux and other open source software
Experience with configuration management and infrastructure-as-code frameworks
Production experience working with a public cloud, AWS preferred
Experience with distributed parallel filesystems (Lustre, GPFS, parallel NFS)
Experience with batch scheduling systems (Slurm, Torque, SGE, AWS Batch, AWS ParallelCluster)
Experience with high-performance networking
Bachelor's or Master's degree in an Engineering or Applied Sciences field from a rigorous academic program, or equivalent professional experience

Benefits

Salary range between $195,000 and $225,000 (excluding potential bonus amounts)
