We are seeking an HPC Performance & Reliability Engineer to join our software engineering team supporting the development of gas turbine design tools. This is a critical individual contributor role responsible for profiling, benchmarking, and optimizing the performance of a diverse portfolio of engineering applications running in a hybrid HPC environment, with the majority of compute in AWS. The successful candidate will establish best practices for job configuration, lead scaling studies, coordinate SLURM job launch configurations, and proactively monitor HPC resource usage to ensure a reliable and efficient compute environment for our internal engineering users. This role works in close partnership with the IT team and serves as the technical focal point for HPC performance, coordinating efforts across bubble assignment contributors and maintaining the documentation and standards that guide our user community.
Job Type
Full-time
Career Level
Senior