Designs, develops, tests, deploys, documents, maintains, and enhances complex and diverse software for high-performance computing (HPC) systems based upon documented requirements. Requires a very strong math background, a very strong computer hardware background, or both, to understand HPC architecture and the mathematical principles underlying the software applications. HPC systems may include processing-intensive analytics, novel algorithm development, manipulation of extremely large data sets, real-time systems, and systems incorporating data repositories, data transport services, and application and systems development and monitoring. Works individually or as part of a team; reviews and tests software components for adherence to design requirements, documents test results, and resolves software problem reports. Uses software development and software design methodologies appropriate to the development environment and provides input to system design, including hardware/software trade-offs, software reuse, OSS/COTS/GOTS use, and requirements analysis and synthesis from system level to individual software components. Supports efforts to understand performance limitations of FOSS, COTS, and GOTS software, frameworks, and tools deployed on high-performance computers, including collecting metrics, testing, and informing software or hardware architecture changes.
- Design, document, and execute tests of FOSS, COTS, and GOTS software architectures to determine what aspects of the software and/or computer infrastructure are limiting performance
- Research and identify metrics necessary to understand performance limitations of the software and/or computer infrastructure to support testing
- Research and identify monitoring necessary to support timely alerting of infrastructure and software failures encountered during testing
- Identify hardware and software failure trends encountered during testing and develop mitigations
- Perform root cause analysis
- Work with the customer metrics and monitoring team to introduce new metrics capabilities to support testing
- Modify the software architecture and/or develop new software capabilities to overcome performance limitations encountered during testing
- Review and test software components for adherence to design requirements and document test results
- Resolve software problem reports
- Provide input to software components of system design, including hardware/software trade-offs, software reuse, use of OSS, COTS, and GOTS software, and requirements analysis and synthesis from system level to individual software components
Job Type
Full-time
Career Level
Senior