Hewlett Packard Enterprise - posted 4 months ago
$95,100 - $181,500/Yr
Full-time
Hybrid • Bloomington, MN
Computer and Electronic Product Manufacturing

An MPI/SHMEM validation engineer plays a crucial role in ensuring the quality and performance of Message Passing Interface (MPI) and SHared MEMory (SHMEM) based applications and middleware, particularly in High-Performance Computing (HPC) environments. They are responsible for testing, debugging, and validating parallel programming frameworks and their implementations to meet established standards and specifications. This involves working with both hardware and software aspects of HPC systems and ensuring optimal functionality and efficiency for communication middleware like MPI and SHMEM.

  • Test Plan Development and Execution: Designing and executing comprehensive test plans to validate MPI and SHMEM features, functionality, and performance.
  • Debugging and Root Cause Analysis: Identifying, analyzing, and resolving issues found during validation and testing, collaborating with development teams to implement corrective actions.
  • Performance Evaluation and Optimization: Evaluating and optimizing the performance of MPI- and SHMEM-based applications and middleware, including collective communication algorithms such as AllReduce.
  • Automation and Infrastructure Development: Developing and maintaining post-silicon validation infrastructure including software, hardware, and automation environments.
  • Collaboration: Working closely with hardware teams, software developers, architects, and various stakeholders to ensure seamless integration and validation of systems.
  • Documentation: Generating and maintaining detailed documentation of validation activities, test results, and compliance reports.
  • Troubleshooting: Providing technical expertise and support for troubleshooting and resolving technical issues related to MPI and SHMEM.
  • Staying Updated with Technology: Maintaining knowledge of validation trends, industry standards, and new technologies in high-performance computing, parallel programming, and communication middleware.
  • Strong understanding of parallel programming models, specifically MPI and SHMEM, including their concepts, features, and one-sided communication APIs.
  • Knowledge of high-performance memory subsystems, SoC/ASIC memory architecture, high-speed I/O interfaces, and their interaction with parallel programming models.
  • Proficiency in programming languages such as C/C++ and Python, and potentially others such as Perl, for developing validation tests, scripts, and tools.
  • Experience with various validation methodologies, including formal analysis and runtime instrumentation, for detecting MPI bugs and ensuring correctness.
  • Expertise in utilizing debugging tools, methodologies, and techniques for identifying and resolving hardware and software issues at various levels.
  • Experience with test automation frameworks and methodologies for developing and maintaining automated regression tests and scripts.
  • Excellent analytical and problem-solving abilities to dissect complex systems, identify issues, and propose innovative solutions.
  • Strong communication and interpersonal skills for effective collaboration with cross-functional teams and stakeholders.
  • Meticulous attention to detail to catch discrepancies and ensure thorough validation of systems and processes.
  • Cloud Architectures
  • Cross Domain Knowledge
  • Design Thinking
  • Development Fundamentals
  • DevOps
  • Distributed Computing
  • Microservices Fluency
  • Full Stack Development
  • Security-First Mindset
  • Solutions Design
  • Testing & Automation
  • User Experience (UX)
  • Health & Wellbeing: Comprehensive suite of benefits that supports physical, financial and emotional wellbeing.
  • Personal & Professional Development: Programs catered to helping you reach any career goals you have.
  • Unconditional Inclusion: A culture that celebrates individual uniqueness and values varied backgrounds.