Senior Cloud Software Development Engineer

Intel Corporation, Austin, TX
Onsite

About The Position

Join our Communication Runtimes team as a Senior Cloud Software Development Engineer to develop software features and optimizations for Intel's communication libraries, including Intel SHMEM (Shared Memory Access), Intel MPI (Message Passing Interface), MPICH (Message Passing Interface Chameleon), and Intel oneCCL (Collective Communications Library). The role focuses primarily on oneCCL, with opportunities to contribute to the other libraries. You will build expertise with the latest Intel GPUs and CPUs designed for data center and HPC applications, and collaborate directly with scientists and engineers at Argonne National Laboratory on Aurora, one of the world's most advanced supercomputers. Your work will contribute to next-generation communication libraries and optimization techniques that advance scientific computing and machine learning.

Requirements

  • Master's degree in Computer Science, Computer Engineering, or a related STEM field
  • 3+ years of software development experience
  • 3+ years of Linux environment development experience
  • 3+ years of C and C++ programming experience
  • Experience with multithreaded programming and parallel computing concepts
  • Experience with distributed computing systems and architectures (at least one required)
  • Experience with HPC (High-Performance Computing) communications libraries (at least one required)
  • Experience with collective communications libraries (MPI, oneCCL/NCCL, or SHMEM) (at least one required)
  • Experience with GPU software development and optimization (at least one required)
  • Experience with network communications stack development (one or more layers) (at least one required)

Nice To Haves

  • Ph.D. in Computer Science, Computer Engineering, or a related STEM field
  • Experience developing performance optimizations that measurably improve communications latency or throughput
  • Experience debugging complex problems across different layers of the hardware and software stack
  • Deep understanding of high-performance computing architectures and optimization techniques
  • Experience with Intel GPU and CPU architectures and their optimization characteristics
  • Knowledge of supercomputing environments and large-scale distributed systems
  • Familiarity with scientific computing and machine learning communication patterns

Responsibilities

  • Design, develop, and maintain advanced features and performance optimizations for oneCCL, with potential to contribute to Intel SHMEM, Intel MPI and MPICH libraries
  • Optimize software to meet performance requirements, including low latency, high bandwidth, and high reliability
  • Implement and enhance communication protocols across multiple layers of the communications stack
  • Collaborate with cross-functional teams to define software requirements and technical specifications
  • Work directly with scientists and engineers on high-performance computing applications and supercomputer implementations
  • Partner with hardware teams to optimize software-hardware integration for maximum performance
  • Develop performance optimizations that improve communication latency and throughput
  • Conduct comprehensive performance analysis and benchmarking across different system configurations
  • Debug complex problems spanning multiple layers of the hardware and software stack

Benefits

  • Opportunity to work on world-class supercomputing and HPC technologies
  • Direct impact on scientific research and machine learning advancement
  • Collaboration with leading researchers and engineers in high-performance computing
  • Access to cutting-edge Intel hardware and advanced development tools
  • Professional development in emerging HPC and communication technologies
  • Competitive pay
  • Stock bonuses
  • Health benefit programs
  • Retirement benefit programs
  • Vacation benefit programs