About The Position

NVIDIA has become the platform upon which every new AI-powered application is built. We are seeking a Senior Machine Learning Performance Engineer to join our team of scientists and engineers passionate about building the next generation of scientific machine learning (ML) frameworks. Starting with digital biology, we will enable powerful and efficient ML methods through collaborations with industry and academic partners. Together, we will advance NVIDIA’s capacity to accelerate AI for Science and industries that depend on it.

Requirements

  • Advanced degree in a quantitative field such as Computer Science, Computational Biophysics, Computational Chemistry, Physics, Mathematics, or equivalent experience
  • 5+ years of relevant experience
  • Consistent track record of performance engineering in large-scale AI model training and inference applications, with a deep understanding of the compute bottlenecks of these models and of the parallelism paradigms they rely on, such as model parallelism
  • Expertise in modern machine learning frameworks such as PyTorch, JAX, and Warp, and in the distributed learning strategies within them
  • Up-to-date knowledge of ML research in scientific discovery and in atomistic modeling
  • Experience designing, building, packaging, and launching software products based on ML research or atomistic simulation tools
  • Recognized for technical leadership contributions, capable of self-direction, and able to learn from and teach others
  • You should display strong communication skills, be organized and self-motivated, and play well with others (be an excellent teammate!)

Nice To Haves

  • Contributor to a major scientific codebase for atomistic modeling or AI for Science
  • Experience with CUDA/Triton programming or familiarity with CUDA/Triton extensions of ML frameworks

Responsibilities

  • Design performance and accuracy evaluation frameworks and carry out evaluations of pioneering ML models used in scientific discovery, particularly those relating to atomistic modeling
  • Identify end-to-end model execution bottlenecks, and design and implement solutions at scale, such as model parallelism
  • Drive the testing and maintenance of the algorithms and software stack used in AI for Science applications within the company and in the open source community
  • Stay up-to-date on the latest machine learning technologies and evaluate their potential as solutions to accuracy and/or computational performance bottlenecks
  • Collaborate with multiple high-performance computing, AI infrastructure, and research teams
  • Contribute to documentation or educational content relating to the product