Senior Research Engineer

NVIDIA
Santa Clara, CA

About The Position

Join NVIDIA and help build the software that will define the future of generative AI. We are looking for a research engineer who is passionate about open source and excited to create our next-generation post-training software stack. You will work at the intersection of research and engineering, collaborating with the Post-Training and Frameworks teams to invent, implement, and scale the core technologies behind our Nemotron models.

Requirements

  • BS, MS, or PhD in Computer Science, AI, Applied Math, or a related field, or equivalent experience.
  • 3+ years of proven experience in machine learning, systems, distributed computing, or large-scale model training.
  • Experience with AI frameworks such as PyTorch or JAX.
  • Experience with at least one inference and deployment environment, such as vLLM, SGLang, or TRT-LLM.
  • Proficiency in Python programming, software design, debugging, performance analysis, test design, and documentation.
  • Strong understanding of AI/deep-learning fundamentals and their practical applications.

Nice To Haves

  • Contributions to open-source deep learning libraries.
  • Hands-on experience in large-scale AI training, with a deep understanding of core compute system concepts (such as latency/throughput bottlenecks, pipelining, and multiprocessing) and demonstrated excellence in related performance analysis and tuning.
  • Expertise in distributed computing, model parallelism, and mixed-precision training.
  • Prior experience with generative AI techniques applied to LLMs and multi-modal learning (text, image, and video).
  • Knowledge of GPU/CPU architecture and related numerical software.

Responsibilities

  • Work with applied researchers to design, implement, and test the next generation of RL and post-training algorithms.
  • Contribute to and advance open source by developing NeMo-RL, Megatron Core, the NeMo Framework, and yet-to-be-announced software.
  • Engage as part of one team during Nemotron model post-training.
  • Solve large-scale, end-to-end AI training and inference challenges, spanning the full model lifecycle from initial orchestration and data pre-processing through model training and tuning to model deployment.
  • Work at the intersection of computer architecture, libraries, frameworks, AI applications, and the entire software stack.
  • Tune and optimize performance, including model training with mixed-precision recipes on next-gen NVIDIA GPU architectures.
  • Publish and present your results at academic and industry conferences.

Benefits

  • Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
  • The base salary range is 168,000 USD - 264,500 USD for Level 3, and 192,000 USD - 304,750 USD for Level 4.
  • You will also be eligible for equity and benefits.