About The Position

NVIDIA is widely regarded as one of the most desirable employers in technology. It leads in High-Performance Computing, Artificial Intelligence, and Visualization. Our invention, the GPU, acts as the visual cortex of modern computers and powers our products. GPU deep learning sparked modern AI, the next computing era. The GPU serves as the brain for computers, robots, autonomous cars, and conversational AI that understand the world. Today, we are known as “the AI computing company.” We want to grow and hire the smartest people. Join us at the forefront of technology.

NVIDIA is hiring Senior Deep Learning Scientists interested in streaming multimodal conversational AI, including speech, audio, vision, voice chat, and action, as well as human-AI interaction. You will demonstrate foundational expertise in deep learning, reinforcement learning, computational statistics, and applied mathematics. You will have a chance to define core algorithmic improvements and scale your ideas through our Nemotron platform. You will work on high-impact, high-visibility large language model products that improve the experience for millions of users. If you are creative and passionate about real-world conversational AI issues, come join our Nemotron LLM team. For more details on Nemotron LLM, check https://www.nvidia.com/en-us/ai-data-science/foundation-models/nemotron/

Requirements

  • Master’s degree (or equivalent experience) or PhD in Computer Science, Electrical Engineering, Artificial Intelligence, or Applied Math with 8+ years of experience
  • Excellent programming skills in Python with strong fundamentals in programming, optimization, and software development
  • Strong knowledge of ML/DL techniques, algorithms, and tools with exposure to CNN, RNN (LSTM), Transformers (ViT, BERT, BART, GPT/T5, Megatron, LLMs, MoEs)
  • Experience training real-time audio language, streaming visual language, and streaming real-time audio-visual language models, as well as ViT, BERT, GPT, and Nemotron models, for computer vision, NLP, and dialog system tasks using the PyTorch deep learning framework, including data wrangling, tokenization, and multimodal alignment
  • Practical experience in natural language processing, speech/audio processing, computer vision, machine learning, and human-AI interaction
  • Hands-on experience with conversational AI technologies such as Natural Language Understanding, Natural Language Generation, Dialog Systems (including system integration, state tracking, and action prediction), Information Retrieval, Question Answering, and Machine Translation.
  • Understanding of the model development life cycle, with experience in model development workflows, traceability, and dataset versioning, including working knowledge of database management and queries (SQL, MongoDB, etc.).
  • Strong collaborative and interpersonal skills, specifically a proven ability to effectively guide and influence within a dynamic matrix environment

Nice To Haves

  • Native or near-native fluency in one of these languages: Spanish, Mandarin, German, Japanese, Russian, French, UK English, Arabic, Korean, Italian, or Portuguese.
  • Verified background in building LLMs that incorporate knowledge discovery along with reasoning abilities, including disambiguation, clarification, anticipation, and effective error handling for embodied AI systems
  • Validated experience adapting LLMs to different domains such as gaming, virtual assistants, video conferencing, and so on
  • Experience integrating embodied AI systems with various sensor inputs (camera, microphone, torch, and so on) and backend action fulfillment systems
  • Experience with long-term reasoning for embodied AI tasks (navigation, mobile manipulation, instruction following, and collaboration with humans) in gaming/physical environments, given natural-language instructions.

Responsibilities

  • Develop, train, fine-tune, and deploy streaming large language models to power multimodal conversational AI systems encompassing multimodal understanding, speech synthesis, speech-to-speech conversation, video generation, UI and animation rendering and control, environment interaction, and dialog reasoning and tool systems
  • Apply the latest fundamental and applied research to develop products for multimodal conversational artificial intelligence
  • Apply techniques such as instruction tuning, reinforcement learning from human feedback (RLHF), reinforcement learning with verifiable rewards (RLVR), and parameter-efficient fine-tuning methods such as p-tuning, adapters, and LoRA to improve embodied conversational LLMs for multiple use cases.
  • Lead the collection, development, and labeling of domain-specific datasets to train LLMs for various multimodal tasks and applications
  • Measure and benchmark model and application performance. Analyze model accuracy and bias, and recommend next steps and improvements.
  • Collaborate with various teams on new product features and improvements of existing products
  • Participate in developing and reviewing code, building documents, and conducting use case reviews and test plan reviews.
  • Help innovate, identify problems, recommend solutions, and perform triage in a collaborative team environment

Benefits

  • With competitive salaries and a generous benefits package, NVIDIA is considered one of the technology world’s most desirable employers.
  • You will also be eligible for equity and benefits.