NVIDIA's technology is at the heart of the AI revolution, touching people across the planet by powering everything from self-driving cars and robotics to copilots and more. Join us at the forefront of technological advancement in intelligent assistants and information retrieval.

NVIDIA NIM provides containers for self-hosting GPU-accelerated inference microservices for pre-trained and customized AI models across clouds, data centers, RTX™ AI PCs, and workstations. NIM microservices expose industry-standard APIs for simple integration into AI applications, development frameworks, and workflows. Built on pre-optimized inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices optimize response latency and throughput for each combination of foundation model and GPU.

NVIDIA NeMo Retriever is a collection of NIM microservices for building multimodal extraction, re-ranking, and embedding pipelines with high accuracy and maximum data privacy. It delivers quick, context-aware responses for AI applications such as advanced retrieval-augmented generation (RAG) and agentic AI workflows.

The NeMo Retriever team is looking for an AI Engineer to join us at the intersection of machine learning development, performance optimization, and MLOps. This role requires a unique blend of technical expertise in ML model development, system optimization, and operational excellence. We are looking for someone with a passion for tackling the world's most challenging problems in the generative AI, LLM, MLLM, and RAG spaces using our innovative hardware and software platforms. You will leverage and extend the tooling used to build NIM microservices, which power flexible, multimodal retrievers and agents. If you're creative and passionate about solving real-world conversational AI problems, come join us.
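To make the "industry-standard APIs" point concrete, here is a minimal sketch of what a request to a NIM embedding microservice can look like through its OpenAI-style HTTP interface. The endpoint URL, port, model name, and the `input_type` field below are illustrative assumptions, not guaranteed defaults for any particular NIM container.

```python
import json

# Assumption: a NIM embedding container is reachable locally and exposes
# an OpenAI-compatible /v1/embeddings route. Adjust host/port/model to
# match your actual deployment.
NIM_URL = "http://localhost:8000/v1/embeddings"

def build_embedding_request(texts, model="nvidia/nv-embedqa-e5-v5",
                            input_type="query"):
    """Build an OpenAI-style embeddings payload.

    `input_type` ("query" vs. "passage") is a retriever-specific field
    used to distinguish search queries from indexed documents; it is
    shown here as an assumed extension to the standard schema.
    """
    return {"model": model, "input": texts, "input_type": input_type}

payload = build_embedding_request(["What is retrieval-augmented generation?"])
print(json.dumps(payload, indent=2))

# To send it against a running container (requires the `requests` package):
#   import requests
#   resp = requests.post(NIM_URL, json=payload, timeout=30)
#   embeddings = [d["embedding"] for d in resp.json()["data"]]
```

Because the microservice speaks a standard schema, the same payload shape works whether the container runs on a workstation, in a data center, or in the cloud; only the URL changes.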
Job Type: Full-time
Career Level: Mid Level
Number of Employees: 5,001-10,000