Senior ML Compiler Engineer

NVIDIA • Redmond, WA

About The Position

NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing, with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.”

We are looking for outstanding ML/DL compiler engineers to join the team and develop groundbreaking technologies in machine learning compilers and AI systems. We build innovative AI compiler solutions that work together with NVIDIA's software stack to provide comprehensive acceleration for modern machine learning models. As a member of the team, you will develop innovative AI compiler technologies for NVIDIA's hardware architecture. You will develop new ML/DL compiler abstractions, build efficient attention runtimes, and create ML/DL compiler-driven system solutions to accelerate large language models, agents, and other high-impact machine learning workloads. As part of this role, you will build close technical relationships with internal NVIDIA software and hardware teams to bring the latest developments into NVIDIA's products.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to unprecedented growth, our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until March 2, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA is the world leader in accelerated computing. NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest industries and profoundly impacting society.

Requirements

Bachelor's degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience); MS or PhD preferred
4+ years of academic or industry experience in machine learning systems development, including ML compilers, LLM inference kernels, and kernel generation
Strong experience in developing or using deep learning frameworks (e.g., PyTorch, JAX)
Strong Python and C/C++ programming skills

Nice To Haves

Expertise in AI frameworks such as PyTorch, TensorFlow, and ONNX
Expertise in machine learning compilers (e.g. Apache TVM, MLIR)
Expertise in domain-specific compiler and library solutions for LLM inference and training (e.g., FlashInfer, FlashAttention)
Strong experience in GPU performance optimization, as well as experience in machine learning systems research and productization
Open source project ownership or contributions

Responsibilities

Innovate and develop new machine learning compiler and systems technologies
Design, implement, and optimize compilers for high-impact AI workloads
Build strong kernel and domain-specific language solutions for state-of-the-art kernels in LLM inference workloads
Develop AI-driven solutions to automate the overall development flow
Co-design learning system solutions with current and future ML compiler and algorithm technologies
Collaborate closely with other engineering teams at NVIDIA to build high-impact solutions for machine learning acceleration