Senior AI Compiler Engineer, MLIR

NVIDIA•Austin, WA

About The Position

NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing, with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.”

NVIDIA is hiring a Senior AI Compiler Engineer. GPUs are driving rapid progress in deep learning, from LLMs and generative AI to recommendation, vision, and speech. On this team, you’ll build an MLIR-based AI compiler that powers NVIDIA’s inference engine end to end, with a focus on performance, fast builds, low memory use, and Ahead-of-Time and Just-in-Time usability across data center and edge.

Requirements

Bachelor's, Master's, or Ph.D. in Computer Science, Computer Engineering, a related field, or equivalent experience.
3+ years of relevant work or research experience in performance analysis and compiler optimizations.
Experience with compiler technologies such as MLIR, XLA, and LLVM.
Excellent C/C++ and Python programming and software design skills, including debugging, performance analysis, and testing.
Ability to work independently, define project goals and scope, and lead your own development efforts.
Strong interpersonal skills and the ability to thrive in a fast-moving, dynamic, product-oriented team.

Nice To Haves

Understanding of deep learning models, algorithms, and frameworks such as PyTorch and JAX.
Experience with GPU kernel generation targeting high performance and fast build times.
Proficiency in GPU architecture, along with CUDA or OpenCL programming experience.
A track record of mentoring early-career engineers and interns.

Responsibilities

Develop MLIR-based graph representations and optimizations for future GPU architectures.
Partner with framework and hardware teams to enable new model patterns and upcoming GPU architectural features.
Define APIs and MLIR dialects, conduct performance optimizations and analysis, implement compiler optimizations and kernel generation for neural networks, and contribute to other general software engineering work.