Sr. Staff Software Engineer - NPU Compiler Solutions

Advanced Micro Devices, Inc•San Jose, CA

36d•Hybrid

About The Position

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

AMD is looking for an influential software engineer who is passionate about improving the performance of key applications and benchmarks. You will be a member of a core team of incredibly talented industry specialists and will work with the very latest hardware and software technology.

Requirements

Strong C/C++ systems programming (modern C++), Python for tooling/prototyping.
Hands-on MLIR experience: dialect design (ops/types/attributes/interfaces), pass pipelines (pattern rewrites, canonicalization/legalization). Proficiency with core dialects: Linalg/Tensor, Affine, SCF, Vector.
Accelerator compiler experience: GPU/NPU/AI engines or similar spatial devices; memory hierarchies, streams, scratchpads, NoC-aware optimization.
Auto-optimization: practical experience building or integrating auto-tiling, auto-scheduling, and auto-tuning systems.
Operator fusion and graph-level optimization for DSP and ML workloads (CNNs, transformers, FFT), layout/dtype transforms, quantization-aware workflows.
Solid understanding of heterogeneous runtime models, concurrency, synchronization, and performance profiling.

Nice To Haves

Polyhedral/affine analysis, dependence analysis, bank-conflict modeling, memory placement strategies.
Open-source contributions to LLVM/MLIR or peer-reviewed publications in compilers/ML systems.
ML compiler stacks: IREE, TVM, XLA, Glow, Halide; auto-scheduler/tuner integration.
Expertise in Linux kernel/driver development for multi-processor heterogeneous systems.
Knowledge of acceleration platforms such as GPUs, TPUs, APUs, and FPGAs.

Responsibilities

Contribute to the architecture and design of the MLIR-centric NPU compiler platform.
Develop and integrate solutions to efficiently deploy algorithms on spatial compute devices, such as NPUs.
Work with cross-functional teams to identify problems and create solutions.
Work with the management team on project planning activities.