Sr. Machine Learning - Compiler Engineer III, AWS Neuron, Annapurna Labs

Amazon•Cupertino, CA

21h•Onsite

About The Position

This role is for a senior software engineer on the Compiler team for AWS Neuron. In this role, you will build the next-generation Neuron compiler, which transforms ML models written in ML frameworks (e.g., PyTorch, TensorFlow, and JAX) for deployment on AWS Inferentia- and Trainium-based servers in the Amazon cloud. You will solve hard compiler optimization problems to achieve optimal performance for a variety of ML model families, including massive-scale large language models like Llama, DeepSeek, and beyond, as well as Stable Diffusion, vision transformers, and multi-modal models. You will need to understand how these models work inside-out to make informed decisions on how best to coax the compiler into generating optimal implementation instructions. You will leverage your technical communication skills to partner with other teams, and you will be involved in pre-silicon design, bringing new products and features to market, and many other exciting projects.

Requirements

Experience in object-oriented languages like C++/Java is a must.
5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems.
2+ years of experience in developing compiler features and optimizations.
Proficiency with 1 or more of the following programming languages: C++ (preferred), C, Python.

Nice To Haves

Experience with compilers or building ML models using ML frameworks on accelerators (e.g., GPUs) is preferred but not required.
Experience with technologies like OpenXLA, StableHLO, and MLIR will be an added bonus!
Master's or PhD degree in computer science or equivalent.
Proficiency with resource management, scheduling, code generation, and compute graph optimization.
Experience optimizing TensorFlow, PyTorch, or JAX deep learning models.
Experience with multiple toolchains and Instruction Set Architectures.

Responsibilities

Design, implement, test, deploy, and maintain innovative software solutions to transform the Neuron compiler’s performance, stability, and user interface.
Work side by side with chip architects, runtime/OS engineers, scientists, and ML Apps teams to seamlessly deploy cutting-edge ML models from our customers on AWS accelerators with optimal cost/performance benefits.
Become the public face of the Neuron compiler in open-source communities (e.g., StableHLO, OpenXLA, MLIR) and influence industry-wide partners to pioneer the optimization of cutting-edge ML workloads on AWS software and hardware.
Build innovative features that will deliver the best possible experiences for our customers – developers across the globe.
Create compiler optimization and verification passes.
Build features that expose the capabilities and peculiarities of AWS accelerators to developers.
Implement tools to analyze numerical errors, and resolve the root cause of compiler defects.
Participate in design discussions and code reviews.
Communicate with internal stakeholders (other Neuron SDK and Amazon-wide teams) and external stakeholders (open-source communities), and respond to Neuron compiler-related questions in open forums (e.g., GitHub).
Work in a startup-like development environment, where you’re always working on the most important stuff.

Benefits

Health insurance (medical, dental, vision, and prescription coverage; Basic Life & AD&D insurance with an option for supplemental life plans; EAP; mental health support; Medical Advice Line; Flexible Spending Accounts; Adoption and Surrogacy Reimbursement coverage)
401(k) matching
Paid time off
Parental leave
