This role is for a senior software engineer on the Compiler team for AWS Neuron. As part of this role, you will be responsible for building the next-generation Neuron compiler, which transforms ML models written in ML frameworks (e.g., PyTorch, TensorFlow, and JAX) for deployment on AWS Inferentia- and Trainium-based servers in the Amazon cloud. You will be responsible for solving hard compiler optimization problems to achieve optimum performance for a variety of ML model families, including massive-scale large language models like Llama, DeepSeek, and beyond, as well as Stable Diffusion, vision transformers, and multi-modal models. You will be required to understand how these models work inside out to make informed decisions on how best to coax the compiler into generating optimal implementation instructions. You will leverage your technical communication skills to partner with other teams and will be involved in pre-silicon design, bringing new products and features to market, and many other exciting projects.
Job Type
Full-time
Career Level
Senior