AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators. This role is for a senior software engineer on the Machine Learning Applications (ML Apps) team for AWS Neuron. The role is responsible for the development, enablement, and performance tuning of a wide variety of ML model families, including state-of-the-art generative AI models and massive-scale large language models such as Llama 4, DeepSeek, and beyond, as well as Stable Diffusion, Vision Transformers, and many more. The ML Apps team works side by side with chip architects, compiler engineers, and runtime engineers to build and optimize the performance and accuracy of state-of-the-art models. The team automates ML techniques to evaluate, detect, debug, and resolve accuracy issues arising from migrating models to AI accelerators, and develops an AI toolchain for optimizing the performance and accuracy of state-of-the-art models.