Amazon.com • Posted 6 months ago
$129,300 - $223,600/Yr
Full-time • Senior
Seattle, WA
5,001-10,000 employees
General Merchandise Retailers

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators. This role is for a senior software engineer on the Machine Learning Applications (ML Apps) team for AWS Neuron. The role is responsible for the development, enablement and performance tuning of a wide variety of ML model families, including state-of-the-art generative AI models and massive-scale large language models such as Llama 4, DeepSeek and beyond, as well as Stable Diffusion, Vision Transformers and many more. The ML Apps team works side by side with chip architects, compiler engineers and runtime engineers to create, build and optimize the performance and accuracy of state-of-the-art models. The team automates the ML techniques used to evaluate, detect, debug and resolve accuracy issues arising from migrating models to AI accelerators. It also develops an AI toolchain for optimizing the performance and accuracy of state-of-the-art models.

  • Help lead efforts in building distributed inference support into PyTorch using XLA and the Neuron compiler and runtime stacks.
  • Identify optimization opportunities by performing comparative analysis and benchmarking with alternative solutions.
  • Develop and automate solutions to ensure the accuracy of AI accelerators while optimizing their performance.
  • Develop a set of deep AI toolchains to simplify and abstract the low-level AI accelerator modules.
  • 3+ years of programming experience in a modern programming language such as Java, C++, or C#, including object-oriented design.
  • 3+ years of experience leading the design or architecture (design patterns, reliability and scaling) of new and existing systems.
  • 3+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
  • Understanding of the fundamentals of machine learning and deep learning models, their architectures, and their training and inference lifecycles, along with work experience optimizing model execution.
  • Bachelor's degree in computer science or equivalent.
  • Flexibility in working hours
  • Mentorship and career growth opportunities
  • Inclusive culture that empowers employees