Software Development Manager, LLM Inference Model Enablement, Neuron SDK

Amazon · Cupertino, CA
Posted 109 days ago · $166,400 - $287,700

About The Position

AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon Elastic Compute Cloud (EC2) to new offerings that continue to set AWS's services and features apart in the industry. We develop AWS Neuron, the complete software stack for Trainium, Amazon's custom cloud-scale machine learning accelerator. Come optimize LLMs such as Llama and GPT-OSS to run fast on Trainium.

As the SDM for the LLM Inference Model Enablement team, you will lead a team of expert AI/ML engineers to onboard and optimize state-of-the-art open-source and customer LLMs, both dense and MoE, for inference on the Trainium and Inferentia accelerators using Neuron. You will also drive improvements in model enablement speed and experience while advancing inference usability and quality through inference features, infrastructure optimization, tooling, and automation.

The ideal candidate has a strong background in LLM architectures, model performance optimization, and inference techniques, such as delivering high-performance models using distributed inference libraries. You should be able to manage demanding, fast-changing priorities, and you should have the technical depth to understand and deliver as part of a vertically integrated system stack consisting of the PyTorch inference library, the Neuron compiler, the runtime, and collectives.

Requirements

  • 3+ years of engineering team management experience.
  • 7+ years of experience working directly within engineering teams.
  • 3+ years of experience designing or architecting new and existing systems.
  • Experience partnering with product or program management teams.

Nice To Haves

  • Experience communicating with users, other technical teams, and senior leadership to collect requirements and describe software product features, technical designs, and product strategy.
  • Experience recruiting, hiring, mentoring, coaching, and managing teams of software engineers.

Responsibilities

  • Lead a team of expert AI/ML engineers to onboard and optimize state-of-the-art open-source and customer LLMs for inference on Neuron and Trainium.
  • Drive improvements in model enablement speed and experience.
  • Advance inference usability and quality through inference features, infrastructure optimization, tools, and automation.
  • Work with senior management and technical leaders to define model enablement and performance optimization goals for the latest state-of-the-art LLMs.
  • Manage changing priorities as new models and technologies emerge.
  • Help the team solve technical challenges.

Benefits

  • Flexible working culture.
  • Mentorship and career growth opportunities.
  • A culture that values diverse experiences.