About The Position

The Product: AWS Machine Learning accelerators are at the forefront of AWS innovation. The Inferentia chip delivers best-in-class ML inference performance at the lowest cost in the cloud. Trainium will deliver best-in-class ML training performance with the most teraflops (TFLOPS) of compute power for ML in the cloud. This is all enabled by our software stack, the AWS Neuron Software Development Kit (SDK), which includes an ML compiler, the Neuron Kernel Interface (NKI) compiler, and a runtime that natively integrates into popular ML frameworks such as PyTorch and TensorFlow. NKI is a bare-metal language and compiler for directly programming NeuronDevices available on AWS Trn/Inf instances. You can use NKI to develop, optimize, and run new operators directly on NeuronCores while making full use of available compute and memory resources. AWS Neuron and Inferentia are used at scale by customers and partners such as PyTorch, Epic Games, Snap, Airbnb, Autodesk, Amazon Alexa, and Amazon Rekognition, along with other customers across various segments.

The Team: The Amazon Annapurna Labs team is responsible for building innovative silicon and software for AWS customers. We are at the forefront of innovation, combining cloud scale with the world's most talented engineers. Our team covers multiple disciplines, including silicon engineering, hardware design and verification, software, and operations. With such breadth of talent, there is opportunity to learn all the time. We operate in spaces that are very large, yet our teams remain small and agile. There is no blueprint. We're inventing. We're experimenting. When you couple that with the ability to work on so many different products and services, it makes for a very unique learning culture.

You: The AWS Neuron Kernel Interface team is actively seeking skilled engineers to join our efforts in developing a state-of-the-art compiler stack. This stack optimizes application models across diverse domains, including large language and vision models, originating from leading frameworks such as PyTorch, TensorFlow, and JAX. You will work closely with our custom-built Machine Learning accelerators, Inferentia and Trainium, which represent the forefront of AWS innovation for advanced ML capabilities, powering solutions like Generative AI.

Requirements

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
  • Experience programming with at least one software programming language

Nice To Haves

  • 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
  • Bachelor's degree in computer science or equivalent
  • Experience in compiler design for CPU/GPU/Vector engines/ML-accelerators

Responsibilities

  • Develop state-of-the-art tools (compiler, debugger, profiler) that allow customers to maximize the performance of their ML models
  • Work with customers to enable and optimize their ML kernels on AWS accelerators, understanding their requirements and use cases
  • Design and implement compiler optimizations
  • Collaborate across teams to develop innovative optimization techniques that enhance the AWS Neuron SDK's performance capabilities
  • Work in a startup-like development environment, where you're always working on the most important things


What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Industry

General Merchandise Retailers

Number of Employees

5,001-10,000 employees
