About The Position

Annapurna Labs designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago, even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before and deliver results that help our customers change the world. At Annapurna Labs, we are at the forefront of hardware/software co-design, not just in Amazon Web Services (AWS) but across the industry.

Our Annapurna MLA Software team is looking for candidates interested in diving deep into the hardware technologies that power our Machine Learning servers and in developing the software and firmware to drive, support, and sustain these technologies as they evolve through concept and manufacturing and finally take their place in the rapidly expanding fleet of bleeding-edge Machine Learning products our customers demand.

You'll architect and develop the software and firmware that drives NeuronSwitches, the high-performance interconnect fabric for Trainium chips. Day-to-day, you'll work closely with EC2, Annapurna Labs teams, and manufacturing teams to bring up new hardware, debug board-level issues, and optimize data paths. You'll write device drivers, build ML infrastructure, implement switch fabric control logic, and develop the tooling needed for testing, qualification, and production deployment. This is hands-on systems work, from initial hardware bring-up through manufacturing scale-up.

The Annapurna ML Software team builds the software and firmware that powers NeuronSwitches, AWS' next-generation switching infrastructure forming the high-performance interconnect fabric for Trainium 3 chips. We focus on mission-mode control of sensors, board-level hardware, and the critical data paths that enable chip-to-chip communication at scale. Our work spans device drivers, switched fabric, and everything in between, from debug and testing through qualification and manufacturing. We work at the hardware-software boundary where silicon meets systems. While ML engineers optimize models and algorithms, we ensure the underlying infrastructure can move data at the speeds those workloads demand. If you're excited about low-level systems programming, hardware bring-up, and building the foundation that makes next-generation AI possible, this is the team for you.

Requirements

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience with at least one programming language
  • 5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Experience as a mentor, tech lead or leading an engineering team

Nice To Haves

  • Bachelor's degree in computer science or equivalent

Responsibilities

  • Define technical strategy and lead architecture for Annapurna Labs' machine learning platform, driving decisions that impact multiple product lines
  • Collaborate with EC2 teams and manufacturing partners to ensure seamless system integration
  • Own end-to-end qualification frameworks and processes, mentoring engineers on implementation best practices
  • Drive end-to-end qualification processes for new software implementations
  • Craft high-performance solutions using C/C++ running on Linux

Benefits

  • The Amazon package will include sign-on payments and restricted stock units (RSUs).
  • Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave.