In Annapurna Labs, we are at the forefront of hardware/software accelerator solutions for not only Amazon Web Services (AWS), but across the industry. The Machine Learning Acceleration Systems Firmware team is looking for candidates interested in diving deep into our designs of Machine Learning servers and developing world class firmware to support current and future generations of accelerator silicon. Our team designs and builds Annapurna's fleet of Accelerated Servers using internally designed silicon. We solve systemic hardware issues and we build hardware and software systems to detect and mitigate future failure recurrences so that our customers can experience the highest quality of service possible! In this role, you will lead an organization of software and firmware developers to build reliable server firmware deployed across millions of accelerators across EC2. You will build AI-driven software tooling that root causes failures and identifies causes of system failures—work that directly impacts how our customers leverage AWS Trainium for their machine learning workloads.
Stand Out From the Crowd
Upload your resume and get instant feedback on how well it matches this job.
Job Type
Full-time
Career Level
Principal