Would you like to develop the next generation of AI accelerator compute systems? Lead bleeding-edge HW development projects? Have you heard of Amazon Web Services (AWS) Project Rainier? This is the opportunity to be part of a fast-moving innovation team that is changing the world of AI at massive scale. At AWS Trainium we develop a complete vertical stack, from our own silicon to hardware to software, and deploy it directly to our customers in our own data centers.
We are seeking experienced Lead System Design Engineers to build the next generation of our cloud server infrastructure, Project Rainier. Project Rainier is a massive $11 billion AWS AI infrastructure initiative, featuring one of the world's largest compute clusters dedicated to training and running Anthropic’s Claude AI models. It uses over 500,000 custom Trainium2 chips, designed for high-performance AI training.
As a member of the AWS Trainium Machine Learning Acceleration team, you’ll be responsible for the system design and optimization of hardware in our data centers. You’ll provide leadership in applying new technologies to large-scale server deployments in a continuous effort to deliver a world-class customer experience. This is a fast-paced, intellectually challenging position, and you’ll work with thought leaders in multiple technology areas. You’ll have high standards for yourself and everyone you work with, and you’ll constantly look for ways to improve your product's performance, quality, and cost. We’re changing the industry, and we want individuals who are ready for this challenge and want to reach beyond what is possible today.
Key job responsibilities
We are looking for a Lead Hardware Design Engineer with strong skills in both hardware and software. In this role, you will be responsible for system design, validation, and integration of hardware in the AWS fleet through its entire life cycle.
You will work cross-functionally with AWS monitoring teams, members of the Hardware Design team, and additional teams across AWS to improve the quality and reliability of products operating in the fleet. We are looking for candidates who thrive in a fast-paced, startup-like environment and can work independently to deliver multiple projects in parallel. To be successful, you need to be highly motivated and detail-oriented while meeting the highest standards and time-to-market, cost, and quality goals.
Job Type
Full-time
Career Level
Mid Level