Sr. GPU/Accelerator Hardware Development Engineer

Amazon
Cupertino, CA
$159,200 - $247,600
Onsite

About The Position

Would you like to develop the next generation of AI accelerator compute systems? Lead bleeding-edge HW development projects? Have you heard of Amazon Web Services (AWS) Project Rainier? This is the opportunity to be part of a fast-moving innovation team that is changing the world of AI at massive scale. At AWS Trainium we develop a complete vertical stack, from our own silicon to hardware to software, and deploy it directly to our customers in our own data centers.

We are seeking experienced Lead System Design Engineers to build the next generation of our cloud server infrastructure, Project Rainier. Project Rainier is a massive $11 billion AWS AI infrastructure initiative, featuring one of the world's largest compute clusters dedicated to training and running Anthropic’s Claude AI models. It uses over 500,000 custom Trainium2 chips, designed for high-performance AI training.

As a member of the AWS Trainium Machine Learning Acceleration team, you’ll be responsible for the system design and optimization of hardware in our data centers. You’ll provide leadership in the application of new technologies to large-scale server deployments in a continuous effort to deliver a world-class customer experience. This is a fast-paced, intellectually challenging position, and you’ll work with thought leaders in multiple technology areas. You’ll have high standards for yourself and everyone you work with, and you’ll constantly look for ways to improve your product’s performance, quality, and cost. We’re changing the industry, and we want individuals who are ready for this challenge and want to reach beyond what is possible today.

Requirements

  • Bachelor's degree in electrical engineering, computer engineering, or equivalent
  • Minimum of 5 years of experience with high-speed system design and validation
  • Experience with schematic and layout tools
  • Strong knowledge in electrical engineering fundamentals, power & signal integrity, and analog/digital circuits
  • Experience with hardware development process and system development across full product life cycles
  • Experience using lab equipment such as bench power supplies, high-speed oscilloscopes, logic analyzers, spectrum analyzers, VNAs, and thermal chambers
  • Experience with supply chain management

Nice To Haves

  • Lead bleeding-edge HW development projects
  • Lead System Design Engineers to build the next generation of our cloud server infrastructure, Project Rainier
  • Strong skills in both hardware and software
  • Thrive in a fast-paced, startup-like environment and work independently to deliver multiple projects in parallel.
  • Highly motivated and detail-oriented while meeting the highest standards and time-to-market, cost, and quality goals.
  • Lead the end-to-end server hardware development lifecycle, from concept and architecture through design, validation, and production.
  • Drive PCB design for server motherboards, accelerator carrier boards, and high-speed interconnect boards.
  • Collaborate with silicon, firmware, and system software teams to enable optimal hardware/software co-design.
  • Improve compute density, power efficiency, and network bandwidth utilization.
  • Drive root cause analysis for hardware issues during validation and production.

Responsibilities

  • System design and optimization of hardware in our data centers.
  • Provide leadership in the application of new technologies to large-scale server deployments.
  • Responsible for system design, validation, and integration of hardware in the AWS fleet through its entire life cycle.
  • Work cross-functionally with AWS monitoring teams, members of the Hardware Design team, and additional teams across AWS to improve the quality and reliability of products operating in the fleet.
  • Drive ODM HW development and testing and be part of the Production flow definition team.
  • Drive selection and validation of electrical and mechanical components and cables.
  • Lead the end-to-end server hardware development lifecycle, from concept and architecture through design, validation, and production.
  • Drive PCB design for server motherboards, accelerator carrier boards, and high-speed interconnect boards.
  • Collaborate with silicon, firmware, and system software teams to enable optimal hardware/software co-design.
  • Improve compute density, power efficiency, and network bandwidth utilization.
  • Drive root cause analysis for hardware issues during validation and production.

Benefits

  • health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
  • 401(k) matching
  • paid time off
  • parental leave
  • sign-on payments
  • restricted stock units (RSUs)