About The Position

This role is for a Cloud Hardware Dev Engineer within AWS Hardware Engineering Services, building the backbone of the Generative AI cloud at AWS for AI training and inference. The position involves delivering continuous price-performance improvements in the cloud for training multi-billion-parameter LLMs. As a member of the AWS Utility Computing (UC) organization, the engineer will contribute to product innovations ranging from foundational services such as Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to the steady stream of new releases that continue to set AWS's services and features apart in the industry. The role supports the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. The team is diverse, comprising software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. Together they collaborate across AWS to deliver the highest standards of safety and security while providing seemingly infinite capacity at the lowest possible cost for customers. The culture is inclusive, welcoming bold ideas and empowering ownership.

Requirements

  • Experience working with interdisciplinary teams to execute product design from concept to production
  • Experience developing and executing test procedures for mechanical or electrical systems/components based on design intent and approved equipment submissions
  • Knowledge of server hardware and components
  • Bachelor's degree in electrical engineering or equivalent
  • 1+ years of experience in server hardware troubleshooting and repair
  • 4+ years of experience in hardware design and validation of components, subsystems, and systems

Nice To Haves

  • Master's degree or above in electrical engineering, computer engineering, or equivalent
  • Experience managing technical projects
  • Experience in compute and storage server architecture and design for large scale applications
  • AI infrastructure hardware development and debugging experience

Responsibilities

  • Own and lead the design, development, and root-cause analysis of a new segment of accelerated servers.
  • Work closely with customers to understand their technical needs and business goals, leveraging experience with server design and the knowledge of various teams to architect the solutions that will be deployed at scale.
  • Work with an interdisciplinary team of component, firmware, test, qualification, and integration engineers, and lead design and manufacturing partners to bring these servers to the data center.
  • Oversee the fleet of deployed servers, monitoring their quality and how well they meet customer requirements.
  • Interface with internal and external customers to understand project requirements and facilitate system development on top of server design.
  • Learn the operational challenges of the existing fleet, with the goal of improving the current customer experience and developing better systems for future designs.
  • Work directly with vendors and ODM/JDM design teams to develop and manufacture products at scale.

Benefits

  • Sign-on payments
  • Restricted stock units (RSUs)
  • Health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
  • 401(k) matching
  • Paid time off
  • Parental leave