AI Scale-up Switch System Design Engineer

Advanced Micro Devices, Inc., Secaucus, NJ

About The Position

We are looking for a hands-on, technically sharp system design engineer to join our growing team and lead the bring-up of cutting-edge scale-up switches at the heart of next-generation AI rack infrastructure. As a key contributor, you will bring deep expertise in high-speed Ethernet, server management, and platform validation to drive switch platforms from initial power-on through full system qualification. In this role, you will take full ownership of bring-up execution, apply your debugging skills to solve complex multi-layer problems, and collaborate closely with hardware, firmware, and software teams to deliver production-ready systems.

Requirements

  • Extensive hands-on experience in hardware bring-up, platform validation, or high-speed networking silicon characterization
  • Experience with high-speed switch ASICs (Broadcom TH6/Tomahawk series preferred) and familiarity with Broadcom's SDK/DAPI frameworks
  • Deep understanding of high-speed Ethernet standards (400GbE, 800GbE) including AN/LT (IEEE 802.3), RS-FEC / KP4-FEC, and PAM4 SerDes technology
  • Hands-on experience with PRBS testing, BER measurement, eye diagram analysis, and Snake/loopback traffic validation methodologies
  • Familiarity with LinkCAT or equivalent PHY/link characterization tools
  • Experience with server management protocols: IPMI, Redfish/OpenBMC, KCS, IPMB, and PLDM for out-of-band control and telemetry
  • Proficiency in Python for test automation, log parsing, and data analysis
  • Strong debugging skills, including comfort working across hardware (oscilloscope, protocol analyzer), firmware logs, and software traces to isolate root cause
  • Experience reading schematics and PCB layout to correlate signal integrity observations with hardware design
  • Excellent communication skills with the ability to document findings clearly and collaborate across multidisciplinary teams
  • Bachelor’s or Master’s degree in Computer Science or a related field strongly preferred

Nice To Haves

  • Experience with high-density switch/router platforms or AI/ML fabric infrastructure

Responsibilities

  • Lead the system bring-up and validation of state-of-the-art AI scale-up switches purpose-built for high-density GPU compute racks, from initial power-on through full system validation
  • Perform high-speed SerDes and link bring-up, including configuring and validating Auto-Negotiation/Link Training (AN/LT), tuning TX equalization, and characterizing signal integrity across 200G/400G/800G interfaces
  • Execute comprehensive link qualification testing using PRBS (Pseudo-Random Binary Sequence), Snake Traffic loopback testing, and FEC (Forward Error Correction) analysis to validate BER performance at scale
  • Utilize LinkCAT and Broadcom SDK tools to characterize port performance, diagnose link failures, and validate PHY configurations across large port counts
  • Integrate and validate server management infrastructure including BMC/IPMI, Redfish API, and out-of-band management workflows for automated bring-up and health monitoring
  • Develop and maintain bring-up scripts and test automation (Python) to accelerate validation coverage across chassis configurations
  • Debug complex system-level failures spanning hardware, firmware, and software, including signal integrity issues, firmware crashes, and management-plane anomalies, and drive each issue to root cause
  • Collaborate with hardware, firmware, and software teams to reproduce failures, document findings, and verify fixes across platform revisions
  • Maintain detailed bring-up documentation, test reports, and issue tracking throughout the product development lifecycle

Benefits

  • AMD benefits at a glance