Failure Analysis Engineer - Power & Design

Advanced Micro Devices, Inc
Secaucus, NJ
Onsite

About The Position

The Quality Engineering team is looking for an experienced Failure Analysis Engineer - Power & Design with strong expertise in board architecture, failure isolation, and rail bring-up. This individual will support customer and factory failure investigations for GPU accelerators, with primary ownership of PCB triage and board-level fault isolation. They will review schematics, layouts, and power architecture to develop targeted debug strategies, run diagnostics and functional test DOEs to reproduce and isolate failures, and work closely with design, validation, FW, and manufacturing teams to accelerate root cause analysis and corrective actions. Their contributions will directly impact product quality, reliability, and customer satisfaction.

Requirements

  • Strong electrical engineering foundation and deep experience in hardware design, board bring-up, and electrical debug.
  • Strong analytical mindset and skilled at triaging complex PCB failures by narrowing issues to the board, component, rail, or system interaction level.
  • Comfortable running diagnostics and designing functional test DOEs to reproduce and isolate hard-to-find failures, while working effectively across design, validation, manufacturing, and repair teams.
  • Strong communication and documentation skills for clear reporting and collaboration.
  • Curiosity and persistence to drive timely, high-quality root cause analysis and corrective actions.
  • Bachelor’s degree in Electrical Engineering, Computer Engineering, or a related field.

Nice To Haves

  • Deep expertise in electrical engineering fundamentals, PCBA design, power delivery architecture, and hardware debug, including diagnostics and functional test development.
  • Skilled in using lab equipment (oscilloscopes, logic analyzers, power analyzers, and custom test tools) for board bring-up, rail validation, and hardware debug.
  • Strong background in PCB triage, board-level failure analysis, PCBA diagnostics, power delivery debug, and failure isolation techniques from NPI through production.
  • Proficient in Python, shell scripting, and working across Windows and Linux environments.
  • Solid understanding of firmware, drivers, and hardware interactions, with the ability to tune firmware as needed.
  • Extensive experience in hardware verification and system integration.
  • Hands-on experience assembling, installing, and configuring computer systems and servers.
  • Strong communication, documentation, collaboration, and presentation skills.
  • Able to read schematics, interpret datasheets, identify components, and perform soldering/rework to support efficient hardware debug and failure isolation.
  • Knowledge of high-speed digital design, power delivery networks, voltage regulator behavior, memory interfaces (HBM, GDDR), PCIe, and display outputs (DP, HDMI).
  • Experience with GPU data center infrastructure and AI/ML technologies is a plus.

Responsibilities

  • Support internal and external requests to troubleshoot AMD GPU product failures with primary focus on PCB triage, power delivery debug, and board-level failure isolation for continuous yield, quality, and customer support improvements.
  • Develop and execute diagnostics and functional test DOEs to reproduce, characterize, and isolate difficult board- and power-related failures.
  • Develop automation and tools to run tests and analyze results and logs.
  • Perform structured PCB triage by narrowing failures to the board, component, power rail, layout interaction, or system integration level, and work with the contract manufacturer and internal AMD teams to reproduce failures, isolate root cause, and determine the most effective next steps for debug and corrective action.
  • Use board schematics, layout data, and power delivery design knowledge to understand circuit behavior, trace power and signal paths, form debug hypotheses, and build targeted validation plans that drive efficient fault isolation and high-quality failure analysis.
  • Document all findings in the FA database and create a complete failure analysis report for customer consumption as needed.
  • Present findings to key stakeholders, including senior management.
  • Continuously improve failure analysis processes and techniques, and write procedures documenting the steps to follow.
  • Oversee the set-up of new products and test stations for Failure Analysis operations.

Benefits

  • AMD benefits at a glance