Data Center Systems Engineer

Advanced Micro Devices, Inc.
Austin, TX
Onsite

About The Position

The Data Center Platform Engineering Group (DPEG) is responsible for deploying, validating, and sustaining AMD’s most advanced CPU and GPU platforms inside large-scale data center environments. This team operates at the intersection of silicon, firmware, systems, and real-world infrastructure, ensuring platforms perform reliably under demanding workloads.

As a Senior Systems Design Engineer, you will be a hands-on technical owner for platform deployment and availability in a live data center setting. You’ll work directly with cutting-edge GPU/CPU systems, diagnosing complex hardware, firmware, power, thermal, networking, and clustering issues that occur at scale. This role offers deep exposure to how next-generation compute platforms behave in production, close collaboration with cross-functional engineering teams and external partners, and the opportunity to drive continuous improvements that directly impact system reliability and uptime. This is an on-site, fast-paced environment where you will learn by solving real problems, influencing platform readiness, and working with highly experienced engineers across the hardware and software stack.

Requirements

  • Self-driven systems thinker who enjoys owning complex technical problems from initial failure through root cause and resolution.
  • Naturally curious, methodical in debugging approach, and comfortable navigating ambiguity in large, interconnected systems.
  • Clear communicator with both technical and non-technical stakeholders.
  • Strong collaborator across disciplines.
  • Confident working with vendors and partners.
  • Takes pride in driving issues to closure, improving processes along the way, and mentoring or guiding others when needed.

Nice To Haves

  • Systems engineering experience supporting complex CPU, GPU, or SoC-based platforms
  • Platform or system-level validation, bring-up, or design reliability experience
  • Debugging expertise across BIOS, BMC, Linux, and hardware interfaces
  • Familiarity with test automation and failure-analysis methodologies
  • Experience working with OEM, ODM, or hardware vendors
  • Knowledge of high-speed interconnects such as PCIe Gen5
  • Hands-on experience using hardware lab equipment (e.g., scopes, programmers, system bring-up tools)
  • Exposure to FPGA, CPLD, or firmware development environments

Responsibilities

  • Manage platform deployment, availability, and stability for data center CPU/GPU systems
  • Lead system-level debugging efforts involving hardware, firmware, Linux, networking, power, and thermal behavior
  • Develop structured debug strategies, validation flows, and failure-analysis methodologies
  • Track daily activities, prioritize issues, assign work, and monitor progress to resolution
  • Collaborate with silicon, firmware, validation, networking, and operations teams to assess risks and requirements
  • Execute hands-on laboratory validation to ensure systems operate as intended prior to and during deployment
  • Partner with OEMs, ODMs, and vendors to resolve issues and improve platform reliability
  • Drive continuous improvement in tools, processes, and platform readiness

Benefits

  • AMD benefits at a glance