AI/HPC Cluster Thermal Design Engineer

Advanced Micro Devices, Inc.

Austin, TX

About The Position

We are seeking a Cluster Thermal Engineer to help architect and deliver scalable thermal solutions for AI/HPC clusters and data center deployments. In this role, you will support the evaluation, modeling, and validation of cooling architectures for high-density platforms. You will work closely with system architects, platform engineers, and data center operations teams to ensure thermal performance, reliability, and serviceability across development and deployment environments.

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary.

When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

Requirements

  • Strong understanding of fundamentals: thermodynamics, fluid dynamics, and heat transfer.
  • Familiarity with electronics cooling concepts (heat sinks, cold plates, TIMs, heat exchangers, pumps, valves, fans).
  • Exposure to data center or cluster thermal concepts such as rack/row layout considerations, CDU/coolant distribution, RDHx/AHU/fan-wall concepts, chilled-water interfaces, and heat rejection.
  • Exposure to one or more thermal/CFD simulation tools (ANSYS, COMSOL, FloTHERM, OpenFOAM, or similar).
  • Familiarity with measurement and validation practices (instrumentation, uncertainty, sensor placement, data analysis).
  • Comfort working in cross-functional engineering environments and communicating technical ideas clearly.

Nice To Haves

  • Understanding of PUE/WUE drivers, economizers/free cooling, or waste-heat reuse concepts.
  • Coursework in heat transfer, thermodynamics, two-phase flow and heat transfer, or refrigeration.
  • Projects, internships, or research related to HPC/AI infrastructure, data centers, or high-power electronics cooling.

Responsibilities

  • Support the thermal design of AI/HPC cluster solutions, including compute racks, cooling loops, and facility interfaces.
  • Assist in evaluating cooling architectures (air cooling, direct liquid cooling, hybrid approaches) and identifying trade-offs in performance, cost, complexity, and reliability.
  • Build and refine thermal and airflow models for system/cluster/data center concepts using industry tools (e.g., OpenFOAM, ANSYS, FloTHERM, or similar).
  • Contribute to flow-network modeling for liquid cooling and coolant distribution analyses to ensure adequate flow, pressure, and temperature margins.
  • Help define and execute test plans to validate thermal performance at component, system, and rack/cluster levels.
  • Support integration of cooling solutions and thermal telemetry at the cluster level in collaboration with power, networking, platform, firmware, and controls teams.
  • Participate in design reviews with internal stakeholders and external partners/customers; summarize findings and track action items.
  • Assist with experimental setup, instrumentation, data collection, and analysis in partnership with validation and lab teams.
  • Create and maintain technical documentation, including requirements, design notes, modeling assumptions, test reports, and user/customer-facing summaries.
  • Support issue triage and root-cause analysis for thermal performance and reliability concerns during bring-up and deployment.

Benefits

  • AMD benefits at a glance