Advanced Micro Devices, Inc. - posted 2 months ago
Senior
Austin, TX
5,001-10,000 employees

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

  • Execute power attainment test plans in post-silicon phases in support of the Data Center GPU product roadmap, optimizing for power, perf/watt, and performance.
  • Configure and set up ML/AI data center GPU systems for data collection, experiments, and test plan execution.
  • Use lab equipment such as oscilloscopes, high-speed probes, function generators, and data acquisition equipment to gather the electrical characterization data required for power and performance optimization.
  • Actively participate in the analysis of post-silicon performance and power data to ensure the integrity of results, summarize results and conclusions, and drive productization of features.
  • Analyze and debug interactions between various power management features.
  • Analyze data from workload or execution output datalogs using Excel or JMP, either manually or with developed automation.
  • Execute ROI analysis of power management features and provide feedback to power management architecture team.
  • Support prototyping experiments and development of new GPU features that impact performance and power.
  • Electrically stress the system, validate the limits of ASIC and system/board components, and optimize settings for stability and performance.
  • Troubleshoot system-level issues that may occur in test environments and platforms.
  • Proactively drive continuous improvement for post-silicon power attainment activities.
  • Participate in developing the automation environment: write scripts that automate workloads and extend execution capabilities using Linux, Python, and other supporting software tools.
  • Work in a fast-paced, resource-constrained environment to build top-of-the-line HPC and AI GPU products.
  • 8+ years of hands-on experience as an engineer in the semiconductor industry.
  • Demonstrated ability to execute and deliver multiple projects in a timely fashion.
  • Ability to prioritize work items in a fast-paced environment and escalate as necessary.
  • Excellent grasp of computer organization/architecture, GPU architecture and power management.
  • Knowledge of power-limited performance methodologies and control theory.
  • Extensive experience in platform optimization and solid knowledge of computer I/O.
  • Experience with tools for power and performance analysis.
  • Strong programming skills; scripting experience in Python preferred.
  • Familiarity with HPC/AI applications and benchmarks is a big plus.
  • Proficiency in the Linux command-line environment and shell scripting is desirable.
  • Deep knowledge of power management techniques such as deep sleep, clock gating, and P-states.
  • Experience with container technologies (e.g., Docker).
  • Strong analytical and problem-solving skills with keen attention to detail.
  • Experience in data analysis, summarization, and presentation.
  • Excellent presentation and communication skills.
  • Experience using and debugging lab tools such as oscilloscopes, DAQs, and power measurement equipment.