Summer 2026 PhD HPC & AI GPU Performance Intern

Advanced Micro Devices, Inc., Austin, TX

About The Position

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

As an AMD intern, you’ll be placed at the epicenter of the AI ecosystem, working alongside experts and industry pioneers. You’ll do important work, learn new skills, expand your network, and gain real-world experience on projects that impact millions of end users worldwide. Whether you’re an undergrad or a PhD student, your contributions matter—and your experience here will be a launchpad for what comes next.

Job Details

  • Location: Austin, Texas
  • Onsite: This role requires the intern to work full time (40 hours a week) from the Austin office throughout the internship term.
  • Duration: May 18, 2026 – August 7, 2026

We are seeking a highly motivated GPU & CPU research intern to join our team. In this role you will participate in one or more of the assignments listed under Responsibilities below. This role is not eligible for visa sponsorship.

Note: By submitting your application, you are indicating your interest in AMD intern positions. We are recruiting for multiple positions, and if your experience aligns with any of our intern opportunities, a recruiter will contact you.

Requirements

  • You are pursuing a PhD in Computer Science, Computational Science, Electrical/Computer Engineering, Applied Mathematics, or a related field.
  • You have a strong background in parallel computing, distributed systems, or AI/ML frameworks.
  • You are passionate about technology and eager to use GPUs to improve application performance.
  • Proficiency in programming languages such as Python, C/C++, or CUDA.
  • Experience with at least one deep learning framework (e.g., PyTorch, TensorFlow, JAX).
  • Familiarity with MPI, OpenMP, or GPU programming.
  • Solid understanding of numerical methods, optimization, or scientific computing.

Nice To Haves

  • Experience with performance analysis, hot-spot identification, developing GPU kernels, and analyzing and/or quantifying the benefits of GPU offloading.

Responsibilities

  • Build, run, and analyze the performance of benchmarks and applications on GPU-accelerated platforms.
  • Assess development tools and runtime environments in terms of capabilities, performance, and usability.
  • Explore the benefits of different code optimization techniques.
  • Optimize communication patterns for HPC applications.
  • Investigate how GPU development tools can be adapted to future AMD GPUs.
  • Identify and measure performance capabilities across GPU families.
  • Use LLM workflows (and potentially agents) together with Atlassian/Jira and internal AMD AI tools to locate and incorporate existing bug reproducers.
  • Port simple synthetic workloads, and possibly proxy applications, to AdaptiveCpp (SYCL) and measure their performance.
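The performance-analysis work described above follows a measure-compare-optimize loop. As a toy illustration only (plain Python timers standing in for GPU profiling; both function names are hypothetical), here is a sketch of comparing a naive implementation against an optimized one and reporting the speedup:

```python
"""Miniature sketch of hot-spot comparison: time a naive reduction
against an optimized equivalent. Illustrative only -- real work would
profile GPU kernels with vendor tools rather than Python timers."""
import timeit

def naive_sum_of_squares(n):
    # Interpreted loop: stands in for the "unoptimized kernel".
    total = 0
    for i in range(n):
        total += i * i
    return total

def optimized_sum_of_squares(n):
    # Closed form sum_{i=0}^{n-1} i^2 = n(n-1)(2n-1)/6:
    # stands in for the "optimized kernel".
    return n * (n - 1) * (2 * n - 1) // 6

if __name__ == "__main__":
    n = 100_000
    # Correctness check before timing: both must agree.
    assert naive_sum_of_squares(n) == optimized_sum_of_squares(n)

    t_naive = timeit.timeit(lambda: naive_sum_of_squares(n), number=10)
    t_fast = timeit.timeit(lambda: optimized_sum_of_squares(n), number=10)
    print(f"naive: {t_naive:.4f}s  optimized: {t_fast:.4f}s  "
          f"speedup: {t_naive / t_fast:.1f}x")
```

The same discipline — verify correctness first, then measure, then compare — carries over directly to evaluating GPU kernels and offloading decisions.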


What This Job Offers

  • Job Type: Full-time
  • Career Level: Intern
  • Education Level: Ph.D. or professional degree
