Senior Engineer, GPU Performance Architect (PPA)

Samsung Electronics • San Jose, CA

1d • $124,000 - $208,400

About The Position

Samsung, a world leader in advanced semiconductor technology, is founded on a simple philosophy: the endless pursuit of excellence will create a better world for all. At Samsung Austin Research and Development Center (SARC) and Advanced Computing Lab (ACL), we are building a center of excellence for Intellectual Property (IP) that is applied to high-performance computing devices (mobile, automotive, and other custom market segments) consumed by millions of people around the world. Come build with us!

Role and Responsibilities

As a Senior Engineer, GPU Architect, you will work on the analysis, verification, and optimization of end-to-end system performance for Samsung’s premium mobile GPUs. In this mid-to-senior individual contributor role, you will contribute to the performance strategy and guide architectural decisions that define the efficiency and scalability of Samsung’s GPU designs.

You bring expertise in GPU architecture, performance analysis and verification, and waveform-level and RTL debugging, along with curiosity for advancing performance and power efficiency across complex systems. You are passionate about contributing to the design and execution of advanced performance modeling and analysis for GPU pipelines, including shader-level analysis, pipeline optimization, system-level characterization, and architectural prototyping. You enjoy learning about building and refining models and tools using C/C++, Python, and cycle-approximate frameworks to analyze and validate key metrics (such as latency, throughput, and power consumption), developing benchmarks, identifying bottlenecks, and proposing data-driven optimizations. You ensure design excellence and correctness through prototyping, model-to-RTL correlation, and deep-dive validation, including simulation, waveform analysis using tools like Synopsys Verdi, functional debug, and performance verification against required specifications.

You proactively seek cross-functional collaboration with global teams, clearly communicate architecture proposals to diverse audiences, and exercise data-driven decision making to ensure seamless integration of the GPU into the overall system. You take initiative on moderate-to-complex projects and help advance best practices and methodologies by staying ahead of industry trends and emerging technologies.

Requirements

5+ years of experience with a Bachelor’s Degree in Computer Science/Engineering, or 3+ years of experience with a Master’s Degree, or 1+ years of experience with a Ph.D.
5+ years of experience in broad GPU system-level architecture (not limited to a specific block), performance analysis, verification, and optimization.
Hands-on experience in RTL (SystemVerilog/Verilog).
Familiarity with waveform-level debugging tools (e.g., Synopsys Verdi) and RTL debugging.
Solid programming skills in C/C++ and Python.
Strong analytical and problem-solving skills, with the ability to identify bottlenecks and propose data-driven solutions.
Excellent communication and collaboration skills, with the ability to navigate ambiguity in a fast-paced, global team environment.

Nice To Haves

Experience with prototyping GPU optimizations.
Experience with GPU profiling tools (e.g., RenderDoc, PIX, AMD RGP, NVIDIA Nsight) to analyze and optimize graphics performance, power consumption, and system-level interactions.
Knowledge of OpenGL, Vulkan, DX11/12.
Experience with mobile platforms.

Responsibilities

Analyze, verify, and optimize end-to-end system performance for Samsung’s premium mobile GPUs.
Contribute to the performance strategy and guide architectural decisions that define the efficiency and scalability of Samsung’s GPU designs.
Contribute to the design and execution of advanced performance modeling and analysis for GPU pipelines, including shader-level analysis, pipeline optimization, system-level characterization, and architectural prototyping.
Build and refine models and tools using C/C++, Python, and cycle-approximate frameworks to analyze and validate key metrics (such as latency, throughput, and power consumption), develop benchmarks, identify bottlenecks, and propose data-driven optimizations.
Ensure design excellence and correctness through prototyping, model-to-RTL correlation, and deep-dive validation, including simulation, waveform analysis using tools like Synopsys Verdi, functional debug, and performance verification against required specifications.
Proactively seek cross-functional collaboration with global teams, clearly communicate architecture proposals to diverse audiences, and exercise data-driven decision making to ensure seamless integration of the GPU into the overall system.
Take initiative on moderate-to-complex projects and help advance best practices and methodologies by staying ahead of industry trends and emerging technologies.