Principal Engineer, CPU Architecture & Performance Research

Samsung Semiconductor•San Jose, CA

1d•Onsite

About The Position

The Architecture Research Lab is seeking a Principal CPU Architecture & Performance Engineer to lead the definition, analysis, and optimization of next-generation CPU microarchitectures (RISC-V core). This role is focused on end-to-end performance: from architectural trade-offs and workload characterization to microarchitectural modeling, simulation, and silicon bring-up correlation. You will work closely with architecture, design, compiler, and system teams to drive performance and efficiency across a broad set of real-world workloads.

Location: Daily onsite presence at our San Jose office in alignment with our Flexible Work policy.

Requirements

Master’s degree in Computer Engineering, Computer Science, or a related field with 18+ years of experience; PhD with 15+ years of experience preferred.
10+ years of experience in CPU microarchitecture and/or performance engineering.
Experience with RISC-V, ARM, or x86 architectures.
Strong understanding of out-of-order execution, branch prediction, pipelines, and speculation, as well as cache coherence, memory systems, prefetching, and NUMA effects.
Hands-on experience with architectural simulators (such as gem5).
Strong programming skills in C/C++ and Python.
Familiarity with compiler optimizations and hardware/software co-design.
Familiarity with SIMD / Vectors / VME for AI inference workloads.
Experience analyzing large performance datasets and traces.
Proven ability (tapeouts, patents, publications) to influence architectural and microarchitectural decisions through quantitative analysis.

Nice To Haves

Background in power/performance/area (PPA) trade-off analysis.
Experience with SIMD/Vectors.
Experience with compiler optimizations and hardware/software co-design.
Prior technical leadership at Senior Staff or Principal level.

Responsibilities

Define and evaluate CPU micro-architectural features for future cores (frontend, execution engine, memory hierarchy, interconnect).
Lead performance analysis using simulators and RTL.
Help develop and validate performance models (cycle-accurate, trace-driven, statistical).
Characterize workloads (SPEC, server, client, AI/ML, cloud, internal traces) and translate findings into architectural requirements.
Identify performance bottlenecks and propose data-driven optimizations.
Drive architecture-to-implementation alignment with the design team.
Collaborate with compiler, OS, and system architects on cross-stack performance issues.
Mentor senior and staff engineers; provide technical leadership across projects.
Produce work leading to patents and publications.