Senior CPU Performance Architect

NVIDIA, Austin, CA

About The Position

Do you want to help drive the development of CPU technology for architectures used for artificial intelligence (AI) / deep learning (DL), high-performance computing (HPC), cloud service providers (CSP), gaming, virtual reality, and autonomous vehicles? Come join the CPU performance architecture team and help us push performance boundaries for all our CPU products! With the introduction of the Grace CPU Superchip and, more recently, the announcement of the Vera CPU, NVIDIA has expanded into the CPU server market, complementing our world-class GPUs and SoCs. These CPUs play a critical role in orchestrating complex workloads with exceptional performance-per-watt efficiency. The CPU architecture team is driving innovations that integrate seamlessly with NVIDIA’s broader technology stack, enabling faster AI model training, agentic use cases, efficient data processing, and scalable cloud deployments.

Requirements

  • BS/MS in Electrical Engineering, Computer Science, Computer Engineering, or equivalent experience.
  • 12+ years of relevant experience.
  • Experience with CPU workloads and performance analysis.
  • Knowledge of performance test development and benchmarking for CPU and I/O.
  • Deep knowledge of CPU microarchitecture and system architecture.
  • Experience with the ARM instruction set architecture (ISA) preferred but not required.

Nice To Haves

  • PhD or research experience.
  • GPU driver experience.
  • Knowledge of GPU-accelerated workloads and modeling performance of accelerated workloads.
  • Experience with performance optimization of AI frameworks such as PyTorch.

Responsibilities

  • Work on workload bring-up and performance analysis/projection, both on silicon and in full-system simulators.
  • Study workloads for a wide range of markets, including AI/DL, CSP, HPC, and autonomous vehicles.
  • Study real-world use cases, identify critical application behavior, and reduce it to directed test cases.
  • Analyze and debug performance scaling bottlenecks on multi-core and multi-socket CPU and CPU/GPU systems.
  • Work with CPU and interconnect architects to improve future CPU and system designs based on your findings.
  • Benchmark NVIDIA’s CPU offerings against competition and suggest software or hardware improvements.

Benefits

  • You will be eligible for equity and benefits.