About The Position

Today, more intelligence is moving to end devices, and mobile is becoming a pervasive AI platform. At the same time, data centers are expanding AI capability through widespread deployment of ML accelerators. Qualcomm envisions making AI ubiquitous: expanding beyond mobile to power other end devices, data centers, vehicles, and things. We are inventing, developing, and commercializing power-efficient on-device AI, edge cloud AI, data center, and 5G technologies to make this a reality.

The Cloud AI Architecture team comprises experts spanning the full gamut from performance modeling, software architecture, algorithm development, and kernel optimization down to hardware accelerator block architecture and SoC design. We are looking for an AI Accelerator Architect to lead a team that develops performance models that accurately estimate the performance and power characteristics of workloads on known AI architectures, and that executes those models against a variety of workloads. The resulting estimates serve both internal teams and external customers.

Requirements

  • Master's degree or equivalent in Engineering or a related field
  • 8+ years of Hardware Engineering, Systems Engineering, or related work experience
  • 3+ years of experience in performance modeling of AI accelerators or similar systems
  • Team leadership experience
  • In-depth knowledge of NVIDIA/AMD GPGPU capabilities and architectures
  • Knowledge of LLM architectures and their HW requirements

Nice To Haves

  • Knowledge of computer architecture, digital circuits
  • Knowledge of modeling of communication systems
  • Knowledge of communication protocols used in AI systems
  • Understanding of Network-on-Chip (NoC) designs used in System-on-Chip (SoC) designs
  • Understanding of various memory technologies used in AI systems
  • Experience with high-level architectural hardware modeling
  • Knowledge of AI Inference and Training systems such as NVIDIA DGX and NVL72
  • Strong communication skills (written and verbal)
  • Detail-oriented with strong problem-solving, analytical and debugging skills
  • Good leadership abilities, including tracking tasks and ensuring their timely completion
  • Proficiency in Excel, including VBA scripting
  • Demonstrated ability to learn, think and adapt in a fast-changing environment
  • Ability to code in C++ and Python
  • Understanding of the fundamental operations performed in ML workloads

Responsibilities

  • Define and develop performance models designed to estimate performance and power usage of AI workloads
  • Lead a team of performance architects, providing guidance for performance studies and model enhancements
  • Execute models against requested workloads and configurations
  • Understand trends in ML network design through customer engagements and the latest academic research, and determine how those trends will affect both SW and HW design
  • Work with customers to understand workloads and performance requirements
  • Analyze current accelerator and GPU architectures
  • Suggest hardware enhancements required for efficient execution of AI workloads
  • Predict pre-silicon performance for various ML training workloads
  • Analyze performance/area/power trade-offs for future HW and SW ML algorithms, including the impact of SoC components (memory and bus)