About The Position

Today, more intelligence is moving to end devices, and mobile is becoming a pervasive AI platform. At the same time, data centers are expanding AI capability through widespread deployment of ML accelerators. Qualcomm envisions making AI ubiquitous: expanding beyond mobile to power other end devices, data centers, vehicles, and things. We are inventing, developing, and commercializing power-efficient on-device AI, edge cloud AI, data center, and 5G technologies to make this a reality.

The Cloud AI Architecture team comprises experts spanning the full gamut from performance modeling, software architecture, algorithm development, and kernel optimization down to hardware accelerator block architecture and SoC design. We are looking for an AI Accelerator Architect to lead a team that develops performance models that accurately estimate the performance and power characteristics of workloads on known AI architectures, and that executes those models against a variety of workloads. The resulting estimates serve both internal teams and external customers.

Requirements

  • Master's degree or equivalent in Engineering or a related field
  • 8+ years of Hardware Engineering, Systems Engineering, or related work experience
  • 3+ years of experience in performance modeling of AI accelerators or similar systems
  • Team leadership experience
  • In-depth knowledge of NVIDIA/AMD GPGPU capabilities and architectures
  • Knowledge of LLM architectures and their HW requirements

Nice To Haves

  • Knowledge of computer architecture, digital circuits
  • Knowledge of modeling of communication systems
  • Knowledge of communication protocols used in AI systems
  • Understanding of Network-on-Chip (NoC) designs used in System-on-Chip (SoC) designs
  • Understanding of various memory technologies used in AI systems
  • Experience with high-level architectural hardware modeling
  • Knowledge of AI Inference and Training systems such as NVIDIA DGX and NVL72
  • Strong communication skills (written and verbal)
  • Detail-oriented with strong problem-solving, analytical and debugging skills
  • Good leadership abilities, including tracking tasks and ensuring their timely completion
  • Proficiency in Excel, including VBA scripting
  • Demonstrated ability to learn, think and adapt in a fast-changing environment
  • Ability to code in C++ and Python
  • Understanding of the fundamental operations performed in ML workloads

Responsibilities

  • Define and develop performance models designed to estimate performance and power usage of AI workloads
  • Lead a team of performance architects, providing guidance for performance studies and model enhancements
  • Execute models against requested workloads and configurations
  • Understand trends in ML network design through customer engagements and the latest academic research, and determine how those trends will affect both SW and HW design
  • Work with customers to understand workloads and performance requirements
  • Analyze current accelerator and GPU architectures
  • Suggest hardware enhancements required for efficient execution of AI workloads
  • Predict pre-silicon performance for various ML training workloads
  • Analyze performance/area/power trade-offs for future HW and SW ML algorithms, including the impact of SoC components (memory and bus)