CPU/DSP Performance Engineer

Qualcomm, Austin, TX
6d

About The Position

About the Role: Qualcomm is seeking a low‑level embedded performance engineer with a strong foundation in CPU, DSP, and system architecture to drive end‑to‑end performance improvements across our automotive compute platforms. This role focuses on CPU and NPU interactions: analyzing and optimizing latency, throughput, power, and utilization across the full software and hardware stack. We’re looking for someone with a hacker mindset: curious, resourceful, and eager to dive deep into systems, reverse engineer behavior, and solve complex performance challenges close to the metal. The ideal candidate will be proficient in processor architecture, C/C++ and assembly, embedded operating systems, and performance profiling tools. Experience with AI workloads and large language models (LLMs) is a plus but not required.

Requirements

  • Proficiency in low‑level debugging and performance analysis, including interpreting traces, counters, and system‑level behavior.
  • Strong background in operating systems, computer architecture, and micro-architecture.
  • Experience with multi-threaded and multi-processor systems.
  • Strong programming skills in C/C++, Python, or similar.
  • Excellent problem-solving skills and attention to detail.
  • Strong collaboration and communication skills.
  • Bachelor’s degree in Electrical Engineering, Computer Science, Computer Engineering, or related field and 2+ years of relevant experience.
  • OR Master’s degree in a related field and 1+ years of relevant experience.
  • OR PhD in a related field.
  • Bachelor's degree in Electrical Engineering, Computer Science, Computer Engineering, or related field and 4+ years of Software Engineering, Electrical Engineering, Systems Engineering, or related work experience.
  • Master's degree in Electrical Engineering, Computer Science, Computer Engineering, or related field and 3+ years of Software Engineering, Electrical Engineering, Systems Engineering, or related work experience.
  • PhD in Electrical Engineering, Computer Science, Computer Engineering, or related field and 2+ years of Software Engineering, Electrical Engineering, Systems Engineering, or related work experience.
  • 2+ years of experience with high-performance microprocessor design.

Nice To Haves

  • Hands-on experience with debuggers (gdb, lldb, WinDbg, or similar) and performance profiling tools (perf, VTune, Nsight, or similar).
  • Familiarity with SIMD and SPMD execution models.
  • Understanding of hardware-software co-design principles.
  • Familiarity with linear algebra and precision-aware arithmetic is a plus.
  • Understanding of how ML architectures map to the hardware is a plus.
  • Knowledge of ML frameworks and libraries (ggml/llama.cpp or similar) is a plus.

Responsibilities

  • Benchmark and analyze performance of multi-threaded and multi-processor software.
  • Identify and resolve performance bottlenecks across software, architecture, and micro-architecture layers.
  • Analyze ML KPIs to guide optimization efforts.
  • Collaborate with cross-functional teams across hardware and software domains.
  • Develop and maintain tools for performance analysis and tuning.
  • Stay current with advancements in computer architecture, micro-architecture, and AI model design.