XR Perception Performance & Optimization Engineer, Sr

Qualcomm•San Diego, CA

7d•Onsite

About The Position

Responsible for in-depth analysis of XR and mobile chipset architectures for concurrent Perception workloads and for enabling optimal use of hardware and software features. We seek a passionate engineer with deep software and system design experience to develop an understanding of chip-to-chip HW/SW enhancements impacting perception execution, analyze system bottlenecks across key BSP components, and propose algorithm and system-level optimizations for both single and concurrent perception workloads. Our solutions leverage dedicated hardware, multi-core processors, DSP, and GPU cores to provide Head Tracking, Hand Tracking, 3D Scene Understanding, Object Detection and Tracking, and Depth for building XR experiences across a portfolio of solutions.

Requirements

Bachelor's degree in Engineering, Information Systems, Computer Science, or related field and 2+ years of Systems Engineering or related work experience.
OR Master's degree in Engineering, Information Systems, Computer Science, or related field and 1+ year of Systems Engineering or related work experience.
OR PhD in Engineering, Information Systems, Computer Science, or related field.
Experience in performance profiling, debugging, and optimization of real-time or latency-sensitive workloads (preferably perception / ML / CV pipelines).
Strong understanding of system-level execution on embedded platforms, including compute accelerators and runtime stacks and their performance tradeoffs across chip variants.
Practical experience analyzing BSP/runtime interactions that affect compute workloads (e.g., IPC overhead, scheduling behavior, memory/DDR effects).
Strong programming and debugging skills in C/C++ and Python (or equivalent).
Experience with OS principles and HW/SW interactions.
Ability to coordinate across multiple teams, track technical dependencies, and communicate risks/impact clearly to technical leads and stakeholders.

Responsibilities

Understand and analyze chip-to-chip differences (e.g., on-chip compute/AI accelerators, DSP/HTP variants, memory subsystem characteristics) that impact perception performance, power, latency, and concurrency behavior.
Profile perception algorithms in both single and concurrent execution scenarios; identify bottlenecks, resource contention, scheduling inefficiencies, and memory bandwidth constraints.
Collaborate with Perception teams and drive perception algorithm optimizations per chip, aligned with upcoming hardware and software improvements.
Propose and maintain concurrency priorities and resource allocation/reservation strategies across perception algorithms to meet product latency and throughput goals.
Understand connectivity options (e.g., device-to-device / distributed compute scenarios) and analyze how connectivity choices affect distributed perception workload partitioning, latency, and robustness.
Understand and debug BSP software components that impact perception workloads, including FastRPC, QuRT, QNN, and QAIRT (and related runtime/IPC/execution paths).
