Qualcomm is leveraging its strengths in compute, connectivity, and AI acceleration to play a central role in the evolution of Cloud AI. The Qualcomm Cloud AI team develops hardware and software platforms that enable efficient inference of large-scale foundation models. We are seeking a Staff Engineer – AI Model Optimization Architect to lead end-to-end model transformation and optimization for LLMs, VLMs, diffusion models, and multimodal models on Qualcomm inference accelerators. This role works closely with compiler, performance, and accuracy teams to translate models into accelerator-efficient execution while balancing throughput, latency, memory, and quality. The scope spans Day 0 enablement through production deployment, with a strong emphasis on scaling optimizations to future architectures.
Job Type
Full-time
Career Level
Mid Level