Qualcomm · Posted 3 months ago
$178,400 - $267,600/Yr
Mid Level
San Diego, CA
5,001-10,000 employees

Qualcomm is utilizing its traditional strengths in digital wireless technologies to play a central role in the evolution of Cloud AI. We are investing in several supporting technologies, including Deep Learning. The Qualcomm Cloud AI team is developing hardware and software solutions for Inference Acceleration. We are hiring AI Performance Engineers at multiple levels to join our dynamic, collaborative team. This role spans the full product lifecycle, from cutting-edge research and development to commercial deployment, and demands strategic thinking, strong execution, and excellent communication skills.

Responsibilities:
  • Convert, optimize, and deploy models for efficient inference using PyTorch and ONNX.
  • Work at the forefront of GenAI by understanding advanced algorithms (e.g. attention mechanisms, MoEs) and numerics to identify new optimization opportunities.
  • Analyze and optimize inference performance of LLMs, VLMs, and diffusion models.
  • Scale performance for throughput and latency constraints.
  • Map next-generation AI workloads onto current and future hardware designs.
  • Work closely with customers, collaborating with internal compiler, firmware, and platform teams to drive solutions.
  • Analyze complex performance and stability issues to root-cause underlying problems.
  • Build engineering solutions that deliver continuous insight into AI workload performance and guide improvements over time.
  • Design and implement high-level kernels, e.g. in Triton, with a focus on generating efficient, low-level code.
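The responsibilities above center on transformer inference optimization. As a purely illustrative sketch (not part of the posting), the scaled dot-product attention mentioned among these workloads can be written in a few lines of NumPy:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Minimal single-head attention: softmax(Q K^T / sqrt(d)) V."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                  # (seq_q, seq_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over the key dimension
    return weights @ v                             # attention-weighted sum of values

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))   # 4 query positions, head dim 8
k = rng.standard_normal((6, 8))   # 6 key positions
v = rng.standard_normal((6, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```

Production kernels (e.g., in Triton) fuse these steps and tile them for the memory hierarchy, but the dataflow being optimized is exactly this.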
Qualifications:
  • Hands-on experience building and optimizing language models, notably with PyTorch and ONNX, preferably in production-grade environments.
  • Deep understanding of transformer architectures, attention mechanisms and performance trade-offs.
  • Experience with workload mapping strategies such as sharding and various forms of parallelism.
  • Strong Python programming skills.
  • Proactive in learning the latest inference optimization techniques.
  • Understanding of computer architecture, ML accelerators, in-memory processing and distributed systems.
  • Strong communication, problem-solving skills and ability to learn and work effectively in a fast-paced and collaborative environment.
  • MS in Computer Science, Machine Learning, Computer Engineering or Electrical Engineering.
  • Background in neural network operators and mathematical operations, including linear algebra and math libraries.
  • Understanding of machine learning compilers.
  • Experience with accuracy convergence and its evaluation methods.
  • Knowledge of torch.compile or TorchDynamo.
  • PhD in Computer Science, Computer Engineering or Machine Learning.
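The sharding and parallelism experience asked for above boils down to a simple invariant: partial computations on weight shards must recombine to the full result. An illustrative NumPy sketch (names here are hypothetical, not a Qualcomm API) of column-wise tensor-parallel sharding of a linear layer:

```python
import numpy as np

# Column-wise (tensor-parallel) sharding of a linear layer: each "device"
# holds one column slice of the weight matrix, computes an independent
# partial matmul, and concatenating the partials reproduces the full output.
rng = np.random.default_rng(1)
x = rng.standard_normal((2, 16))      # activations: batch 2, hidden 16
w = rng.standard_normal((16, 32))     # full weight matrix
shards = np.split(w, 4, axis=1)       # 4 column shards, each (16, 8)
partials = [x @ s for s in shards]    # per-device matmuls, no communication
y_sharded = np.concatenate(partials, axis=1)
y_full = x @ w
print(np.allclose(y_sharded, y_full))  # True
```

Row-wise sharding works analogously but requires an all-reduce sum instead of a concatenation; choosing between the two per layer is the core of the mapping strategies the role describes.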
Benefits:
  • Competitive annual discretionary bonus program.
  • Opportunity for annual RSU grants.
  • Highly competitive benefits package designed to support your success at work, at home, and at play.