About The Position

The Physical AI Model Optimization Lead will drive the technical execution of advanced robotic AI model deployment on Qualcomm Dragonwing chipsets. This is a deeply technical, hands‑on role focused on quantization, compression, optimization, mixed‑precision tuning, and hardware‑aware graph transformations using Qualcomm’s internal toolchains. The role provides exposure to industry‑leading robotics‑centric AI models, including next‑generation vision‑language‑action (VLA) architectures and complex multimodal transformers, with responsibility for taking models from research‑grade prototypes to highly optimized, real‑time deployments on heterogeneous compute. Your work will directly impact real robots and the teams building them.

Requirements

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or related field and 6+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience; OR
  • Master's degree in Computer Science, Engineering, Information Systems, or related field and 5+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience; OR
  • PhD in Computer Science, Engineering, Information Systems, or related field and 4+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.

Nice To Haves

  • MS in Computer Science, Electrical Engineering, Robotics, or a related field; PhD a plus.
  • 5+ years of experience in embedded/on‑device AI, model optimization, or performance engineering.
  • Deep technical expertise in:
      • Mixed‑precision quantization (INT8/FP16/FP8)
      • QDQ graph‑based quantization flows
      • PTQ and QAT workflows (an illustrative PTQ sketch follows this list)
      • Model compression techniques (pruning, distillation, low‑rank methods)
  • Strong experience with ONNX, and with PyTorch or TensorFlow model export and graph manipulation.
  • Hands‑on profiling experience on edge devices, custom SoCs, or heterogeneous compute targets.
  • Experience with Qualcomm toolchains: AI Hub Workbench, AIMET, QNN, QGenie, or similar.
  • Background in optimizing transformer‑based perception models, VLMs, and VLA architectures.
  • Understanding of heterogeneous compute system design and operator scheduling.
  • Direct experience supporting customers or partners in model deployment and performance tuning.
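
For context on the quantization items above, here is a minimal, illustrative sketch of a QDQ‑format post‑training quantization pass using the public ONNX Runtime quantization API. The model filenames, input tensor name, shapes, and random calibration batches are placeholders; a real flow targeting Dragonwing hardware would use representative calibration data and Qualcomm tooling such as AIMET where appropriate.

```python
# Sketch only: static PTQ that inserts QuantizeLinear/DequantizeLinear (QDQ) pairs.
# Filenames, the input tensor name, and the calibration source are placeholders.
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)


class ToyCalibrationReader(CalibrationDataReader):
    """Feeds a few calibration batches; replace with real sensor/robot data."""

    def __init__(self, input_name, shape, num_batches=16):
        self._batches = (
            {input_name: np.random.rand(*shape).astype(np.float32)}
            for _ in range(num_batches)
        )

    def get_next(self):
        return next(self._batches, None)


quantize_static(
    model_input="policy_fp32.onnx",          # exported FP32 graph (placeholder)
    model_output="policy_int8_qdq.onnx",     # QDQ graph ready for backend conversion
    calibration_data_reader=ToyCalibrationReader("pixel_values", (1, 3, 224, 224)),
    quant_format=QuantFormat.QDQ,            # emit explicit Q/DQ node pairs
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
    per_channel=True,                        # per-channel weight scales
)
```

The QDQ representation keeps quantization parameters explicit in the graph, which is what makes per‑tensor accuracy debugging and backend‑specific operator mapping tractable downstream.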

Responsibilities

  • Execute end‑to‑end model optimization, including graph rewrites, operator fusion, and hardware‑specific transformations.
  • Apply mixed‑precision quantization and QDQ workflows (PTQ/QAT) for high‑performance deployment.
  • Implement compression techniques such as pruning, distillation, and low‑rank factorization (see the distillation and pruning sketch after this list).
  • Debug accuracy issues using fine‑grained tensor comparisons during quantization and conversion (see the tensor‑comparison sketch after this list).
  • Use Qualcomm tools (AI Hub Workbench, AIMET, QNN, QGenie, profilers) to convert, validate, and optimize models.
  • Map and tune models across heterogeneous compute (DSP/NPU/GPU), including operator placement and kernel selection.
  • Perform detailed performance profiling and analyze memory, tiling, and scheduling behavior.
  • Collaborate with internal teams and external customers to integrate, tune, and validate models on Dragonwing hardware.
  • Set the technical bar for optimization of physical AI models.
  • Own optimization workflows from initial model drop → compression → mixed‑precision/QDQ quantization → conversion → on‑device profiling → final tuned deployment.
  • Work closely with Qualcomm’s existing tools and teams, including AI Hub Workbench, QNN, AIMET, and QGenie, as well as the compiler and robotics AI teams.
  • Serve as the technical authority on quantization correctness, mixed‑precision design, and hardware‑aware optimization for physical AI.
  • Drive improvements in internal tools and processes through hands‑on experiments and data‑driven reporting.
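
Two of the responsibilities above lend themselves to short illustrations. First, for the compression item, a minimal sketch of knowledge distillation plus magnitude pruning in PyTorch; the temperature, loss weighting, sparsity level, and function names are assumptions, not a prescribed recipe.

```python
# Sketch only: soft-label distillation loss plus L1 magnitude pruning in PyTorch.
# Temperature T, weighting alpha, and the 30% sparsity target are illustrative choices.
import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune


def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
    """Blend a softened KL term (teacher -> student) with the ordinary hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so the soft term keeps a comparable gradient magnitude
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard


def prune_linear_layers(model, amount=0.3):
    """Zero out the smallest-magnitude fraction of weights in every Linear layer."""
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")  # bake the sparsity mask into the weight tensor
    return model
```

Compression of this kind is typically applied before the quantization and conversion steps, while training infrastructure is still in the loop to recover accuracy.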
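
Second, for the fine‑grained accuracy‑debugging item, a sketch of one common layer‑wise comparison technique: promote every intermediate tensor of the FP32 and quantized ONNX graphs to a graph output, run both on the same input, and rank shared tensors by signal‑to‑quantization‑noise ratio. Model paths and the input feed are placeholders; the same idea extends to comparing host outputs against on‑device runs.

```python
# Sketch only: localize quantization error by comparing intermediate tensors of an
# FP32 ONNX graph against its QDQ-quantized counterpart on the same input.
import numpy as np
import onnx
import onnxruntime as ort


def expose_all_tensors(path):
    """Return the model with every node output promoted to a graph output."""
    model = onnx.load(path)
    existing = {o.name for o in model.graph.output}
    for node in model.graph.node:
        for name in node.output:
            if name and name not in existing:
                model.graph.output.append(onnx.helper.make_empty_tensor_value_info(name))
                existing.add(name)
    return model


def run_all(model, feed):
    opts = ort.SessionOptions()
    # Disable graph optimizations so intermediate tensor names stay stable.
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    sess = ort.InferenceSession(model.SerializeToString(), opts, providers=["CPUExecutionProvider"])
    names = [o.name for o in sess.get_outputs()]
    return dict(zip(names, sess.run(names, feed)))


feed = {"pixel_values": np.random.rand(1, 3, 224, 224).astype(np.float32)}  # placeholder input
ref = run_all(expose_all_tensors("policy_fp32.onnx"), feed)
qdq = run_all(expose_all_tensors("policy_int8_qdq.onnx"), feed)

# Rank tensors present in both graphs by SQNR; the lowest values point at the
# layers where quantization (or a bad conversion) hurts accuracy most.
for name in sorted(set(ref) & set(qdq)):
    r, q = ref[name].astype(np.float32), qdq[name].astype(np.float32)
    if r.shape != q.shape:
        continue
    sqnr = 10.0 * np.log10((np.sum(r ** 2) + 1e-12) / (np.sum((r - q) ** 2) + 1e-12))
    print(f"{name:60s} SQNR {sqnr:6.1f} dB")
```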