Senior Product Manager, Software & Developer Platform

quadric.io•Burlingame, CA

Onsite

About The Position

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Quadric is seeking a Senior Principal Product Manager to own the software roadmap for the Chimera Graph Compiler (CGC), the developer-facing platform customers live in from pre-silicon through production. This role drives the monthly SDK release train, sets pattern coverage and quantization strategy, and works directly with anchor customers to convert model gaps into engineering roadmap items. You'll partner with the CPO on strategy and with SW engineering on execution. This role is based in Burlingame (on-site), with quarterly travel to Japan, the U.S. East Coast, and customer SW teams worldwide.

Requirements

Shipped product on at least one of: NPU or AI accelerator IP/silicon stack; graph or ML compiler (TVM, MLIR, XLA, or proprietary); developer-facing AI inference runtime or agent framework. "Adjacent" does not count.
Ready to hold a conversation, with no prep, on: agentic workflows and LLM serving, KV cache optimization, quantization schemes (AWQ, GPTQ, SmoothQuant, QAT vs. PTQ), datatypes (INT4, FP8, BF16, OCP MX), and inference platforms (vLLM, llama.cpp, TensorRT-LLM, ExecuTorch, ORT).
You shipped a developer-facing AI or compute product — SDK, runtime, compiler, or inference service — with real users and a release cadence you owned.
You use agentic AI tools daily (Claude Code, Cursor, or equivalent) to produce work. Having read about agentic AI without integrating it is not sufficient.
When engineering wants to build the elegant thing and the customer needs the workable thing, you take the workable thing every time.
5+ years in PM, with 3+ years on a developer-facing AI/ML or compute platform
Owned a release cadence — picked what ships, what slips, and defended the call
Experience with in-person technical customer reviews
Bay Area resident or willing to relocate to Burlingame

Nice To Haves

ML background: graduate degree, published work, trained model, or OSS contribution
Automotive Tier 1 engagement; ISO 26262 awareness
Prior product work on a competing NPU/GPU/AI accelerator stack (MetaWare, Arm ML, Ceva, TensorRT, Hailo, Tenstorrent, etc.)
OSS contributions to vLLM, llama.cpp, TVM, MLIR, ONNX Runtime, or ExecuTorch

Responsibilities

Own the monthly SDK release and the quarterly major release: contents, release sync, go/no-go, release notes, and customer communications.
Decide which graph patterns CGC compiles next — attention variants, quantization schemes, normalization patterns — and sequence them against customer model requirements each quarter.
Lead with the market story and push it through every layer: demo, model zoo, pattern coverage, compiler work. Own what we publish and when.
Present in technical reviews with anchor customers. Convert model gap lists into engineering-ready roadmap entries.
Own the roadmap for INT4 (W4A8/W4A16), FP8, OCP MX, and KV cache compression. Coordinate with HW PM on MAC capability and with customer SW teams on model format decisions ahead of silicon tape-out.
Define the integration strategy for GGML/llama.cpp, vLLM, ONNX Runtime, ExecuTorch, and HF Optimum — deep partnership vs. thin reference.
Maintain a set of customer-confidence models (LLM chat, BEVFormer, VLA, ADAS perception) that serve as forcing functions for compiler completeness.
Take the roadmap to anchor customers, prospects, and the field. Brief the PMM monthly on what shipped and how to position it.
Track Synopsys MetaWare, Arm KleidiAI, Ceva NeuPro Studio, and NVIDIA TensorRT-LLM. Brief the executive and sales teams quarterly.
Coordinate with the safety lead on ISO 26262 traceability and qualification artifacts.