Data Scientist, New Grad - Model Optimization

Quadric, Inc. • Burlingame, CA

49d • $120,000 - $160,000 • Onsite

About The Position

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are designed to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive and autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can accelerate only a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

You will join the data science team in a full-time role focused on model optimization for Quadric's custom GPNPU architecture. You will own quantization work for production models, extend our quantization library, and build the numerical accuracy testing and debugging infrastructure that the team relies on day to day.

Requirements

B.S., M.S., or Ph.D. in CS, EE, Applied Math, or a related field, completed within the last year.
Strong Python skills and fluency with PyTorch (or TensorFlow), NumPy, and data-viz tools (Matplotlib/Plotly).
Solid machine learning foundations; working knowledge of CNNs and Transformers.
Interest in quantization, numerical representation, fixed-point arithmetic, or low-level performance.
Ability to read research papers, evaluate the core ideas, and reproduce key results.

Nice To Haves

Hands-on experience with quantization or model compression, or with any of PyTorch FX/PTQ/QAT, TensorFlow Lite, ONNX Runtime, TVM, or MLIR Quant.
Experience with embedded systems, DSPs, GPUs, or other accelerators.
Published research, open-source contributions, or coursework projects in model optimization, efficient ML, or systems for ML.

Responsibilities

Develop and deploy quantization workflows for vision and language models, taking models from FP32 reference to deployable low-precision implementations that meet accuracy targets.
Investigate per-layer numerical error, identify accuracy regressions, and propose calibration, PTQ, or QAT strategies to recover lost accuracy.
Extend Quadric's quantization library with new operators, observers, and algorithms.
Build and maintain numerical accuracy testing infrastructure and debug tooling for neural networks running on the GPNPU.
Collaborate with graph compiler, kernel, and hardware teams to co-design solutions that exploit the GPNPU's numerical capabilities.
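
For candidates curious what investigating per-layer numerical error can look like, here is a minimal, illustrative sketch. It is not Quadric's library or workflow. It assumes symmetric per-tensor quantization with a max-abs scale, uses made-up layer names and activation values, and reports a signal-to-quantization-noise ratio (SQNR) per layer. All function names are hypothetical.

```python
import math

def quantize_symmetric(values, num_bits=8):
    """Symmetric per-tensor quantization (assumed scheme): returns (codes, scale)."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to [-qmax, qmax].
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to real values."""
    return [c * scale for c in codes]

def sqnr_db(reference, approx):
    """Signal-to-quantization-noise ratio in dB; higher means less error."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    if noise == 0:
        return math.inf
    return 10.0 * math.log10(signal / noise)

def per_layer_report(layers, num_bits=8):
    """Quantize each layer's reference tensor and return {layer name: SQNR in dB}."""
    report = {}
    for name, reference in layers.items():
        codes, scale = quantize_symmetric(reference, num_bits)
        report[name] = sqnr_db(reference, dequantize(codes, scale))
    return report

# Hypothetical FP32 reference activations, flattened per layer.
layers = {
    "conv1": [0.5, -1.0, 0.25, 0.75],
    "fc1": [2.0, -0.1, 0.3, 1.2, -1.7],
}

if __name__ == "__main__":
    for bits in (8, 4):
        for name, db in per_layer_report(layers, bits).items():
            print(f"{bits}-bit {name}: SQNR = {db:.1f} dB")
```

Running the script prints one SQNR figure per layer at 8 and 4 bits. A layer whose SQNR drops sharply at the lower precision is a natural place to start looking at calibration, PTQ, or QAT options.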