Data Scientist, New Grad - Model Optimization

quadric.io•Burlingame, CA

Onsite

About The Position

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

This role is a full-time position within the data science team, focused on model optimization for Quadric's custom GPNPU architecture. The successful candidate will be responsible for quantization work for production models, extending the quantization library, and building the numerical accuracy testing and debugging infrastructure essential for the team's daily operations.

Requirements

B.S., M.S., or Ph.D. in CS, EE, Applied Math, or a related field, completed within the last year.
Strong Python skills and fluency with PyTorch (or TensorFlow), NumPy, and data-viz tools (Matplotlib/Plotly).
Solid machine learning foundations; working knowledge of CNNs and Transformers.
Interest in quantization, numerical representation, fixed-point arithmetic, or low-level performance.
Ability to read research papers, evaluate the core ideas, and reproduce key results.

Nice To Haves

Hands-on experience with quantization or model compression, or any of PyTorch FX/PTQ/QAT, TF-Lite, ONNX-Runtime, TVM, or MLIR Quant.
Experience with embedded systems, DSPs, GPUs, or other accelerators.
Published research, open-source contributions, or coursework projects in model optimization, efficient ML, or systems for ML.

Responsibilities

Develop and deploy quantization workflows for vision and language models, taking models from FP32 reference to deployable low-precision implementations that meet accuracy targets.
Investigate per-layer numerical error, identify accuracy regressions, and propose calibration, PTQ, or QAT strategies to recover lost accuracy.
Extend Quadric's quantization library with new operators, observers, and algorithms.
Build and maintain numerical accuracy testing infrastructure and debug tooling for neural networks running on the GPNPU.
Collaborate with graph compiler, kernel, and hardware teams to co-design solutions that exploit the GPNPU's numerical capabilities.

Benefits

Competitive salary and meaningful equity
Medical, dental, and vision plan options starting on day one
401(k) retirement plan
Flexible paid time off (unlimited, non-accrual) to support work-life balance
Company-provided lunches and a stocked kitchen (when working in-office)
Support for commuting, including monthly parking or Caltrain passes
The opportunity to build long-term career relationships in a company that values strong personal connections alongside professional excellence
