AI Inference Engineer Intern - Model Pruning

quadric.io
Burlingame, CA
Onsite

About The Position

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today, which can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Requirements

  • MS student in CS or a related field
  • Proficiency in Python
  • Experience with model pruning and training in PyTorch
  • Experience with quantization and vision model accuracy metrics

Responsibilities

  • Model pruning: Prune models to speed up inference, then retrain them to maintain accuracy.

Benefits

  • Hands-on experience working alongside industry experts in AI and semiconductor technology
  • Access to mentorship
  • Meaningful project ownership from day one