Nuro · Posted 3 months ago
$235,030 - $352,290/Yr
Full-time
Mountain View, CA
Professional, Scientific, and Technical Services

Nuro is seeking an experienced Technical Lead Manager with deep expertise in quantized training and model compression to join our ML Infrastructure team. In this role, you will drive the adoption of state-of-the-art quantization techniques, enabling training and deployment of highly efficient models that power the Nuro Driver™. You will lead technical strategy, mentor a team of engineers and researchers, and partner closely with research and product groups to ensure our ML infrastructure is optimized for both cutting-edge research and real-time deployment on autonomous vehicles.

Responsibilities:
  • Setting technical direction for the Training Infrastructure team.
  • Staying ahead of emerging research and evaluating new methods.
  • Establishing telemetry to root-cause quality regressions in lower-precision training.
  • Driving the adoption of quantized training methods (e.g., AWQ, AQT, GPTQ) across Nuro's ML infrastructure to accelerate model training and inference (a minimal illustration of the underlying pattern follows this list).
  • Leading the design and implementation of efficiency initiatives for model training, including low-bit quantization, pruning, and knowledge distillation, for both research and production workloads.
  • Collaborating cross-functionally with research, infrastructure, and product teams to balance accuracy, latency, and resource constraints.
  • Mentoring and growing a high-performing team of engineers and researchers.
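
For readers unfamiliar with the quantized-training methods named above, the sketch below shows the pattern that underlies most of them: "fake" quantization of weights in the forward pass paired with a straight-through estimator for gradients. This is a minimal, illustrative PyTorch sketch under stated assumptions, not Nuro's implementation; the class and function names are hypothetical.

    import torch

    class FakeQuantSTE(torch.autograd.Function):
        # Round weights onto an int8 grid in the forward pass; pass
        # gradients through unchanged in the backward pass
        # (the straight-through estimator).

        @staticmethod
        def forward(ctx, w):
            # Per-tensor symmetric scale: map max |w| onto the int8 range.
            scale = w.abs().max().clamp(min=1e-8) / 127.0
            q = torch.clamp(torch.round(w / scale), -128, 127)
            return q * scale  # dequantize so downstream ops stay in float

        @staticmethod
        def backward(ctx, grad_out):
            return grad_out  # STE: treat round() as the identity for gradients

    def quantized_linear(x, w, b):
        # Drop-in replacement for a float linear layer during training.
        return torch.nn.functional.linear(x, FakeQuantSTE.apply(w), b)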

Required Qualifications:
  • 6+ years of professional or research experience in ML infrastructure, distributed training, or ML systems engineering.
  • Hands-on experience with quantization methods, including Activation-Aware Weight Quantization (AWQ), Accurate Quantized Training (AQT), FP8 training, or related methods.
  • Knowledge of broader model compression techniques, such as structured/unstructured pruning and knowledge distillation.
  • Experience building or maintaining quantization libraries (e.g., AQT, bitsandbytes, NVIDIA Transformer Engine, DeepSpeed Compression).
  • Understanding of calibration and scaling strategies for quantized models to minimize accuracy loss (see the sketch after these lists).
  • Advanced degree (Ph.D. or strong M.Sc. with research experience) in Computer Science, Electrical Engineering, or related fields.

Preferred Qualifications:
  • Knowledge of sparse networks and complementary model compression techniques (e.g., AdaRound, BRECQ, structured pruning).
  • Published work or open-source contributions in quantization methods (e.g., AWQ, AQT, GPTQ, SmoothQuant, ZeroQuant).
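
On the calibration point above: the sketch below contrasts choosing per-channel int8 scales by percentile clipping rather than raw abs-max, the basic trade-off behind most calibration schemes (clipping outliers reduces rounding error on the bulk of values at the cost of saturating the tail). Plain NumPy, illustrative only; the function names are hypothetical.

    import numpy as np

    def calibrate_scales(acts, percentile=99.9):
        # acts: [num_samples, num_channels] activations from a calibration set.
        # Clip at a high percentile instead of abs-max so a few outliers
        # do not inflate the scale for an entire channel.
        clip = np.percentile(np.abs(acts), percentile, axis=0)
        return np.maximum(clip, 1e-8) / 127.0  # one scale per channel

    def quantize(acts, scales):
        return np.clip(np.round(acts / scales), -128, 127).astype(np.int8)

    def dequantize(q, scales):
        return q.astype(np.float32) * scales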

Benefits:
  • Annual performance bonus
  • Equity
  • Competitive benefits package