About The Position

We are looking for a research scientist to push the frontier of multimodal representation learning for autonomous driving. You will focus on method-level innovation, including designing new learning paradigms, architectures, and training strategies for large-scale vision and video models. This role is ideal for candidates who care deeply about how models learn, not just how to train them.

Requirements

  • Strong academic background in Computer Science, AI, or related fields
  • Solid understanding of modern representation learning frameworks such as CLIP, DINOv2, and Masked Autoencoders (MAE)
  • Proven experience in method-level innovation, such as designing new loss functions, modifying architectures, or creating new training strategies

Nice To Haves

  • Publications in top-tier venues such as CVPR, ICCV, NeurIPS, or ICLR
  • Experience training models at scale with frameworks such as DeepSpeed or Megatron-LM
  • Strong intuition for why models behave the way they do, not just how to train them

Responsibilities

  • Design and develop novel representation learning methods, including:
      ◦ Self-supervised learning
      ◦ Contrastive learning
      ◦ Masked modeling
      ◦ Teacher-student / distillation frameworks
  • Innovate on model architectures, including:
      ◦ Vision Transformers (ViT)
      ◦ Video models (e.g., Video Swin)
      ◦ Multimodal fusion architectures
  • Develop new training paradigms and loss functions to improve representation quality
  • Drive measurable improvements in:
      ◦ Representation quality (e.g., linear-probe and retrieval metrics)
      ◦ Downstream performance (e.g., perception and end-to-end driving systems)
  • Collaborate with engineering and data teams to integrate new methods into production systems
© 2026 Teal Labs, Inc