About The Position

This internship is part of the Perception Semantics team, which focuses on advancing on-robot AI systems that enable machines to understand and interact with the physical world. You’ll work on cutting-edge problems in vision-language-action (VLA) modeling, world modeling, spatial reasoning, and mapping, contributing to both research and real-world deployment. Projects are open-ended and research-driven, giving you the opportunity to explore new ideas, develop novel approaches, and evaluate them in realistic settings. This role is ideal for Ph.D. students interested in pushing the boundaries of computer vision and embodied AI while seeing their work translate into real-world impact.

Requirements

  • Currently enrolled in a Ph.D. program in Computer Science, Electrical/Computer Engineering, or a related field, with a specialization in CV, NLP, or ML
  • Experience in multi-modal modeling (vision, language, or planning), with a deep understanding of vision-language models, vision foundation models, flow matching, temporal modeling, and reinforcement learning techniques
  • Strong proficiency in PyTorch and modern transformer-based model design
  • Good academic standing
  • Able to commit to a 12-week internship in one of the following summer 2026 cohorts: May 18 - August 7, May 26 - August 14, or June 15 - September 4
  • At least one previous industry internship, co-op, or project completed in a relevant area
  • Ability to relocate to the Bay Area, California (or Boston, Massachusetts) for the duration of the internship
  • Interns at Zoox may not use any proprietary information they work on in their thesis or in any work published with their university, and may not distribute it to anyone outside of Zoox

Nice To Haves

  • Publication record at top-tier AI conferences (CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, etc.)
  • Prior experience building foundation or end-to-end driving models for autonomous driving or robotics, or working deeply on LLM/VLM architectures (e.g., ViT, Flamingo, BEVFormer, RT-2, or GRPO-style policies)
  • Knowledge of RLHF, DPO, or GRPO, and of trajectory prediction for safety

Benefits

  • Medical insurance
  • Housing stipend (relocation assistance will be offered based on eligibility)

What This Job Offers

Career Level

Intern

Education Level

Ph.D. or professional degree

Number of Employees

501-1,000 employees
