Research Scientist, Robotics VLAs Post-Training and Adaptation

Toyota Research Institute•Los Altos, CA

50d•$176,000 - $264,000

About The Position

At Toyota Research Institute (TRI), we're on a mission to improve the quality of human life. We're developing new tools and capabilities to amplify the human experience. To lead this transformative shift in mobility, we've built a world-class team advancing the state of the art in AI, robotics, driving, and material sciences.

Overview

We are seeking a creative and technically strong researcher to advance post-training methods for Vision-Language-Action (VLA) models in robotics. This role focuses on improving model alignment, robustness, and adaptability in real-world robotic settings through advanced post-training and continual learning techniques. You will develop algorithms and frameworks that enable persistent learning and optimize data efficiency in embodied systems.

Requirements

Ph.D. or M.S. in Robotics, Machine Learning, Computer Vision, or related field, or equivalent applied research experience.
Expertise in reinforcement learning, imitation learning, and multimodal representation learning.
Strong proficiency with deep learning frameworks (e.g., PyTorch, JAX) and robotics simulation environments (e.g., MuJoCo, IsaacSim, PyBullet, Habitat).
Experience with sim-to-real transfer, policy adaptation, or continual learning in embodied settings.
Strong coding and experimental skills with an emphasis on reproducibility and evaluation at scale.
Prior robotics experience with real-world hardware and ML-based robot deployments.

Nice To Haves

Prior work on VLA models (e.g., PI0/PI0.5, OpenVLA, custom models).
Experience building or managing robot data collection infrastructure.
Familiarity with real-world robot platforms (e.g., Franka, humanoids, or mobile manipulators).
Publications in top-tier conferences (CoRL, RSS, NeurIPS, ICLR, ICML, ICRA, CVPR).

Responsibilities

Post-training and adaptation: Design and implement post-training pipelines for VLA models using techniques such as reinforcement learning (RL), reinforcement learning from human or preference feedback (RLHF/RLAIF), and in-context learning. Experience with real-world RL is a plus!
Sim-to-real transfer: Develop methods to enhance real-world transferability of policies trained in simulation.
Reset-free and continual learning: Explore and implement reset-free and autonomous data collection strategies that enable continual skill improvement without manual resets or supervision. Learn continually in settings with large-scale, long-term data collection.
Structured exploration: Investigate exploration algorithms that balance safety, curiosity, and efficiency for data gathering in both simulation and real-world robotic systems.
Data curation and feedback loops: Lead the design of data collection and curation pipelines for exploration and post-training, using multimodal data from demonstrations, teleoperation, and on-policy rollouts.
Collaborate across teams in perception, control, and ML infrastructure to deploy scalable and reproducible research systems.
Publish research outcomes and contribute to the open robotics and embodied AI communities.

Benefits

TRI offers a generous benefits package including medical, dental, and vision insurance, 401(k) eligibility, paid time off benefits (including vacation, sick time, and parental leave), and an annual cash bonus structure.
