AI Intern – VLA Deployment

XPENG, Santa Clara, CA

About The Position

XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its products, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics. With a strong focus on intelligent mobility, XPENG is dedicated to reshaping the future of transportation through cutting-edge R&D in AI, machine learning, and smart connectivity.

The Mission: Vision-Language-Action (VLA) models and foundation models are becoming increasingly important in autonomous driving, but turning research models into real-time, production-ready systems on vehicle hardware remains a major challenge. We are looking for an entry-level engineer or intern to support the optimization and deployment of multimodal models onto vehicle-grade compute platforms. This role is a strong fit for candidates who are excited about deep learning systems, model deployment, and edge inference for real-world autonomous driving applications.

Requirements

  • BS, MS, or PhD in Computer Science, Electrical Engineering, Robotics, or a related field.
  • Strong programming skills in C++ and/or Python.
  • Familiarity with deep learning frameworks such as PyTorch.
  • Basic understanding of model inference, deployment, or optimization workflows using tools such as ONNX, TensorRT, or similar frameworks.
  • Exposure to model compression or quantization concepts such as INT8, FP16, or related approaches.
  • Interest in computer architecture, performance optimization, and edge or embedded systems.
  • Strong problem-solving skills and the ability to learn quickly in a fast-paced engineering environment.
  • Good communication skills and the ability to collaborate with cross-functional teams.

Nice To Haves

  • Internship, research, or project experience in deep learning model deployment, inference acceleration, or embedded AI.
  • Familiarity with Transformers, multimodal models, or foundation models.
  • Experience with CUDA or GPU programming.
  • Exposure to autonomous driving, robotics, or real-time systems.
  • Contributions to research projects, open-source repositories, or relevant course projects.

Responsibilities

  • Support model quantization and deployment efforts for large-scale multimodal models, including Transformers and vision-language models.
  • Assist with applying model optimization techniques such as post-training quantization, quantization-aware training, pruning, and related compression methods under guidance from senior engineers.
  • Work with research and platform teams to help improve model deployability and understand hardware and runtime constraints.
  • Contribute to deployment tools, test pipelines, and runtime modules in C++ and Python for autonomous driving systems.
  • Help analyze model performance, memory usage, latency, and numerical accuracy across different deployment targets.
  • Participate in debugging and performance tuning across the model, runtime, and system stack.
  • Support validation and testing workflows to ensure stable and reliable deployment in vehicle and simulation environments.

Benefits

  • Infrastructure and computational resources to support your work.
  • Opportunity to work on cutting-edge technologies with top talent in the field.
  • Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
  • Competitive compensation package.
  • Snacks, lunches, dinners, and fun activities.