Applied AI Engineer - Multimodal Transformers

Kodiak, Mountain View, CA
Onsite

About The Position

Kodiak Robotics, Inc. was founded in 2018 and has become a leader in autonomous ground transportation, committed to a safer and more efficient future for all. The company has developed an artificial intelligence (AI) powered technology stack purpose-built for commercial trucking and the public sector, and uses it to deliver freight daily for customers across the southern United States. In 2024, Kodiak became the first known company to publicly announce the delivery of a driverless semi-truck to a customer. Kodiak is also leveraging its commercial self-driving software to develop, test, and deploy autonomous capabilities for the U.S. Department of Defense.

Kodiak's autonomy stack is built on AI that fuses diverse sensor streams into a unified, actionable understanding of the world. We are developing GigaFusionNet, a large-scale multimodal transformer that learns rich, joint representations across camera, LiDAR, and radar through attention-based fusion. We are looking for engineers to push the boundaries of how transformer architectures combine and reason over heterogeneous sensor data.

This role is open to all levels, from those eager to contribute to cutting-edge research to experts driving innovation at scale.

Requirements

  • BS, MS, or PhD in AI, Computer Science, or a related field, or at least 2-3 years of industry experience
  • Experience with transformer architectures, particularly in multimodal or multi-stream settings
  • Familiarity with cross-attention, token fusion, or modality alignment techniques
  • Proficiency in Python and deep learning frameworks like PyTorch or TensorFlow
  • Strong understanding of scalable training for large models, including distributed training and mixed-precision optimization
  • Passion for building AI that reasons over the full breadth of sensory input to operate safely in the real world

Responsibilities

  • Design and develop multimodal transformer architectures that fuse camera, LiDAR, and radar into unified representations
  • Research and implement cross-modal attention mechanisms, token fusion strategies, and efficient multi-stream tokenization
  • Build scalable training pipelines for large-scale multimodal transformers across massive real-world datasets
  • Explore self-supervised and contrastive pretraining objectives that learn transferable multimodal representations
  • Optimize transformer models for real-time inference under latency and compute constraints

Benefits

  • Competitive compensation package including equity and annual bonuses
  • Excellent Medical, Dental, and Vision plans through Kaiser Permanente, Cigna, and MetLife (including a medical plan with infertility benefits)
  • MetLife Legal Services, Identity & Fraud Protection, Hospital Indemnity Insurance, Accident Insurance, & Critical Illness Insurance
  • Flexible PTO, 10 paid holidays, and generous parental leave policies
  • Our office is centrally located in Mountain View, CA
  • Office perks: dog-friendly, free catered lunch, a fully stocked kitchen, and free EV charging
  • Long Term Disability, Short Term Disability, Life Insurance
  • Wellbeing benefits: Headspace through Cigna, Calm through Kaiser, One Medical, Gympass, Spring Health through Cigna, Rula (mental health navigation)
  • Fidelity 401(k)
  • Commuter, FSA, Dependent Care FSA, HSA
  • Various incentive programs (referral bonuses, patent bonuses, etc.)
  • The pay range listed below reflects the base salary in our SF/Silicon Valley location, across several internal levels. Actual starting pay will be based on job-related factors including work location, experience, relevant training, education, skill level, and performance during the interview. Total compensation at Kodiak includes base pay, equity, bonus, and a competitive benefits package.