Research Scientist, Video Understanding

Mecka, New York, NY
$200,000 - $250,000

About The Position

Mecka AI is seeking a Research Scientist, Video Understanding to lead its video understanding initiatives. The role involves training large-scale video representation and video-language models on Mecka's unique egocentric and stereo dataset, with the goal of turning those models into production-ready signals for downstream systems. The position focuses on large model training, video encoders, video-language models (VLMs/VLAs), and temporal representation learning using real-world robotics data.

Requirements

  • Deep experience training large models in PyTorch (or equivalent), including multi-GPU or distributed training.
  • Strong understanding of modern video representation learning and/or multimodal modeling.
  • Ability to run rigorous experiments and communicate results clearly.
  • Strong software engineering discipline: you write research code that can be shipped.

Nice To Haves

  • Experience with video VLMs / VLA-adjacent systems (VideoCLIP, InstructBLIP-Video, LLaVA-Video-class).
  • Experience with egocentric / embodied datasets (Ego4D, EgoExo4D, EPIC-Kitchens, Something-Something).

Responsibilities

  • Own model architecture and training strategy across Mecka’s task families (manipulation, locomotion, daily activity, long-horizon behavior).
  • Run self-supervised and multimodal pretraining (VideoMAE / V-JEPA / VideoPrism / InternVideo-class) with rigorous evals and clean ablations.
  • Train and fine-tune video encoders and video-language models (temporal transformers, joint-embedding models, contrastive objectives, masked modeling, instruction/video alignment).
  • Incorporate useful priors (pose, depth, camera motion, optical flow) when they improve representation quality.
  • Turn checkpoints into usable artifacts: embeddings and model outputs that downstream systems can reliably consume (retrieval, labeling, QA, analytics).
  • Build a disciplined training + eval workflow with regression tracking and reproducible runs.