Research Scientist, Video Understanding

Mecka•New York, NY

3d•$200,000 - $250,000

About The Position

Mecka AI is seeking a Research Scientist, Video Understanding, to lead its video understanding initiatives. The role involves training large-scale video representation and video-language models on Mecka's unique egocentric and stereo dataset, then turning those models into production-ready signals for downstream systems. The position focuses on large model training, video encoders, video-language models (VLMs/VLAs), and temporal representation learning using real-world robotics data.

Requirements

Deep experience training large models in PyTorch (or equivalent), including multi-GPU or distributed training.
Strong understanding of modern video representation learning and/or multimodal modeling.
Ability to run rigorous experiments and communicate results clearly.
Strong software engineering discipline: you write research code that can be shipped.

Nice To Haves

Experience with video VLMs / VLA-adjacent systems (VideoCLIP, InstructBLIP-Video, LLaVA-Video-class).
Experience with egocentric / embodied datasets (Ego4D, EgoExo4D, EPIC-Kitchens, Something-Something).

Responsibilities

Own model architecture and training strategy across Mecka’s task families (manipulation, locomotion, daily activity, long-horizon behavior).
Run self-supervised and multimodal pretraining (VideoMAE / V-JEPA / VideoPrism / InternVideo-class) with rigorous evals and clean ablations.
Train and fine-tune video encoders and video-language models (temporal transformers, joint-embedding models, contrastive objectives, masked modeling, instruction/video alignment).
Incorporate useful priors (pose, depth, camera motion, optical flow) when they improve representation quality.
Turn checkpoints into usable artifacts: embeddings and model outputs that downstream systems can reliably consume (retrieval, labeling, QA, analytics).
Build a disciplined training + eval workflow with regression tracking and reproducible runs.
