Mecka AI is building the data infrastructure layer for robotics and embodied AI. We partner with leading AI labs and robotics companies to deliver high-quality, real-world datasets used to train, evaluate, and deploy robotic systems. Our work sits directly between research, data, and real-world execution, where model performance is dictated by data quality.
Our Mission: Robotics will become the largest industry in human history, larger than anything that has come before it. As intelligent machines move into the physical world, they will dramatically expand global GDP, raise the material standard of living for everyone, and ultimately help make humanity a multiplanetary civilization. None of that happens without one thing: enormous amounts of high-quality, real-world data. Mecka AI builds that foundation. We are the data infrastructure layer for robotics and embodied AI, the substrate that teaches machines to perceive, reason, and act in reality. Get this right, and we accelerate the most important technological transition of our time.
Our Culture:
Excellence as the baseline. We hold an extremely high bar and expect the best work of your career. Mediocrity isn't interesting to us.
Highly technical. We reason from first principles, not by analogy. The best argument wins, regardless of title or tenure.
Truth-seeking. We are relentlessly honest with ourselves and each other. We chase reality, measured rather than assumed, and kill our own bad ideas fast.
Maniacal urgency. The work matters and the clock is real. We move fast, ship, measure, and iterate.
Extreme ownership. You own outcomes end-to-end: no hand-offs, no excuses, no waiting for permission.
Hardcore. This is a high-intensity environment for people who want to do the defining work of their lives.
The Role: We are looking for a Research Scientist, Video Understanding, to own Mecka’s video understanding agenda end-to-end: training large-scale video representation and video-language models on our egocentric and stereo corpus, and turning the resulting checkpoints into production signals that the rest of the stack ships on. The role focuses on large-model training, video encoders, video-language models, VLMs/VLAs, and temporal representation learning on real-world robotics data.
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed