Human Data Architect, Text Annotations

Mecka•New York, NY

1d•$140,000 - $180,000

About The Position

Mecka AI is building the data infrastructure layer for robotics and embodied AI. We partner with leading AI labs and robotics companies to deliver high-quality, real-world datasets used to train, evaluate, and deploy frontier models.

Robotics will become the largest industry in human history, larger than anything that has come before it. As intelligent machines move into the physical world, they will dramatically expand global GDP, raise the material standard of living for everyone, and ultimately help make humanity a multiplanetary civilization. None of that happens without one thing: enormous amounts of high-quality, real-world data. Mecka AI builds that foundation. We are the data infrastructure layer for robotics and embodied AI: the substrate that teaches machines to perceive, reason, and act in reality. Get this right, and we accelerate the most important technological transition of our time.

Our Culture

Excellence as the baseline. We hold an extremely high bar and expect the best work of your career. Mediocrity isn't interesting to us.
Highly technical. We reason from first principles, not by analogy. The best argument wins, regardless of title or tenure.
Truth-seeking. We are relentlessly honest with ourselves and each other. We chase reality (measured, not assumed) and kill our own bad ideas fast.
Maniacal urgency. The work matters and the clock is real. We move fast, ship, measure, and iterate.
Extreme ownership. You own outcomes end-to-end: no hand-offs, no excuses, no waiting for permission.
Hardcore. This is a high-intensity environment for people who want to do the defining work of their lives.

As a Human Data Architect, Video-Text Annotation, you'll design the systems that govern how human video-text annotations are collected, structured, and delivered to train frontier AI models.
This role sits at the intersection of post-training and multimodal methodology (video-language grounding, captioning, temporal/event annotation, preference and evaluation design) and human annotation at scale (task design, rubric design, workflow design, quality gates, and delivery guarantees). You will think carefully about how video is described, segmented, and labeled in text, and you'll build the end-to-end collection and delivery systems that make those choices real. You are not just writing guidelines; you are building the architecture that makes high-signal human video-text data repeatable, measurable, and scalable.

Requirements

You understand what makes a good video-text training example and what a multimodal / post-training pipeline needs to produce useful training signal.
You can reason about dataset design the way others reason about product design: user goals, edge cases, incentives, and measurable outcomes.
You care about methodology and rigor, but you also care about shipping systems that work under real operational constraints.
You can translate fuzzy "we need better video understanding" into concrete tasks, rubrics, and acceptance tests.

Nice To Haves

Task & rubric design: define what we ask annotators to do on video (captions, dense descriptions, temporal segmentation, event/action labeling, Q&A, and grounding), how we score it, and how we prevent ambiguity from leaking into the dataset.
Annotation methodology: design how video is described and labeled in text so the resulting data actually improves video-language model behavior.
Evaluation methodology: design evaluation datasets that discriminate between video-understanding capabilities (and don't collapse into noisy averages).
Data spec → pipeline design: translate a training goal into a concrete video-text data spec (schema, sampling strategy, frame/clip selection, quality thresholds, acceptance tests).
Quality architecture: design the quality gates, audits, and adjudication logic that keep video-text datasets trustworthy at scale.
Human-in-the-loop workflow architecture: define how annotators, tools, and models interact across the video annotation pipeline (where automation/pre-labeling helps, where humans are the ground truth).
Delivery reliability: build the systems so we can deliver consistent training signal on schedule (versioning, change control, and dataset contracts).

Responsibilities

Design and iterate on video-text collection tasks (instructions, examples, edge-case handling) to produce training signal that improves real model behavior.
Build evaluation sets and grading rubrics for video understanding that reveal capability gaps and guide what to collect next.
Work with engineering to implement the schemas, annotation tooling, and workflow logic that operationalize your methodology.
Define acceptance tests for datasets (coverage, difficulty distribution, temporal accuracy, disagreement rates, failure surfaces).
Run retrospectives on model outcomes and trace failures back to upstream annotation design choices; then fix the architecture.
