Research Platform Engineer

World Labs
San Francisco, CA

About The Position

We are looking for a Research Platform Engineer to build the mission-critical systems that research at World Labs depends on. This is a senior role whose work lives at the seam between research and platform. Your ownership is technical and concrete: the systems you design, the performance you unlock, the distributed infrastructure you operate, and the outcomes those systems produce. You will sit in either our research or platform org, depending on where your strengths fit best. We want systems depth paired with a product engineer's instincts for research iteration speed and developer experience. This is a hands-on role: you will design, build, and ship code directly.

Requirements

  • 5+ years of experience building and shipping production systems, with demonstrated ownership of infrastructure used by other engineers or researchers.
  • Strong depth in at least one of: ML infrastructure, distributed training or inference systems, data systems, or research tooling.
  • Strong distributed systems foundations — concurrency, consistency tradeoffs, replication, failure modes, and scaling behavior under real workloads.
  • Strong performance optimization skills across at least one of: training throughput, inference latency, GPU utilization, or system-level scaling.
  • Strong proficiency in Python, with the ability to work in C++, CUDA, Rust, or Go as the work demands.
  • Experience working directly with ML researchers or research engineers, including productionizing research code.
  • A product engineer's instincts for iteration speed and developer experience — applied to the systems researchers use every day.
  • Strong judgment about what to build and what to leave alone, particularly when research requirements are ambiguous or shifting.
  • High-ownership mindset; you measure yourself by outcomes shipped, not by tickets closed.

Nice To Haves

  • Experience at an AI lab or ML-native company, working on systems used directly by researchers.
  • Experience with large-scale training or inference systems — GPU orchestration, distributed training, or high-throughput inference.
  • Experience with low-level performance optimization — profiling, kernel-level tuning, memory and bandwidth optimization, distributed communication primitives.
  • Experience building developer experience tooling for research — notebooks, experiment tracking, reproducibility infrastructure.
  • Experience in early-stage or high-growth environments where scope and priorities shift frequently.

Responsibilities

  • Design and build training infrastructure, data infrastructure, and data processing and sourcing pipelines.
  • Productionize models for serving and own parts of the inference stack.
  • Build internal tools and services that increase engineering and research velocity.
  • Debug hard problems across training, inference, and performance — including distributed systems issues under real research workloads.
  • Optimize throughput, latency, GPU utilization, and system-level scaling.
  • Improve research iteration speed and developer experience — cut debugging time, raise reliability, and make it faster for researchers to ship experiments.
  • Raise the engineering bar across research and platform code alike.

Benefits

  • Equity awards
  • Annual performance bonus