The problem we saw
AI inference today is slow, expensive, and stateless. Send a query, wait, get a response, reset. That's fine for batch, but AI is becoming interactive, and interactive means inference has to respond instantly, hold context across a session, and be steerable in real time. Nobody had built infrastructure that does all three at once. The bottleneck isn't the models. It's the runtime underneath them.

What we're building to fix it
uRun (Universal Runtime) is the layer that makes real-time, stateful inference possible. Our platform lets AI respond instantly, hold context across a session, and be directed as it runs. We prove it through the hardest problem in the stack: real-time AI video generation. Not pre-rendered clips. Not queued jobs. Live, steerable, continuous video that responds as you speak. Solve that, and the rest of the inference stack follows. That's what we've done. We're an infrastructure company; we build the layer that model labs, builders, and research teams ship on top of.
Job Type: Full-time
Career Level: Entry Level
Education Level: No Education Listed