Infra Engineer - API

General Intuition & Medal | New York, NY
$250,000 - $400,000 | Onsite

About The Position

General Intuition is a frontier research lab focused on building foundation models for environments requiring deep spatial and temporal reasoning. The company has raised $133M from General Catalyst and Khosla to develop next-generation AI agents, world models, and video understanding models.

This role is for an Infra Engineer to own the company's API, transforming research models into a production-ready API that is low-latency, highly available, reliable, and scalable. The engineer will work directly with the founding team and have end-to-end ownership of the API, including client libraries, frame reception and action streaming, request routing to GPUs, session management, Kubernetes cluster deployment, and GPU fleet scaling. This is a generalist infrastructure role requiring expertise in both API development and GPU infrastructure.

Requirements

  • A track record of personally scaling a high-traffic, low-latency API in production.
  • Deep Kubernetes experience, including multi-region deployments.
  • Comfort with SLOs and capacity planning.
  • Strong ownership instinct, with experience taking systems end-to-end.

Nice To Haves

  • Experience deploying streaming video or audio inference models.
  • Experience with low-latency game streaming or video streaming infra.
  • Experience scaling GPU fleets across providers (GCP, CoreWeave, Lambda, etc.).
  • Experience with frontier model inference (LLMs, world models, multimodal).
  • Experience with on-device / edge inference (ExecuTorch, Core ML, etc.).

Responsibilities

  • Own the video streaming protocol, including frame reception from clients and efficient routing to servers.
  • Own the runtime layer of the API, encompassing stateful request routing, GPU session lifecycle, and inference orchestration.
  • Scale the Kubernetes footprint across multiple regions and lead new regional deployments.
  • Own the GPU hosting strategy, scaling from dozens to thousands of GPUs while managing costs and latency.
  • Drive improvements in latency and throughput for inference.
  • Partner with product engineering on developer-facing reliability, observability, metering, and billing-grade uptime.

Benefits

  • Competitive salary and meaningful equity
  • Comprehensive medical, dental, and vision coverage
  • 401(k)
  • Wellhub membership for fitness and wellness
  • Mental health support through Spring Health and Headspace
  • Fertility and maternal health benefits
  • Paid parental leave
  • Generous PTO, 11 paid company holidays, and paid sick time
  • Daily meals and commuter benefits at our NYC HQ
  • Learning and development stipend