About The Position

You will help build core systems that allow customers to deploy and scale ML workloads across cloud VPCs and on-premise clusters. Focusing on orchestration and operational tooling, you will write production-grade Python services that abstract hardware complexity away from AI developers. This is a hands-on infrastructure role requiring strong systems fundamentals.

Location: San Francisco, USA

Why this role is remarkable:

  • Work at the intersection of infrastructure and AI, building the foundational layers that power modern model inference.
  • Join a well-funded team backed by top-tier VCs where you can influence the architecture of a scaling platform.
  • Gain deep experience in distributed computing, GPU orchestration, and high-performance ML frameworks at an early career stage.

Requirements

  • Strong foundation in Python with the ability to write clean, tested code and work effectively with APIs.
  • Familiarity with systems fundamentals including Linux, networking (TCP/IP), and concurrency models like processes and threads.
  • Academic or project exposure to distributed computing, cloud VPCs, ML inference frameworks, or containerization.

Responsibilities

  • Build and improve automated deployment and upgrade flows for ML infrastructure in AWS, GCP, and Azure VPCs.
  • Integrate open-source inference engines like vLLM and SGLang into coordinated platform services and CLI tools (a minimal illustration follows this list).
  • Implement observability systems, including structured logging and metrics, to monitor distributed inference performance and reliability (see the second sketch below).
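To give a flavor of the inference-engine work, here is a minimal sketch of driving vLLM from Python using its offline API. The model name and sampling settings are illustrative placeholders, not a prescribed stack or configuration.

```python
# Minimal sketch: calling an open-source inference engine (vLLM) from Python.
# The model checkpoint and sampling values below are illustrative only.
from vllm import LLM, SamplingParams

# Load a small example model; any HuggingFace-style checkpoint would work.
llm = LLM(model="facebook/opt-125m")

# Hypothetical sampling defaults, chosen purely for demonstration.
params = SamplingParams(temperature=0.8, max_tokens=64)

prompts = ["Explain GPU orchestration in one sentence."]
for output in llm.generate(prompts, params):
    print(output.prompt, "->", output.outputs[0].text)
```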
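And for the observability work, a second sketch shows structured logging around an inference call using only the Python standard library. The logger name, field names, and backend_call function are hypothetical stand-ins for a real service.

```python
# Minimal sketch: structured (JSON) logging around an inference request,
# standard library only. backend_call is a hypothetical placeholder.
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("inference")

def backend_call(prompt: str) -> str:
    # Placeholder for a real inference-engine request.
    return prompt.upper()

def timed_generate(prompt: str) -> str:
    start = time.perf_counter()
    result = backend_call(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    # Emit one JSON record per request so a metrics pipeline can parse it.
    log.info(json.dumps({
        "event": "inference_request",
        "latency_ms": round(latency_ms, 2),
        "prompt_chars": len(prompt),
    }))
    return result

timed_generate("hello")
```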