Senior Engineer, Inference Control Plane

DigitalOcean · Seattle, WA
$139,000 - $174,000 · Hybrid

About The Position

We are seeking a Senior Engineer to implement and contribute to the design and optimization of our Serverless Inference infrastructure and APIs. In this role, you will tackle the challenges of large-scale AI workloads, focusing on throughput, GPU utilization, and fault tolerance to support the next-generation inference needs of AI-native enterprises.

Requirements

  • 5+ years of experience building and operating multi-tenant platforms or distributed backend systems
  • Strong experience operating high-scale distributed services in production environments
  • Deep understanding of SRE principles, including observability, incident management, reliability engineering, capacity planning, and operational automation
  • 1+ years of hands-on experience with Go / Golang in production systems
  • 1+ years of experience with Kubernetes
  • Strong understanding of cloud-native architectures, microservices, and distributed systems fundamentals
  • Experience debugging performance, scalability, and reliability issues in production systems
  • Observability Proficiency: Experience tracking infrastructure and inference metrics like Time To First Token (TTFT), Time Per Output Token (TPOT), and GPU utilization.

Nice To Haves

  • AI/ML Framework Knowledge: Understanding of modern LLM serving architectures and familiarity with serving engines and stacks such as vLLM, Triton, TensorRT-LLM, or similar technologies
  • Experience with API gateways, traffic routing, or service mesh technologies
  • Experience building systems for inference optimization, rate limiting, routing, or workload orchestration

Responsibilities

  • Design and build scalable, multi-tenant services that power AI inference and intelligent routing workloads.
  • Develop and operate high-scale distributed systems with strong reliability, availability, and performance goals.
  • Strengthen platform resiliency through improved observability, capacity management, automation, and operational tooling.
  • Partner closely with platform, GPU infrastructure, and product engineering teams to deliver production-grade systems and highly available APIs.
  • Raise the engineering bar through strong software design, operational discipline, incident management, and continuous improvement practices.
  • Contribute to architecture decisions around traffic management, service orchestration, reliability, and platform scalability.
  • Participate in on-call rotations and lead efforts to reduce operator pain, improve service health, and prevent recurring incidents.

Benefits

  • Competitive array of benefits
  • Employee Assistance Program
  • Local Employee Meetups
  • Flexible time off policy
  • Reimbursement for relevant conferences, training, and education
  • Access to LinkedIn Learning's 10,000+ courses
  • Bonus in addition to base salary
  • Equity compensation
  • Equity grants upon hire
  • Option to participate in our Employee Stock Purchase Program