About The Position

Each day, we enable billions of players to access their favorite games and experiences. Our Vector Gamer AI team is at the heart of this mission, managing ad ranking and bidding decisions across billions of daily impressions. This team is where large-scale machine learning and real-world impact converge. We are looking for a Staff Backend Engineer to design, build, and operate the distributed systems on which these models rely, with a focus on the performance, reliability, and scalability of inference systems. Join us and help shape how gaming experiences are discovered and monetized, and how creators are rewarded.

Requirements

  • 5+ years designing, deploying, and maintaining distributed systems at scale
  • Expertise in Golang for building high-performance, low-latency backend infrastructure
  • Hands-on experience with cloud infrastructure on GCP and workload orchestration with Kubernetes
  • Strong grounding in monitoring and observability tooling, including Prometheus and Grafana
  • Experience in ad tech, recommender systems, real-time personalization, or other performance-critical domains
  • Familiarity with microservice architectures, containerization (Docker), and CI/CD best practices
  • Familiarity with machine learning platforms, workflows, and serving infrastructure
  • Sufficient knowledge of English for professional verbal and written exchanges

Nice To Haves

  • Experience with ML inference servers like NVIDIA Triton Inference Server
  • Familiarity with auction mechanics or bidding systems in an ad tech context

Responsibilities

  • Design, develop, and deploy production-grade backend services and distributed systems that power large-scale online model inference, serving billions of requests per day
  • Drive technical direction of our inference platform, with a focus on low-latency, high-throughput serving infrastructure
  • Partner with ML engineers to ensure online serving infrastructure scales with growing model complexity and inference volumes, without compromising latency or throughput
  • Ensure the reliability, scalability, and efficiency of our systems in production using monitoring and observability tools like Prometheus and Grafana
  • Manage and optimize cloud infrastructure on GCP, orchestrating workloads with Kubernetes across a high-scale production environment
  • Promote and implement best practices for backend service development, testing, deployment, and monitoring (DevOps, SRE)

Benefits

  • Comprehensive health, life, and disability insurance
  • Commute subsidy
  • Employee stock ownership
  • Competitive retirement/pension plans
  • Generous vacation and personal days
  • Support for new parents through leave and family-care programs
  • Office food and snacks
  • Mental health and wellbeing programs and support
  • Employee Resource Groups
  • Global Employee Assistance Program
  • Training and development programs
  • Volunteering and donation matching program
© 2026 Teal Labs, Inc