Staff Backend Engineer, ML Inference Systems

Unity Technologies, Mountain View, CA

About The Position

The Vector Gamer AI team at Unity is seeking a Staff Backend Engineer to build and operate the infrastructure that powers large-scale machine learning models for ad ranking and bidding decisions. This role involves designing and operating distributed systems that handle billions of decisions daily, with a strong focus on the performance, reliability, and scalability of inference systems. The engineer will help shape how gaming experiences are discovered and monetized, and how creators are rewarded.

Requirements

  • 5+ years designing, deploying, and maintaining distributed systems at scale
  • Expertise in Golang for building high-performance, low-latency backend infrastructure
  • Hands-on experience with cloud infrastructure on GCP and workload orchestration with Kubernetes
  • Strong grounding in monitoring and observability tooling, including Prometheus and Grafana
  • Experience in ad tech, recommender systems, real-time personalization, or other performance-critical domains
  • Familiarity with microservice architectures, containerization (Docker), and CI/CD best practices
  • Familiarity with machine learning platforms, workflows, and serving infrastructure

Nice To Haves

  • Experience with ML inference servers such as NVIDIA Triton Inference Server
  • Familiarity with auction mechanics or bidding systems in an ad tech context
  • Experience applying AI tools as a strategic advantage in engineering, while following established best practices for code quality and security

Responsibilities

  • Design, develop, and deploy production-grade backend services and distributed systems powering large-scale online model inference at billions of daily requests
  • Drive technical direction of our inference platform, with a focus on low-latency, high-throughput serving infrastructure
  • Partner with ML engineers to ensure online serving infrastructure scales with growing model complexity and inference volumes, without compromising latency or throughput
  • Ensure the reliability, scalability, and efficiency of our systems in production using monitoring and observability tools like Prometheus and Grafana
  • Manage and optimize cloud infrastructure on GCP, orchestrating workloads with Kubernetes across a high-scale production environment
  • Promote and implement best practices for backend service development, testing, deployment, and monitoring (DevOps, SRE)

Benefits

  • Comprehensive health, life, and disability insurance
  • Commute subsidy
  • Employee stock ownership
  • Competitive retirement/pension plans
  • Generous vacation and personal days
  • Support for new parents through leave and family-care programs
  • Office snacks
  • Mental Health and Wellbeing programs and support
  • Employee Resource Groups
  • Global Employee Assistance Program
  • Training and development programs
  • Volunteering and donation matching program