About The Position

We're looking for a backend engineer who loves building the systems that make AI agents reliable, fast, and useful in production. You'll work on our real-time player sessions (interactive AI assistants), our code simulation engine, and the LLM orchestration layer that ties them together. If you enjoy designing distributed systems, tuning prompts and agents with tight feedback loops, and shipping infrastructure that developers actually trust—this role is for you.

Requirements

  • A builder mindset — you like owning projects end-to-end and are thoughtful about data models, performance, and long-term maintainability.
  • Experience transitioning prototypes to production with an understanding of tradeoffs in reliability and scale.
  • 3–6+ years of experience building backend systems—ideally with real-time, stateful, or event-driven workloads.
  • You've worked with LLMs in production: prompt engineering, tool/function calling, streaming responses, managing token budgets, or building evaluation harnesses.
  • You've built indexing or retrieval systems that give agents the right context—embeddings, vector search, or hybrid retrieval pipelines.
  • You're comfortable with distributed systems primitives: message queues (Kafka, Pulsar), distributed locks, pub/sub, and the failure modes that come with them.
  • You have attention to detail and a debugging mindset—you dig into why something failed, trace through logs and state, and don't stop at surface-level fixes.

Nice To Haves

  • You've tuned prompts extensively and have opinions about structured outputs, chain-of-thought, and when to use tools vs. inline reasoning.
  • You've built or contributed to agent frameworks, orchestration systems, or evaluation pipelines.
  • Hands-on with vector databases (pgvector, Pinecone, Turbopuffer) or hybrid search systems—you know when embedding similarity isn't enough and keyword matching saves the day.
  • You've built chunking or indexing pipelines for unstructured data (code, documents, logs) and understand the tradeoffs in chunk size, overlap, and metadata.
  • Experience with Kotlin or JVM-based backend services (our primary stack).
  • Familiarity with MongoDB, Postgres, or Redis for stateful workloads.
  • Hands-on with WebSockets or other real-time communication patterns.
  • Experience with code analysis tooling—AST parsing, Tree-sitter, or building retrieval systems over codebases.

Responsibilities

  • Iterate on prompts and agent behavior: tune system prompts, tool definitions, and response parsing to improve agent reasoning, reduce hallucinations, and handle edge cases gracefully.
  • Design and build agent tools—the interfaces that let LLMs take action. You'll own the contract between what the model can ask for and what the system can deliver, balancing expressiveness with token efficiency.
  • Build retrieval and indexing pipelines that give agents the right context: chunking strategies for code, embedding pipelines, vector indexes, hybrid search (semantic + keyword), and reranking to surface what actually matters within tight token budgets.
  • Own the agent runtime—real-time sessions, simulation orchestration, lifecycle management, streaming, and the concurrency/cancellation mechanics that make agents reliable at scale.
  • Design for correctness under concurrency: distributed locks, pub/sub coordination, idempotent operations, and graceful degradation when things go wrong.
  • Collaborate across the stack: with frontend engineers on the real-time UI, with ML/AI folks on model selection and evaluation, and with customers on what actually matters.
© 2024 Teal Labs, Inc