Principal Engineer, AI Platform

Epic Games, Cary, NC
$223,199 - $327,358, Onsite

About The Position

Epic Games is seeking a Principal Engineer for its AI Platform team. This role is crucial for building the next layer of infrastructure at Epic, focusing on an enterprise-grade stack of agentic AI systems. These systems will automate engineering workflows, accelerate developer productivity, and enable new forms of collaboration across Epic's diverse teams.

The team is responsible for architecting and building production systems from the ground up, encompassing six interconnected platforms: Geppetto (team AI agents in Slack), EMA (compute infrastructure for agent harness runs), Hodor (AI tool orchestration gateway), Multipass (agent identity and credential vault), Vektor (org-wide memory plane), and Roost (software distribution and plugin marketplace). This foundational work will shape the future of AI at Epic for the next decade.

The role involves owning the technical direction of the agent infrastructure stack, setting architecture, driving alignment, and solving complex distributed systems and security challenges. The Principal Engineer will write production code, design protocols, make critical decisions regarding agent authentication and authorization, and ensure system reliability. This is a hands-on role with significant architectural impact.

Requirements

  • 12+ years of software engineering experience, with at least 4 years at staff or principal scope.
  • Deep expertise in distributed systems: event-driven architectures, durable execution, service mesh, and multi-tenant platform design.
  • Production experience with authentication and authorization infrastructure: OAuth 2.0, OIDC, SPIFFE/SPIRE or equivalent workload identity, token exchange (RFC 8693), and policy engines (OPA, OpenFGA, or comparable).
  • Strong security engineering fundamentals: credential vaulting, secrets management (OpenBao/Vault), audit trail design, and least-privilege access at scale.
  • Fluency in at least one compiled, systems-capable language (Go preferred, Rust or C++ acceptable); comfort reading and writing Go microservices is essential.
  • Track record of owning multi-service platform architecture across a full product lifecycle.
  • Exceptional written communication: design documents and architecture reviews that are clear, precise, and influence without authority.
  • Hands-on experience building LLM-integrated systems: agent orchestration, tool-use frameworks, MCP (Model Context Protocol), or equivalent agent-to-tool middleware.
  • Experience with plugin or extension runtime design: WASM sandboxing, gRPC sidecar patterns, subprocess isolation, or comparable capability security models.
  • Familiarity with knowledge graph systems (Neo4j or comparable), vector databases, and hybrid retrieval (semantic + keyword + graph).
  • Experience operating Kubernetes-based platforms: scheduling, workload identity, sidecar injection, and multi-tenancy isolation.

Responsibilities

  • Own the end-to-end technical architecture across Geppetto, EMA, Hodor, Multipass, Vektor, and Roost, ensuring coherence and well-defined integration seams.
  • Drive architectural decisions for agent identity and workload authorization, translating security requirements into implementable designs.
  • Establish patterns for AI agent authentication, credential reception, tool execution, and auditing, maintaining correctness across the stack.
  • Lead design reviews for new capabilities, evaluate build vs. buy decisions, and identify technical risks.
  • Design and implement Cluster API and provider abstractions for EMA to manage headless agent runs.
  • Evolve Hodor's plugin runtime and gateway security posture.
  • Architect Vektor's knowledge graph, vector search, and memory consolidation pipeline.
  • Define durability, consistency, and isolation requirements for shared event-driven architectures.
  • Lead the Multipass proposal, defining separation of concerns and migrating the credential vault.
  • Hold the standard for credential security across the stack.
  • Work with Epic's security organization to ensure agent-to-service trust models meet enterprise standards.
  • Partner with product, ML, and enterprise platform teams to shape agent capabilities.
  • Mentor senior and staff engineers, conduct technical interviews, and raise the hiring bar.
  • Write design documents that serve as reference architecture.

Benefits

  • Generous benefit plans
  • Discretionary incentive programs