Principal Engineer, Agent Infrastructure & Memory Architecture

OKX•San Jose, CA

$313,055 - $450,000

About The Position

We are seeking an AI Infrastructure Engineer to design and build the foundational systems that power next-generation AI agents. This role sits at the intersection of research and production engineering, focusing on scalable long-term memory, knowledge graph infrastructure, and robust context management systems. You will transform cutting-edge research concepts (e.g., Context Graphs, GraphRAG, agent memory architectures) into auditable, maintainable, and production-grade infrastructure that enables reliable, stateful AI systems.

Requirements

PhD or Master's in Computer Science, AI, Machine Learning, Robotics, or a related field, with at least 10 years of industry experience.
Deep expertise in at least one of the following areas: long-term memory, knowledge graphs, or context management.

Nice To Haves

Experience at top-tier AI labs or leading technology companies
Experience in high-frequency trading, fintech, Web3, or crypto platforms is a strong advantage
Experience in startups or research-intensive environments preferred; must be comfortable operating with ambiguity and at speed

Responsibilities

Architect and implement long-term memory infrastructure for AI agents, including auditable, versioned, and rollback-capable middleware that supports persistent, stateful reasoning.
Design and own end-to-end GraphRAG and knowledge graph pipelines, from ingestion and schema design to graph indexing, retrieval optimization, and integration with LLM-based reasoning systems.
Build scalable context management infrastructure, elevating context from prompt-level techniques to structured, maintainable systems such as Context Graphs that enable long-horizon and cross-session reasoning.
Translate advanced agent research into production-grade systems, prototyping emerging architectures and establishing best practices for memory, retrieval, and infrastructure reliability.
Ensure reliability, observability, and governance of agent infrastructure, including traceable memory operations, evaluation frameworks, and safeguards against drift and degradation.
