Principal Engineer, Agent Infrastructure & Memory Architecture

OKX, San Jose, CA
$313,055 - $450,000

About The Position

We are seeking an AI Infrastructure Engineer to design and build the foundational systems that power next-generation AI agents. This role sits at the intersection of research and production engineering, focusing on scalable long-term memory, knowledge graph infrastructure, and robust context management systems. You will transform cutting-edge research concepts (e.g., Context Graphs, GraphRAG, agent memory architectures) into auditable, maintainable, and production-grade infrastructure that enables reliable, stateful AI systems.

Requirements

  • PhD or Master's in Computer Science, AI, Machine Learning, Robotics, or a related field, with at least 10 years of industry experience.
  • Deep expertise in at least one of the following areas: long-term memory, knowledge graphs, or context management.

Nice To Haves

  • Experience at top-tier AI labs or tech industry leaders
  • Experience in high-frequency trading, fintech, Web3, or crypto platforms is a strong advantage
  • Background in startups or research-intensive environments preferred; must be comfortable operating with ambiguity and at speed

Responsibilities

  • Architect and implement long-term memory infrastructure for AI agents, including auditable, versioned, and rollback-capable middleware that supports persistent, stateful reasoning.
  • Design and own end-to-end GraphRAG and knowledge graph pipelines, from ingestion and schema design to graph indexing, retrieval optimization, and integration with LLM-based reasoning systems.
  • Build scalable context management infrastructure, elevating context from prompt-level techniques to structured, maintainable systems such as Context Graphs that enable long-horizon and cross-session reasoning.
  • Translate advanced agent research into production-grade systems, prototyping emerging architectures and establishing best practices for memory, retrieval, and infrastructure reliability.
  • Ensure reliability, observability, and governance of agent infrastructure, including traceable memory operations, evaluation frameworks, and safeguards against drift and degradation.