Principal Agentic Engineer

ZeroFox

About The Position

ZeroFox protects organizations from external threats across the public attack surface, and we're building agentic AI into the core of how that work gets done. You'll define and build the production agentic systems that power ZeroFox's product: the architectures, tooling, and practices that let agents manage state across steps, handle failures without losing context, and know when to escalate to a human. This is deep agent systems work. You'll be solving the hardest problems in multi-step orchestration, reliability, and evaluation for non-deterministic systems operating on adversarial data.

Requirements

7+ years of software engineering experience building production systems, with hands-on experience designing and deploying LLM-based agents
Deep knowledge of agent reliability patterns: state management, error boundaries, escalation logic, context management, tool-calling failure modes
Experience with agent orchestration frameworks and an understanding of the tradeoffs between adopting existing frameworks and building custom ones
Strong backend engineering fundamentals: testing, monitoring, deployment, debugging, performance optimization
Hands-on experience with retrieval-augmented systems and an understanding of how retrieval quality affects agent behavior
Experience building systems that handle adversarial or noisy input (cybersecurity, fraud detection, content moderation, or similar domains)
Familiarity with cloud-based AI deployment, including observability, reliability, and cost considerations

Nice To Haves

Experience defining agent governance practices (versioning, behavioral telemetry, rollback) in a production environment
Background in cybersecurity or an adjacent domain with high-noise, adversarial data
Experience with tool-use architectures and integration patterns for connecting agents to external systems
Track record of raising engineering capability across a team through mentoring, pairing, or design review

Responsibilities

Design and implement production agent architectures: state management, error handling, retry logic, graceful degradation, human-in-the-loop escalation
Build evaluation and testing frameworks for non-deterministic agent workflows: offline tests, synthetic data generation, regression checks, and post-deploy monitoring
Implement orchestration patterns: multi-agent coordination, tool-calling chains, memory management, context window optimization
Define deployment and governance practices: agent versioning, rollback, behavioral telemetry, anomaly detection
Instrument agents with tracing, logging, and observability that make production behavior debuggable
Establish architectural standards and best practices for agentic development through design reviews, pairing, and mentorship

Benefits

Competitive compensation
Community-driven culture with employee events
Regular catered lunches for in-office work; snacks and drinks available daily
Generous time off
Comprehensive health benefits & 401(k) plan
Fun, modern workspace
Respectful and nourishing work environment, where every opinion is heard and everyone is encouraged to be an active part of the organizational culture
