Forward Deployed AI Engineer

Judgment Labs•San Francisco, CA

Onsite

About The Position

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). While traditional observability focuses on logging exceptions and latency, ABM surfaces behavioral anomalies, such as instruction drift and context retrieval loss, in scaled production environments. Hundreds of teams building autonomous agents rely on Judgment to understand how their systems behave after deployment. Instead of triaging incidents reactively, they cluster patterns across conversations and workflows, correlate regressions with specific interaction types, and pinpoint where reliability breaks down in their usage context.

We’ve raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role: Forward Deployed AI Engineers at Judgment Labs embed our ABM infrastructure directly into customer production systems. You will work inside customer codebases to integrate monitoring and evaluation into real agent workflows, diagnose failures in live environments, and drive deployments to reliable production use.

This role centers on deep technical execution and customer ownership. You will work directly with customer teams to reason about agent behavior, translate high-level goals into concrete ABM deployments, and own outcomes end-to-end across real production environments. The scope, judgment, and autonomy required make this role a training ground for founding or leading a technical company.

Requirements

You identify with at least one of the following:
Experience deploying AI or LLM-based systems into real production environments
Ability to quickly learn new tools and systems, and integrate AI infrastructure into existing customer workflows and codebases
Ability to translate ambiguous customer goals into concrete technical solutions and evaluation strategies
Strong customer-facing skills, including explaining complex technical concepts clearly and building trust with both technical and non-technical stakeholders
Comfort owning deployments end-to-end, from initial integration through successful production adoption
Desire to become a technical founder in the future

Responsibilities

Deploy and embed Judgment Labs’ ABM platform and AI components directly into customer codebases and production AI systems
Work inside customer systems to integrate monitoring, evaluation, and agent-facing components into real workflows
Guide customers through technical decisions around agent monitoring, evaluation strategy, and integrating these capabilities into existing production systems
Own multiple customer engagements end-to-end, ensuring successful integration and sustained adoption of monitoring and evaluation systems within production agent workflows