Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). While traditional observability focuses on logging exceptions and latency, our ABM surfaces behavioral anomalies such as instruction drift and context retrieval loss in scaled production environments. Hundreds of teams building autonomous agents rely on Judgment to understand how their systems behave after deployment. Instead of triaging incidents reactively, they cluster patterns across conversations and workflows, correlate regressions with specific interaction types, and pinpoint where reliability breaks down in their usage context.

We’ve raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role:
Forward Deployed AI Engineers at Judgment Labs embed our ABM infrastructure directly into customer production systems. You will work inside customer codebases to integrate monitoring and evaluation into real agent workflows, diagnose failures in live environments, and drive deployments to reliable production use.

This role centers on deep technical execution and customer ownership. You will work directly with customer teams to reason about agent behavior, translate high-level goals into concrete ABM deployments, and own outcomes end-to-end across real production environments. The scope, judgment, and autonomy this role requires make it a training ground for founding or leading a technical company.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed