We are Judgment. We build infrastructure for Agent Behavior Monitoring (ABM): surfacing silent behavioral issues, understanding how agents behave in production, and turning interaction data into actionable signals. Hundreds of teams building autonomous agents rely on Judgment to understand how their systems are behaving post-deployment. When something breaks, they’re not stuck in reactive incident triage. They can see which behaviors are trending, which configurations caused regressions, and what to actually fix.

We've raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role:

We are looking for Research Engineers to build AI systems that use agent interaction data to help us understand how agents behave, evaluate them at scale, and improve them through learning and feedback. Your research will not live on a whiteboard. You'll work directly with real-world agent data, apply frontier methods in production, and see your work ship immediately into the product. By making agent behavior measurable and debuggable, your systems will support teams deploying agents across finance, legal, operations, and other high-stakes workflows. You will own projects end-to-end, with significant autonomy, and work closely with the team to build self-improving agent systems.
Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed