Our client is the monitoring platform built specifically for AI agents. Engineering teams at some of the fastest-growing AI companies use the product to get alerts when their AI agents fail silently, so engineers can dig into the conversations or traces, understand the root cause, and fix it fast. The company is YC-backed (W24) and funded by top-tier investors, including a major Tier 1 venture firm, leading AI-focused funds, and angel investors who founded several of the most well-known dev tools and AI companies.

AI agents fail constantly, in ways both hilarious and terrifying. Regular software throws exceptions. AI agents fail silently, leaving engineers with almost no visibility into how their agents actually perform. The status quo is sifting through millions of logs and debugging flaky evals that don't match real-world results. Evals are like unit tests: they confirm a model got specific test cases right. But in the real world, agents call thousands of tools, run for hours, and encounter millions of unpredictable actions.

That's where this product comes in. It learns the unique shape of each AI agent's issues, starting from presets like Laziness, Forgetting, or Task Failure and automatically tuning itself to each agent to discover the unknown unknowns. With one click, AI engineers can track issues or topics across 100% of their production data: frequency over time, how many users are affected, relevant properties, and more. To process hundreds of millions of events, the team gradually trains small custom models, private to each customer, that learn how each customer's product is used.

As part of the early team, you'll play a fundamental role in shaping the company, from strategy and product decisions to scaling the team to shaping the future of AI agents.
Job Type: Full-time
Career Level: Entry Level
Education Level: No Education Listed