Agent Dev Velocity builds the tooling and evaluation backbone that helps Notion ship high-quality AI faster and more safely. We build the infrastructure that makes AI evaluations easy to create, cheap to run, and hard to ignore, so engineers across the AI org can iterate with confidence. In this role, you will work at the intersection of developer tooling, distributed systems, and measurement. You will build systems for running and maintaining evals at scale, and you will help create durable benchmarks and datasets that keep us honest about quality over time. You will also help turn evals from one-off checks into a system by enabling reusable eval workspaces and data-driven workflows that surface issues through data mining and continuous measurement.
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed