We want to build agents that autonomously validate code changes. Today that looks like AI that reviews pull requests in GitHub, catching bugs and enforcing standards. We’re now reviewing close to 1B lines of code a month for over 2,500 companies.

Problems we’re excited about
Coding standards can be idiosyncratic and are often poorly documented; can we build agents that learn them through osmosis, like a new hire might?
Can we identify, for each customer, which types of PR feedback they do and don’t care about, perhaps using sample-efficient RL, to increase the signal-to-noise ratio?
Some bugs are best caught by running the code, potentially against discerning AI-generated E2E tests. Can we autonomously deploy feature branches and use agents in parallel to try to break the application and detect bugs?

Trajectory
Went from 0 to XM in under 12 months, growing more than 25% MoM
2,500+ customers
Raised 30M+ led by Benchmark, along with continued support from YC, Paul Graham, Initialized, SV Angel, etc.

Team
We have assembled a small, talent-dense team whose members have scaled critical functions at companies like Stripe, Google, Figma, LinkedIn, etc.
Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed