AfterQuery builds the training data and evaluation infrastructure that frontier AI labs use to make their models better. We work with the world's leading labs to design high-signal datasets and run rigorous evaluations that go beyond static benchmarks. We are a small, early team (post-Series A) where individual contributors have a direct impact on how the next generation of models learns and improves.

The Role

You'll design the datasets and evaluation frameworks that shape how frontier models are trained and measured. Working directly with research teams at top AI labs, you'll experiment with data collection strategies, diagnose model failure modes, and develop the metrics that determine whether a model is actually getting better. This is hands-on, high-leverage work: you'll go from hypothesis to live experiment quickly, and your output will directly influence model training runs at scale.
Job Type
Full-time
Career Level
Entry Level