We're building autonomous research agents for recursive self-improvement (multi-agent systems that propose, run, and analyze machine learning experiments). We're a small team based in San Francisco, on-site. You'll write and optimize the GPU kernels and supporting systems software that makes our training and inference workloads fast. This is deep, low-level work (performance counters, memory bandwidth, warp-level scheduling) applied to the specific shapes and patterns our models actually use. We hire kernel engineers because the gap between "this works" and "this is fast on the hardware we have" is enormous, and that gap directly bounds what our researchers can try. You'll close that gap.
Stand Out From the Crowd
Upload your resume and get instant feedback on how well it matches this job.
Job Type
Full-time
Education Level
No Education Listed