Inception creates the world’s fastest, most efficient AI models. Our Mercury model is the world’s fastest reasoning LLM and first commercially available diffusion LLM, delivering 5x greater speed and efficiency than today’s LLMs, with best-in-class quality. We are the AI researchers and engineers behind such breakthrough AI technologies as diffusion models, flash attention, and DPO.

We are looking for engineers and scientists to design, optimize, and maintain the compute foundations that power large-scale language model training. You will develop high-performance ML kernels (e.g., CUDA, CuTe, Triton), enable efficient low-precision arithmetic, and improve the distributed compute stack that makes training large models possible. Your work will make inference faster, more cost-effective, and more reliable.
Job Type
Full-time
Career Level
Mid Level
Education Level
Ph.D. or professional degree