OpenAI’s Inference team ensures that our most advanced models run efficiently, reliably, and at scale. We build and optimize the systems that power our production APIs, internal research tools, and experimental model deployments. As model architectures and hardware evolve, we’re expanding support for a broader set of compute platforms, including AMD GPUs, to increase performance, flexibility, and resiliency across our infrastructure. We are forming a team to generalize our inference stack (kernels, communication libraries, and serving infrastructure) to alternative hardware architectures.