LLM Inference Engineer

Periodic Labs, Menlo Park, CA

About The Position

You will integrate, optimize, and operate large-scale inference systems that power AI-driven scientific research. You will build and maintain high-performance serving infrastructure delivering low-latency, high-throughput access to large language models across thousands of GPUs, and work closely with researchers and engineers to bring cutting-edge inference into large-scale reinforcement learning workloads. You will build tooling, directly support frontier-scale experiments to make Periodic Labs the world’s best AI + science lab, and contribute to open-source LLM inference software.

Requirements

  • Experience optimizing inference for the largest open-source models.
  • Familiarity with high-performance model-serving frameworks such as TensorRT-LLM, vLLM, or SGLang.
  • Knowledge of distributed inference techniques including tensor/expert/pipeline parallelism, speculative decoding, and KV cache management.
  • Experience optimizing GPU utilization and latency for reinforcement learning.
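As context for the KV cache management mentioned above, here is a minimal sketch of paged KV-cache block allocation in the spirit of vLLM-style PagedAttention. All class and method names are hypothetical, not taken from any specific framework:

```python
# Illustrative sketch: a serving engine maps each sequence's KV cache onto
# fixed-size blocks from a bounded pool, rather than one contiguous buffer.
# This is a simplified model, not an actual framework API.

class BlockAllocator:
    """Hands out fixed-size KV-cache blocks from a bounded GPU memory pool."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size              # tokens stored per block
        self.free_blocks = list(range(num_blocks))

    def allocate_for(self, num_tokens: int) -> list[int]:
        """Reserve enough blocks to hold num_tokens of KV entries."""
        needed = -(-num_tokens // self.block_size)  # ceiling division
        if needed > len(self.free_blocks):
            # A real engine would preempt or swap a sequence here.
            raise MemoryError("KV cache exhausted")
        return [self.free_blocks.pop() for _ in range(needed)]

    def free(self, blocks: list[int]) -> None:
        """Return a finished sequence's blocks to the pool."""
        self.free_blocks.extend(blocks)


# A 300-token sequence with 16-token blocks needs ceil(300 / 16) = 19 blocks.
alloc = BlockAllocator(num_blocks=64, block_size=16)
block_table = alloc.allocate_for(300)
print(len(block_table))  # 19
alloc.free(block_table)
```

Because blocks are fixed-size and non-contiguous, memory fragmentation stays low and finished sequences return capacity to the pool immediately, which is what lets a scheduler pack many concurrent sequences onto one GPU.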

Responsibilities

  • Integrate, optimize, and operate large-scale inference systems.
  • Build and maintain high-performance serving infrastructure for large language models.
  • Deliver low-latency, high-throughput access to models across thousands of GPUs.
  • Work closely with researchers and engineers on large-scale reinforcement learning workloads.
  • Build tools to support frontier-scale experiments.
  • Contribute to open-source LLM inference software.