Machine Learning Engineer — Inference

Fundamental Research Labs
Menlo Park, CA

About The Position

As our Machine Learning Engineer (Inference), you’ll push the limits of serving frameworks, refine our agent architecture, and build the benchmarks that define performance at scale. You’ll help take our frontier models from the lab into lightning-fast, production-ready services. If you relish experimenting with the latest serving research, building optimizations, and shipping infrastructure for researchers, we invite you to apply!

Requirements

  • Strong experience in distributed systems and low-latency ML serving
  • Skilled with performance-profiling and optimization tools and techniques, with a track record of delivering measurable performance gains
  • Hands-on with vLLM, SGLang, or equivalent frameworks
  • Familiarity with GPU optimization, CUDA, and model parallelism
  • Comfort working in a high-velocity, ambiguity-heavy startup environment

Responsibilities

  • Architect and optimize high-performance inference infrastructure for large foundation models
  • Benchmark and improve latency, throughput, and agent responsiveness
  • Work with researchers to deploy new model architectures and multi-step agent behaviors
  • Implement caching, batching, and prioritization to handle high-volume requests
  • Build monitoring and observability into inference pipelines

Benefits

  • Generous salary
  • Additional benefits to be discussed during the hiring process