Machine Learning Engineer — Inference

Fundamental Research Labs
Menlo Park, CA

About The Position

As our Machine Learning Engineer (Inference), you’ll push the limits of serving frameworks, refine our agent architecture, and build the benchmarks that define performance at scale. You’ll help take our frontier models from the lab into lightning-fast, production-ready services. If you relish experimenting with the latest serving research, building optimizations, and shipping infrastructure for researchers, we invite you to apply!

Requirements

  • Strong experience in distributed systems and low-latency ML serving
  • Skilled with performance-profiling and optimization tools and techniques, with a track record of delivering measurable performance gains
  • Hands-on with vLLM, SGLang, or equivalent frameworks
  • Familiarity with GPU optimization, CUDA, and model parallelism
  • Comfort working in a high-velocity, ambiguity-heavy startup environment

Responsibilities

  • Architect and optimize high-performance inference infrastructure for large foundation models
  • Benchmark and improve latency, throughput, and agent responsiveness
  • Work with researchers to deploy new model architectures and multi-step agent behaviors
  • Implement caching, batching, and prioritization to handle high-volume requests
  • Build monitoring and observability into inference pipelines

Benefits

  • Generous salary
  • Additional benefits to be discussed during the hiring process