Machine Learning Engineer — Infrastructure

Fundamental Research Labs
Menlo Park, CA

About The Position

As our Machine Learning Infrastructure Engineer, you’ll design and scale the platforms that power cutting-edge AI: from high-performance inference engines to the underlying agent technologies and large-scale compute clusters that keep everything running. You’ll collaborate closely with researchers and product engineers to push the limits of inference performance, build reliable foundations for AI agents, and advance the next generation of training and post-training pipelines.

Requirements

  • Expertise in one or more of: inference engines, GPU optimization, cluster scheduling, or cloud-native infra
  • Familiarity with modern ML frameworks (PyTorch, vLLM, Verl, etc.)
  • Startup-ready mindset (adaptable, fast-moving, high-ownership)

Responsibilities

  • Speed up research development by helping researchers explore SOTA and new techniques on day one
  • Build and optimize the model training pipeline, including data collection, data loading, SFT, and RL
  • Optimize a high-performance inference platform on top of both open-source and proprietary inference engines
  • Develop and scale technologies for large-scale cluster scheduling, high-performance distributed training, and AI networking
  • Build a strong engineering discipline across observability and reliability at scale
  • Collaborate with research and product teams to translate breakthroughs into robust, production-ready infrastructure

Benefits

  • Generous salary
  • Additional benefits to be discussed during the hiring process