Member of Technical Staff, ML Systems

Netpreme
Santa Clara, CA
Onsite

About The Position

We’re looking for a motivated LLM Systems Engineer eager to explore new and unconventional inference systems built on emerging hardware. This role is part engineering, part research – you’ll be responsible for researching and prototyping algorithms suited to our inference hardware, as well as guiding our hardware team on product definition. The ideal candidate has a proven track record of ML systems research and is deeply familiar with industry-standard LLM inference systems. This role will be performed on-site from one of our offices in Santa Clara, CA or Boston, MA.

Requirements

  • MS or PhD in computer systems, ideally with a focus on LLM inference and/or distributed systems.
  • Prior experience contributing to core LLM inference infrastructure (vLLM, SGLang, TensorRT, etc.).
  • Prior experience in accelerator programming (e.g. CUDA, JAX/Pallas, ROCm).

Nice To Haves

  • Advanced computer architecture and performance engineering skills are a big plus.

Responsibilities

  • Prototype and optimize emerging ML inference systems.
  • Develop novel memory models for expandable VRAM.
  • Write efficient GPU kernels for data movement.
  • Perform design-space exploration, implementation, and benchmarking of inference engines, both in simulation and on real hardware.

Benefits

  • Competitive salary commensurate with experience, including base salary, incentive-based bonus, and an early-stage equity grant.
  • Comprehensive benefits including health, dental, vision, and life insurance.
  • Well-equipped, sunny offices in Santa Clara, CA and Boston, MA.
  • Relocation assistance and visa sponsorship.
  • Perks include a daily lunch stipend, 401k match, and more.
  • A collaborative, continuous-learning work environment with smart, dedicated colleagues engaged in developing the next generation of architecture for high-performance computing.