Staff ML Engineer - Infrastructure

ChipStack, San Jose, CA

About The Position

This role offers a unique opportunity to be part of the founding team at ChipStack, where we are reinventing how modern silicon chips are designed. You will work alongside highly experienced chip designers who have built complex chips, ML scientists who have trained LLMs at scale, and top-notch infrastructure and software engineers. You will get to leverage your experience building ML and data infrastructure and apply it to some of the hardest problems in chip design.

Requirements

  • 5+ years of experience in ML infrastructure or adjacent roles
  • Deep expertise in Python and experience with training frameworks like PyTorch or TensorFlow
  • Strong systems engineering skills and experience with distributed training, data pipelines, and performance optimization
  • Experience deploying ML models to production (REST APIs, batch jobs, streaming pipelines)
  • Proficiency with cloud platforms (e.g., GCP, AWS) and containerized systems (Docker, Kubernetes)
  • Experience managing GPU/TPU workloads efficiently
  • Good communication skills and the ability to work directly with engineers and customers
  • Prior experience training or fine-tuning LLMs

Nice To Haves

  • Exposure to chip design fundamentals (via coursework or elsewhere)
  • Experience at an early-stage startup
  • Experience setting up observability, monitoring, and evaluation pipelines for ML models

Responsibilities

  • Build the core infrastructure that enables training, fine-tuning, evaluation, and deployment of LLMs across cloud and on-premise environments.
  • Your work will directly affect product capabilities and speed of iteration.