Member of Technical Staff, Pre-Training

Inception
San Francisco, CA

About The Position

The Role

We seek experienced scientists and engineers with deep expertise in pre-training and mid-training large language models. You will advance our diffusion-based LLMs, developing novel training techniques and pushing the boundaries of parallel token generation.

Requirements

  • BS/MS/PhD in Computer Science or a related field (or equivalent experience).
  • At least 2 years of experience working on ML projects in PyTorch (or equivalent), preferably in a research lab or engineering role.
  • Strong familiarity with transformers and core LLM concepts (autoregressive pretraining, instruction tuning, in-context learning, KV caching).
  • Familiarity with training and inference in diffusion models.
  • Experience training deep learning models at scale in distributed computing environments.

Nice To Haves

  • Extensive experience training transformer-based language models from scratch.
  • Knowledge of advanced training techniques (mixed precision, gradient accumulation, etc.).
  • Experience with multi-modal learning and cross-modal architectures.
  • Background in optimization theory and neural network architecture design.
  • Experience with LLM serving frameworks like vLLM, SGLang, or TensorRT.

Responsibilities

  • Design, develop, and optimize architectures for diffusion-based language models.
  • Implement innovative training objectives and loss functions for discrete diffusion LLMs.
  • Research and implement techniques for controlled text generation and constraint satisfaction.
  • Develop methods for multi-modal integration within the diffusion framework.
  • Improve model efficiency, reduce training time, and optimize inference throughput.