About The Position

ByteDance, founded in 2012, is on a mission to inspire creativity and enrich life through its diverse suite of products, including TikTok and various platforms tailored for the Chinese market. The company emphasizes the importance of creation, innovation, and teamwork in achieving its goals. The Doubao (Seed) Team, established in 2023, focuses on pioneering advanced AI foundation models, with research areas including deep learning, reinforcement learning, and AI safety. The Machine Learning (ML) System sub-team is dedicated to developing and maintaining distributed ML training and inference systems globally, providing high-performance and reliable systems for LLM/AIGC/AGI.

Responsibilities

  • Design and develop machine learning infrastructure for LLM/AIGC and related workloads
  • Build a very large-scale machine learning system integrating GPUs, RDMA networking, and high-performance storage
  • Solve technical problems related to the stability and high availability of the system
  • Organize and coordinate multiple teams to build the system, including the data center, network, computing, storage, and resource teams