About The Position

The Seed LLM Post Training team researches cutting-edge post-training technologies and provides core post-training capabilities for unified multimodal large models. The team's goal is to explore next-generation techniques for the post-training phase, such as supervised fine-tuning (SFT), reward modeling (RM), reinforcement learning (RL), and self-learning, while significantly improving key areas including reasoning, coding, agents, and omni models.

PhD internships at ByteDance give students the opportunity to contribute actively to our products and research, and to the organization's future plans and emerging technologies. Our internship experience blends hands-on learning, community-building and development events, and collaboration with industry experts.

Applications are reviewed on a rolling basis, so we encourage you to apply early. Please state your availability clearly in your resume (start date, end date).

Responsibilities

  • Design and train reward models that reflect nuanced human preferences in LLM outputs.
  • Develop and evaluate components of a Reward Model System that integrates model predictions, verifier feedback, tool usage, and agent signals to produce reliable, generalizable reward estimates.
  • Develop reward models to enhance controllability and instruction-following performance, especially in scenarios involving complex, multi-part user requests.
  • Contribute to data selection and synthesis pipelines that improve post-training data quality, leveraging reward signals to expand the model's capabilities.
  • Research scalable methods for learning from pairwise comparisons, rankings, or human demonstrations across diverse tasks.

What This Job Offers

  • Career Level: Intern
  • Industry: Publishing Industries
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000 employees
