Research Engineer, Reward Models Training

Anthropics Technology Ltd
San Francisco, CA
Posted 5 days ago | $350,000 - $500,000 | Hybrid

About The Position

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Reward models are a critical component of how we align our AI systems with human values and preferences, serving as the bridge between human feedback and model behavior. In this role, you'll build the infrastructure that enables us to train reward models efficiently and reliably, scale to increasingly large model sizes, and incorporate diverse forms of human feedback across multiple domains and modalities. You will own the end-to-end engineering of reward model training at Anthropic, working at the intersection of machine learning systems and alignment research and partnering closely with researchers to translate novel techniques into production-grade training pipelines. This is a high-impact role where your work directly contributes to making Claude more helpful, harmless, and honest.

Note: For this role, we conduct all interviews in Python.
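
The posting doesn't specify Anthropic's training objective, but reward models in the RLHF literature are commonly trained with a pairwise (Bradley-Terry) preference loss that pushes the reward of the human-preferred response above the rejected one. A minimal PyTorch sketch of that idea, purely illustrative (the function name and toy values are hypothetical, not Anthropic's pipeline):

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style preference loss: maximize the log-probability
    that the human-preferred response scores higher than the rejected one."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage: scalar rewards for a batch of 4 preference pairs.
# In practice these would come from a reward-model head over responses.
r_chosen = torch.tensor([1.2, 0.3, 0.8, -0.1], requires_grad=True)
r_rejected = torch.tensor([0.4, 0.5, -0.2, -0.9], requires_grad=True)

loss = pairwise_reward_loss(r_chosen, r_rejected)
loss.backward()  # gradients flow back into the reward model's parameters
print(f"loss = {loss.item():.4f}")
```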

Requirements

  • Have significant experience building and maintaining large-scale ML systems
  • Are proficient in Python and have experience with ML frameworks such as PyTorch
  • Have experience with distributed training systems and optimizing ML workloads for efficiency
  • Are comfortable working with large datasets and building data pipelines at scale
  • Can balance research exploration with engineering rigor and operational reliability
  • Enjoy collaborating closely with researchers and translating research ideas into reliable engineering systems
  • Are results-oriented with a bias towards flexibility and impact
  • Can navigate ambiguity and make progress in fast-moving research environments
  • Adapt quickly to changing priorities while juggling multiple urgent issues
  • Maintain clarity when debugging complex, time-sensitive issues
  • Pick up slack, even if it goes outside your job description
  • Care about the societal impacts of your work and are motivated by Anthropic's mission

Nice To Haves

Strong candidates may also have experience with:

  • Training or fine-tuning large language models
  • Reinforcement learning from human feedback (RLHF) or related techniques
  • GPUs, Kubernetes, and cloud infrastructure (AWS, GCP)
  • Building systems for human-in-the-loop machine learning
  • Working with multimodal data (text, images, audio, etc.)
  • Large-scale ETL and data processing frameworks (Spark, Airflow)

Responsibilities

  • Own the end-to-end engineering of reward model training, from data ingestion through model evaluation and deployment
  • Design and implement efficient, reliable training pipelines that can scale to increasingly large model sizes
  • Build robust data pipelines for collecting, processing, and incorporating human feedback into reward model training
  • Optimize training infrastructure for throughput, efficiency, and fault tolerance across distributed systems
  • Extend reward model capabilities to support new domains and additional data modalities
  • Collaborate with researchers to implement and iterate on novel reward modeling techniques
  • Develop tooling and monitoring systems to ensure training quality and identify issues early (a minimal sketch of this follows the list)
  • Contribute to the design and improvement of our overall model training infrastructure
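
To make the monitoring bullet concrete: one common early-warning signal for reward-model training is accuracy on held-out human preference pairs, i.e. how often the model ranks the preferred response higher. A minimal sketch, assuming scalar rewards have already been computed by the model (the threshold and names are illustrative, not Anthropic's actual values):

```python
import torch

def preference_accuracy(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> float:
    """Fraction of held-out pairs where the reward model ranks the
    human-preferred response higher. A sustained drop below a baseline
    is a cheap early-warning signal for a bad data or training run."""
    return (r_chosen > r_rejected).float().mean().item()

# Hypothetical gate inside an evaluation loop.
ACCURACY_FLOOR = 0.65  # illustrative threshold, not an Anthropic value
acc = preference_accuracy(
    torch.tensor([1.2, 0.3, 0.8]),
    torch.tensor([0.4, 0.5, 1.1]),
)
if acc < ACCURACY_FLOOR:
    print(f"warning: held-out preference accuracy {acc:.2f} below floor")
```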

Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • A lovely office space in which to collaborate with colleagues