Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Reward models are a critical component of how we align our AI systems with human values and preferences, serving as the bridge between human feedback and model behavior. In this role, you'll build the infrastructure that enables us to train reward models efficiently and reliably, scale to increasingly large model sizes, and incorporate diverse forms of human feedback across multiple domains and modalities. You will own the end-to-end engineering of reward model training at Anthropic.

You'll work at the intersection of machine learning systems and alignment research, partnering closely with researchers to translate novel techniques into production-grade training pipelines. This is a high-impact role where your work directly contributes to making Claude more helpful, harmless, and honest.

Note: For this role, we conduct all interviews in Python.
Job Type: Full-time
Career Level: Mid Level
Number of Employees: 11-50 employees