Research Engineer / Research Scientist, Vision

Anthropic•San Francisco, CA

30d•Hybrid

About The Position

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

We’re looking for research engineers with a strong computer vision background who believe that visual and spatial reasoning are core to fully unlocking the capabilities of LLMs. In this role, you’ll work on research, development, and evaluation for state-of-the-art Claude models, with a focus on visual and spatial capabilities. The role is highly collaborative and touches many aspects of our broader research efforts, taking a full-stack approach across pretraining, RL, and runtime techniques such as agentic harnesses. You’ll also partner with the product org to ensure that the vision improvements you deliver improve Claude’s performance on real-world tasks.

Requirements

Have 7+ years of ML, computer vision, and software engineering experience through industry, academia, or other projects
Are familiar with the architecture, training, and operation of large vision language models
Have experience creating and evaluating large synthetic and real-world visual training datasets
Have experience engaging in systematic prompting, finetuning, or evaluation
Are results-oriented, with a bias towards flexibility and impact
Enjoy pair programming and cross-team collaboration
Care about the societal impacts of your work

Nice To Haves

Large-scale pretraining, SL, and RL on language models
Deep learning research on images, video, or other modalities
Developing complex agentic systems using LLMs
High-performance ML systems (GPUs, TPUs, JAX, PyTorch)
Large-scale ETL and data pipeline development

Responsibilities

Run experiments to evaluate architectural variants, data strategies, and SL and RL techniques to improve Claude’s vision
Develop and test tools, skills, and agentic infrastructure that enable Claude to reason over visual inputs
Create evaluations and benchmarks that measure progress on multimodal capabilities across training and deployment
Work with our product org to find solutions to our most vexing API customer challenges related to vision and spatial reasoning