Software Engineer, Research Data Platform

Anthropic
San Francisco, CA
Hybrid

About The Position

The Research Data Platform team builds the tools that Anthropic's researchers use every day to manage, query, and analyze the data that goes into training and evaluating frontier models. We power the internal applications researchers rely on to monitor RL runs, explore finetuning datasets, and understand what's happening inside their experiments. We're looking for engineers who love working directly with users and who excel at building data products — the pipelines that move data out of training runs into queryable storage, and the APIs, libraries, and services researchers use to manage and explore it. This role sits closer to the research workflow than a typical data infrastructure position: you'll often embed with research teams, build ML-specific tooling alongside them, and leverage what our Data Infrastructure team has already built rather than reinventing it. We do not require prior ML or AI training experience. If you enjoy working closely with technical users, learning new domains quickly, and building tools people actually want to use, you'll pick up the research context fast.

Requirements

  • Have significant software engineering experience, particularly building data-intensive applications or internal tooling
  • Enjoy working directly with users, gathering requirements iteratively, and shipping things that get adopted
  • Are results-oriented, with a bias towards flexibility and impact
  • Pick up slack, even if it goes outside your job description
  • Want to learn more about machine learning research
  • Care about the societal impacts of your work
  • Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience
  • Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience
  • Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position

Nice To Haves

  • Large-scale ETL, columnar storage formats, and query engines (e.g., Spark, BigQuery, DuckDB, Parquet)
  • High-volume time series data — ingestion, storage, and efficient querying
  • Data cataloging, lineage, or metadata management systems
  • ML experiment tracking or metrics platforms
  • Working in environments where engineers partner closely with quantitative users — research labs, trading firms, observability or analytics startups
  • Complex data visualization and full-stack web application development

Responsibilities

  • Build and operate data pipelines that extract data from research training runs and land it in storage systems that are easy and fast to query
  • Work closely with researchers to design and build APIs, libraries, and web interfaces that support data management, exploration, and analysis
  • Develop dataset management, data cataloging, and provenance tooling that researchers use in their day-to-day work
  • Embed with research teams to understand their workflows, identify high-leverage tooling opportunities, and ship solutions quickly
  • Collaborate with adjacent teams to build on existing systems rather than reinventing them

Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • A lovely office space in which to collaborate with colleagues