Harvey • posted about 1 month ago
$320,000 - $360,000/Yr
Full-time • Director
San Francisco, CA
251-500 employees

At Harvey, we’re building the AI platform for the world’s top legal and professional services teams. As we scale, our Data team sits at the heart of this mission, turning raw data and research into robust, intelligent systems that power reasoning at scale. The team underpins Harvey’s ability to understand and leverage both public and private data, building the infrastructure that ingests, transforms, and retrieves millions of documents to make our AI systems smarter every day.

We’re looking for a Director of Engineering, Data to lead this function into its next chapter. You’ll shape the strategy, architecture, and team behind the systems that make advanced reasoning possible. The Data team owns end-to-end retrieval-augmented generation (RAG) stacks across complex domains, including case law, legislation, and tax codes across 50+ international jurisdictions. As generation and reasoning improve, retrieval quality has become the new frontier, and solving it at scale is one of Harvey’s top priorities.

If you’re excited by large-scale data engineering, complex information retrieval, and building the backbone of cutting-edge AI systems, we’d love to talk.

  • Lead and scale the Data organization from a single high-performing team into multiple specialized teams.
  • Partner closely with leadership to define the strategic roadmap for Harvey’s data ecosystem and ensure it scales with our global growth.
  • Own and evolve Harvey’s end-to-end data architecture — from ingestion to transformation, storage, retrieval, and delivery — ensuring performance, reliability, and scalability to power LLMs and downstream applications.
  • Design and oversee large-scale data ingestion pipelines that aggregate, normalize, and maintain data from thousands of heterogeneous, publicly available legal and regulatory sources across global jurisdictions.
  • Integrate private and partner data sources, ensuring robust access controls, lineage tracking, and compliance with security and privacy requirements.
  • Evaluate and implement data infrastructure technologies to support large-scale document processing, embedding generation, vector storage, and retrieval optimization.
  • Collaborate closely with the Applied AI team to drive experimentation and model improvements that directly enhance AI quality and differentiation across Harvey’s products.
  • Drive the development of end-to-end research experiences that weave together our retrieval, reasoning, and UX layers — transforming AI insights into intuitive, lawyer-friendly workflows that redefine how professionals engage with complex information.
  • Partner cross-functionally with Product Engineering, Applied AI, Research, and Platform teams to deliver high-quality, production-ready systems.

  • You have 10+ years of experience in data engineering, data architecture, or platform engineering, including 5+ years leading high-performance teams.
  • You’ve led data or ML infrastructure teams through scale — from startup to multi-team org.
  • You have a proven track record of building and scaling distributed data systems handling large, complex, and heterogeneous datasets.
  • You bring depth in backend, data infrastructure, or information retrieval, with a strong appreciation for applied AI.
  • You value clarity, craftsmanship, and high trust as the foundations of great engineering.