Senior Open Source Engineer

LanceDB, San Francisco, CA
Remote

About The Position

LanceDB is a developer-friendly, open-source database for multimodal AI. From hyper-scalable vector search to advanced retrieval for RAG, from streaming training data to interactive exploration of large-scale AI datasets, LanceDB is the best foundation for your AI application, and powers some of the most groundbreaking applications and challenging requirements today.

We’re looking for a Senior Open Source Engineer to help expand the reach of Lance and LanceDB within the broader data infrastructure ecosystem. You’ll work at the intersection of high-performance computing, big data, and open-source systems: driving integrations, improving distributed operations, and contributing to projects across the Apache and AI communities.

Requirements

  • 10+ years of experience building high-performance databases, big data systems, or large-scale data services
  • Deep understanding of internals of open-source Big Data or AI training systems (e.g., Hadoop, Spark, Flink, Ray, Iceberg, Delta Lake, Hudi, ClickHouse, Trino, Presto, PyTorch, or JAX)
  • Strong experience with high-performance computing in Java or Scala
  • Experience with Rust (or willingness to learn it)
  • Proven ability to move fast, work independently, and collaborate with a high-caliber team

Nice To Haves

  • Contributor, committer, or PMC member in Apache or other large open-source projects
  • Experience with Java, Rust, C++, Apache Arrow, DataFusion, Parquet, Iceberg, or Delta Lake
  • Track record of driving large features or integrations in distributed systems
  • Strong community presence and passion for open-source collaboration

Responsibilities

  • Driving open-source community efforts to integrate the Lance format with Spark, Hive Metastore, Presto, Trino, Ray, and other data infrastructure systems
  • Designing and maintaining efficient distributed Lance dataset operations
  • Building efficient indices to enable predicate pushdown and accelerate queries in Spark, Ray, or Trino
  • Working on table formats, data encodings, and various aspects of the Lance format in Rust
  • Operating and improving internal data processing infrastructure
  • Promoting the Lance format in open-source communities and at Big Data conferences

Benefits

  • A key role shaping an open-source project with real production usage
  • Remote-first team with flexible hours
  • Competitive compensation, equity, and benefits
  • Generous learning budget and support for open-source contributions