Systems PhD - Software Engineer

Databricks • Seattle, WA

About The Position

Databricks is radically simplifying the entire data lifecycle, from ingestion to generative AI and everything in between. We’re doing it cross-cloud with a unified platform, currently serving over 10k customers, processing exabytes of data per day on 15+ million VMs, and growing exponentially.

To make it happen, we’re building multi-cloud systems at every corner of the data ecosystem, from query engines, vector databases, training pipelines, and storage systems down to the infrastructure that allows them to scale, such as auto-sharders, caches, and load balancers, just to name a few. We also build and support the tooling, languages, and stacks that bring it all together. Basically, we do it all.

The space we work in and the problems we solve are massive, complex, and very deep (our published work on Lakehouse, Delta Lake, and Photon is a testament to that). We’re looking for practitioners who are eager to work with the best in industry to push the boundaries of what’s possible for our customers. If you’re truth-seeking, data-driven, and love to operate from first principles (head fake: those are our core values), then Databricks is the place for you.

As part of the Database Engine team, you will have opportunities to design and implement in many areas that leapfrog existing state-of-the-art systems:

Query compilation & optimization
Distributed query execution and scheduling
Vectorized engine execution
Data security
Resource management
Transaction coordination
Efficient storage structures (encoding, indexes)
Automatic physical data optimization

Requirements

PhD in databases or systems
A passion for database systems, storage systems, distributed systems, language design, and/or performance optimization
Motivated by delivering customer value and impact
