About The Position

The AI Data Infrastructure team sits between Stack’s data and its engineers. We provide tooling and infrastructure that turn petabytes of data per month into actionable insights for ML and autonomy engineers. In particular, we provide data pipelines to run tasks on new data, services and a frontend to search for data, and tooling to stream and transform data across our various storage systems.

  • Data Processing: we are building state-of-the-art infrastructure to support our scaling fleet and ML engineering needs. In this role, you would be responsible for ensuring our processing pipelines are customer-friendly, fault-tolerant, and performant.
  • Data Search: we provide our end users with a platform that combines search across our metadata stores, vector database, and parts of our data warehouse, and then visualizes the data (think YouTube). You would work with ML and autonomy stakeholders to improve the reliability of the pipelines that feed our search platform.
  • Data Lineage: we are starting to build a data lineage system that will track the usage of all data at Stack. You would work with our stakeholders to drive wide adoption of the system and address customer pain points.

Requirements

  • Proven track record of building scalable, reliable infrastructure in a fast-paced environment.
  • Ability to collaborate effectively across teams.
  • Strong development experience with Python and SQL.

Nice To Haves

  • Prior experience with Apache Iceberg, Trino, SQLMesh, Flyte / Airflow, and Kubernetes is a plus.
  • Prior experience building and managing data platforms for multimodal ML needs is a plus.
  • Prior experience with agentic workflows is a plus.
  • Prior experience in autonomous vehicles (AV) is a plus.

Responsibilities

  • Play a key role in designing and building the next generation of data infrastructure.
  • Build low-latency, high-throughput, fault-tolerant batch or stream processing systems.
  • Build scalable backend services for data search and data curation systems.
  • Write high-quality Python and SQL.