About The Position

The AI Data Infrastructure team sits between Stack’s data and its engineers. We provide tooling and infrastructure to process petabytes of data per month into actionable insights for ML and autonomy engineers: data pipelines to run tasks on new data, services and a frontend to search for data, and tooling to stream and transform data across our various storage mediums. The role spans three areas:

  • Data Processing: we are building state-of-the-art infrastructure to support our scaling fleet and ML engineering needs. You would be responsible for ensuring our processing pipelines are customer-friendly, fault-tolerant, and performant.
  • Data Search: we provide end users with a platform that combines search across our metadata stores, vector database, and parts of our data warehouse, then visualizes the results (think YouTube). You would work with ML and autonomy stakeholders to improve the reliability of the pipelines that feed our search platform.
  • Data Lineage: we are starting to build a data lineage system that will track the usage of all data at Stack. You would work with stakeholders to drive adoption of the system and address customer pain points.

Requirements

  • Experience building and managing data platforms for multimodal ML needs.
  • Proven track record of building scalable, reliable infrastructure in a fast-paced environment.
  • Ability to collaborate effectively across teams.
  • Experience building or using ML infrastructure for a large number of customer teams.
  • Deep understanding of design trade-offs with the ability to articulate those trade-offs and achieve alignment with others.

Nice To Haves

  • Experience building ML models or infrastructure in domains such as autonomous vehicles (AV), perception, or decision-making.

Responsibilities

  • Architect data processing pipelines that handle multimodal (video, point cloud, audio, text) data for a wide base of customers.
  • Manage core libraries used by the entire company for data transformations and processing.
  • Build low-latency, high-throughput, fault-tolerant batch and stream processing systems.
  • Build scalable backend services for data search and data curation systems.
  • Write high-quality Python, SQL, and Go code.