In the ML Data team, our mission is to provide trusted, useful data that efficiently powers all of Stack's ML applications end to end, from labeling to training to safety evaluation. We work hand in hand with AV autonomy teams to provide cutting-edge solutions to all their data needs, working across data engineering, ML modeling, and ML infrastructure. In particular, we provide services to find (data mining), curate (datasets), annotate (data labeling), and serve (high-throughput data access) data for all ML needs.

Training: We are building state-of-the-art infrastructure to support machine learning training and inference workloads using OSS components such as Ray, Spark, and Iceberg.

Data Mining: We are building a framework and infrastructure to find interesting events quickly and flexibly. As part of this mission, you would set the direction for, and help us build, an inference service using LLMs and a vector database.

Labeling: You would set the direction for auto-labeling and build towards it. You would own and drive the labeling needs of the entire company.