Staff Software Engineer, ML Platform

Stack AV • Pittsburgh, PA

About The Position

The ML Data team's mission is to provide trusted, useful data that efficiently powers all of Stack's ML applications end to end, from labeling to training to safety evaluation. We work hand in hand with the AV autonomy teams to provide cutting-edge solutions for all their data needs, working across data engineering, ML modeling, and ML infrastructure. In particular, we provide services to find (data mining), curate (datasets), annotate (data labeling), and serve (high-throughput data access) data for all ML needs.

Training: We are building state-of-the-art infrastructure to support machine learning training and inference workloads using OSS components such as Ray, Spark, and Iceberg.

Data Mining: We are building a framework and infrastructure to find interesting events quickly and flexibly. As part of this mission, you would set the direction for, and help us build, an inference service using LLMs and a vector database.

Labeling: You would set the direction for auto-labeling and build toward it. You would own and drive the labeling needs of the entire company.

Requirements

Experience with both ML platforms and building ML-based applications (modeling experience is a bonus).
Proven track record of building scalable, reliable infrastructure in a fast-paced environment.
Ability to collaborate effectively across teams.
Experience building or using ML infrastructure for a large number of customer teams.
Deep understanding of design trade-offs with the ability to articulate those trade-offs and achieve alignment with others.
Experience in building ML models or infrastructure in domains such as autonomous vehicles, perception, and decision-making (desirable but not required).
Experience with model training, model optimization, or large data processing pipelines.
Prior experience in autonomous vehicles (AV) is a plus.

Responsibilities

Push GPU performance to its limit, from Python down to the CUDA kernel level.
Build the inference or training loop for large models, ideally LLM-style models.
Ship ML products (NLP, computer vision, recommender systems, etc.) at scale to make a business impact.
Develop data platform infrastructure for real-time querying/vector databases and batch/stream processing using technologies like Ray, Spark, or similar.
Create Parquet-based object storage solutions (data lake/data warehouse).
Build low-latency, high-throughput batch or stream processing pipelines.
Write readable, high-performance C++ code.

What This Job Offers

Number of Employees

101-250 employees
