Senior Staff Software Engineer, Indexing & Retrieval Platform

4h•Remote

About The Position

The ML Indexing & Retrieval Platform team at Reddit is responsible for building and scaling the core infrastructure that powers machine learning driven recommendations. We design and maintain systems for ML data ingestion, low-latency retrieval services, and end-to-end lifecycle management of data. With a focus on performance, reliability, and scalability, we enable real-time access to high-quality data that supports a wide range of applications, including Content Understanding, Semantic, Lexical retrieval & GenAI applications. You’ll lead the development of next-generation ML Indexing & Retrieval systems, owning the full lifecycle from ideation to production and going beyond incremental improvements to reimagine core platform capabilities. As part of a high-impact, cross-functional team, you’ll solve complex technical challenges to build scalable, reliable platforms that empower developers to efficiently ship critical ML features.

Requirements

10+ years of experience in software engineering, specializing in Indexing and Retrieval systems.
3+ years in technical leadership, architecting and scaling distributed systems in production environments.
Deep expertise in large-scale data platforms, including batch indexing and stream processing.
Proven experience designing and operating large-scale, low-latency retrieval services.
Expertise in lexical and vector search retrieval technologies, such as Milvus, Vespa, or Elasticsearch.
Skilled in designing cloud-native architectures and managing containerized workloads using Kubernetes and AWS/GCP.
Adept at translating complex technical challenges into clear, actionable strategies.
Strong communicator and mentor who leads through collaboration, influence, and technical excellence.
Languages: Go, Java, Python, or any object oriented programming language
Frameworks: Flink, Airflow, Spark for large scale batch & stream processing
Databases: Familiarity with Vector, Lexical & Key-Value Databases
Tools: Kubernetes, Docker, AWS, GCP

Responsibilities

Lead the technical strategy, architecture, and implementation of Reddit’s next-generation ML Indexing & Retrieval engine, integrating capabilities across lexical and vector indexing, low-latency retrieval, and emerging GenAI applications.
Partner closely with product engineers across Content Understanding, Search, Feeds, Ads, Growth, and Safety to deliver high-quality experiences.
Define best practices for observability, reliability, and operational excellence in large-scale distributed systems.
Mentor and guide engineers in designing scalable infrastructure and adopting robust DevOps and SRE principles.
Collaborate with infrastructure, and ML teams to ensure the platform evolves to meet the needs of Reddit’s growing user base and diverse content ecosystem.

Benefits

Comprehensive Healthcare Benefits and Income Replacement Programs
401k with Employer Match
Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
Family Planning Support
Gender-Affirming Care
Mental Health & Coaching Benefits
Flexible Vacation & Paid Volunteer Time Off
Generous Paid Parental Leave
medical, dental, and vision insurance
401(k) program with employer match
generous time off for vacation
parental leave