About The Position

The ML Indexing & Retrieval Platform team at Reddit is responsible for building and scaling the core infrastructure that powers machine learning driven recommendations. We design and maintain systems for ML data ingestion, low-latency retrieval services, and end-to-end lifecycle management of data. With a focus on performance, reliability, and scalability, we enable real-time access to high-quality data that supports a wide range of applications, including Content Understanding, Semantic, Lexical retrieval & GenAI applications. You’ll lead the development of next-generation ML Indexing & Retrieval systems, owning the full lifecycle from ideation to production and going beyond incremental improvements to reimagine core platform capabilities. As part of a high-impact, cross-functional team, you’ll solve complex technical challenges to build scalable, reliable platforms that empower developers to efficiently ship critical ML features.

Requirements

  • 10+ years of experience in software engineering, specializing in Indexing and Retrieval systems.
  • 3+ years in technical leadership, architecting and scaling distributed systems in production environments.
  • Deep expertise in large-scale data platforms, including batch indexing and stream processing.
  • Proven experience designing and operating large-scale, low-latency retrieval services.
  • Expertise in lexical and vector search retrieval technologies, such as Milvus, Vespa, or Elasticsearch.
  • Skilled in designing cloud-native architectures and managing containerized workloads using Kubernetes and AWS/GCP.
  • Adept at translating complex technical challenges into clear, actionable strategies.
  • Strong communicator and mentor who leads through collaboration, influence, and technical excellence.
  • Languages: Go, Java, Python, or any object oriented programming language
  • Frameworks: Flink, Airflow, Spark for large scale batch & stream processing
  • Databases: Familiarity with Vector, Lexical & Key-Value Databases
  • Tools: Kubernetes, Docker, AWS, GCP

Responsibilities

  • Lead the technical strategy, architecture, and implementation of Reddit’s next-generation ML Indexing & Retrieval engine, integrating capabilities across lexical and vector indexing, low-latency retrieval, and emerging GenAI applications.
  • Partner closely with product engineers across Content Understanding, Search, Feeds, Ads, Growth, and Safety to deliver high-quality experiences.
  • Define best practices for observability, reliability, and operational excellence in large-scale distributed systems.
  • Mentor and guide engineers in designing scalable infrastructure and adopting robust DevOps and SRE principles.
  • Collaborate with infrastructure, and ML teams to ensure the platform evolves to meet the needs of Reddit’s growing user base and diverse content ecosystem.

Benefits

  • Comprehensive Healthcare Benefits and Income Replacement Programs
  • 401k with Employer Match
  • Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
  • Family Planning Support
  • Gender-Affirming Care
  • Mental Health & Coaching Benefits
  • Flexible Vacation & Paid Volunteer Time Off
  • Generous Paid Parental Leave
  • medical, dental, and vision insurance
  • 401(k) program with employer match
  • generous time off for vacation
  • parental leave
© 2024 Teal Labs, Inc
Privacy PolicyTerms of Service