Reddit is a community of communities. It’s built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and approximately 126 million daily active unique visitors, Reddit is one of the internet’s largest sources of information. For more information, visit www.redditinc.com.

Who We Are:
The Machine Learning Platform team at Reddit is a high-impact team that owns the infrastructure that powers recommendations, content discovery, and user and content quantification, while directly impacting other teams such as Growth, Ads, Feeds, and Core Machine Learning.

What You’ll Do:
As a Staff Machine Learning Engineer, you will lead the development of a large-scale ML Inference Platform at Reddit.
Lead the end-to-end design, implementation, and maintenance of a highly available, low-latency GPU-based model serving system for search, ranking, and LLMs, supporting millions of QPS.
Design and develop ML and Generative AI systems in cloud-based production environments on Kubernetes at scale.
Rapidly prototype and build a high-performance feature hydration and processing system as part of the inference stack, including routing, caching, and batching.
Lead a unified GPU model export framework that converts trained models into optimized GPU inference models.
Strong understanding of real-time ML observability to track feature and model performance.
Experience serving LLMs online at scale.
Experience building an end-to-end (E2E) inference performance benchmarking framework.
Deep understanding of multi-cluster compute environments and network topology specific to ML inference use cases.
Job Type: Full-time
Career Level: Senior
Education Level: No Education Listed