Scribd, Inc. is seeking a Senior Software Engineer with deep experience building event-driven, distributed, and scalable systems in Python. The role is part of the ML Data Engineering team, which powers metadata extraction, enrichment, and content understanding across all Scribd brands. The team processes hundreds of millions of documents and billions of images to deliver high-quality metadata for content discovery and trust, across diverse datasets such as user-generated content, ebooks, and audiobooks.

The work sits at the intersection of machine learning, data engineering, and distributed systems. The engineer will collaborate with applied research and product teams to deploy scalable ML and LLM-powered solutions, design and optimize large-scale data and service pipelines on AWS that support content enrichment and metadata systems, and work with cross-functional teams to design reliable backend services that integrate ML and LLM components. This is an opportunity to work on cutting-edge generative AI and metadata enrichment problems at global scale.
Job Type: Full-time
Career Level: Senior