The Content Foundations team builds the systems that power how content is uploaded, processed, and delivered across Scribd products. This includes ingestion, metadata extraction, early quality controls, and the core artifacts that power search, recommendations, AI/ML systems, and the reading and listening experience.

This role is interesting because you'll be joining an experienced team working at the boundary between messy, real-world content and highly structured systems, where file formats vary and metadata inconsistencies are amplified at scale. Scribd operates a hybrid catalog of premium publisher content and user-generated uploads, spanning diverse formats, decade-old systems, and the modern services evolving alongside them. Your contributions will directly impact downstream teams and, ultimately, our customers.

Current focus areas include:
- Improvements to the upload flow
- Content quality and early-stage validation
- OCR and content extraction for downstream ML/LLM use cases
- Evolving content formats to support downstream AI workflows
- Making content and metadata more accessible for downstream systems to consume
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed