About The Position

About The Company

At Scribd Inc. (pronounced “scribbed”), our mission is to spark human curiosity. Join our team as we create a world of stories and knowledge, democratize the exchange of ideas and information, and empower collective expertise through our four products: Everand, Scribd, Slideshare, and Fable. This posting reflects an approved, open position within the organization.

We support a culture where our employees can be real and be bold; where we debate and commit as we embrace plot twists; and where every employee is empowered to take action as we prioritize the customer.

When it comes to workplace structure, we believe in balancing individual flexibility and community connections. It’s through our flexible work benefit, Scribd Flex, that employees, in partnership with their manager, can choose the daily work-style that best suits their individual needs. A key tenet of Scribd Flex is our prioritization of intentional in-person moments to build collaboration, culture, and connection. For this reason, occasional in-person attendance is required for all Scribd Inc. employees, regardless of their location.

So what are we looking for in new team members? Well, we hire for “GRIT”. The textbook definition of GRIT is demonstrating the intersection of passion and perseverance towards long-term goals. At Scribd Inc., we are inspired by the potential that this can unlock, and ask each of our employees to pursue a GRIT-ty approach to their work. In a tactical sense, GRIT is also a handy acronym that outlines the standards we hold ourselves and each other to. Here’s what that means for you: we’re looking for someone who showcases the ability to set and achieve Goals, achieve Results within their job responsibilities, contribute Innovative ideas and solutions, and positively influence the broader Team through collaboration and attitude.

About the Team and Role

The ML Data Engineering team is the backbone of Scribd’s commitment to a safe and trustworthy library. We build high-throughput, ML-driven data pipelines that process hundreds of millions of documents to detect, classify, and mitigate untrustworthy content.

As the Manager of ML Data Engineering, you will lead a specialized team of engineers responsible for building scalable ML-based foundations that can detect and deal with harmful content. You aren't just moving data; you are building the infrastructure that allows ML models to reason across our entire corpus in batch and real-time. Your team’s work ensures that our safety classifiers and automated policy enforcement tools are performant, scalable, and resilient. You will sit at the intersection of Big Data, AI, MLOps, and Platform Integrity, directly impacting the safety of millions of our users.

Requirements

  • Leadership Experience: 8+ years of total engineering experience, with 3+ years specifically in a people management or technical lead role within a Data or ML Engineering organization.
  • Scale Expertise: Proven track record of building and operating production-grade data pipelines at massive scale (100M+ entities) using technologies like Spark, Flink, Kafka, or Airflow.
  • ML Infrastructure Fluency: Deep understanding of the ML lifecycle, including feature engineering, model deployment (MLOps), and vector databases (e.g., Pinecone, Milvus, or Weaviate).
  • Trust & Safety Context: Prior experience building systems for content moderation, fraud detection, spam prevention, or digital rights management.
  • Technical Breadth: Strong proficiency in Python, Scala, or Go, and experience with cloud-native infrastructure (AWS/GCP, Kubernetes, and Snowflake/BigQuery).
  • Strategic Communication: Ability to explain complex architectural trade-offs to non-technical stakeholders in Legal, Policy, and Product.

Nice To Haves

  • LLM Pipelines: Experience building RAG (Retrieval-Augmented Generation) pipelines or managing the data infra for fine-tuning Large Language Models.
  • UGC Experience: Background working with large-scale User Generated Content (UGC) ecosystems and the unique challenges of unstructured document data.
  • Regulatory Knowledge: Familiarity with the technical requirements of global safety regulations such as the Digital Services Act (DSA) or the UK Online Safety Act.
  • Adversarial Mindset: Experience building systems that must defend against malicious actors and evolving platform abuse patterns.

Responsibilities

  • Lead and grow a high-performing engineering team: Manage, mentor, and recruit a world-class team of data and ML engineers. Foster a culture of technical excellence, operational rigor, and deep empathy for the user safety mission.
  • Architect scalable ML data pipelines: Design and oversee the development of distributed data processing systems capable of handling hundreds of millions of documents. Ensure these pipelines support both batch and real-time inference for content moderation and risk detection.
  • Build the "Trust" scores: Develop and maintain the foundational data layers - including semantic embeddings, metadata extracts, and behavioral signals - that power our Content Trust ML models.
  • Partner on AI/LLM Integration: Work closely with the Search & Discovery and Applied Research teams to integrate ML/LLM-based reasoning into our trust pipelines, enabling more nuanced understanding of complex policy violations.
  • Drive Operational Excellence: Establish SLAs for infrastructure, ensuring our automated enforcement systems are both fast and explainable.
  • Cross-functional Leadership: Collaborate with Product Managers (Content Trust), Legal/Policy teams, and Data Science to translate evolving regulatory requirements (like the DSA) into robust technical architectures.

Benefits

  • Healthcare Insurance Coverage (Medical/Dental/Vision): 100% paid for employees
  • 12 weeks paid parental leave
  • Short-term/long-term disability plans
  • 401k/RSP matching
  • Onboarding stipend for home office peripherals + accessories
  • Learning & Development allowance
  • Learning & Development programs
  • Quarterly stipend for Wellness, WiFi, etc.
  • Mental Health support & resources
  • Free subscription to the Scribd Inc. suite of products
  • Referral Bonuses
  • Book Benefit
  • Sabbaticals
  • Company-wide events
  • Team engagement budgets
  • Vacation & Personal Days
  • Paid Holidays (+ winter break)
  • Flexible Sick Time
  • Volunteer Day
  • Company-wide Employee Resource Groups and programs that foster an inclusive and diverse workplace.
  • Access to AI Tools: We provide free access to best-in-class AI tools, empowering you to boost productivity, streamline workflows, and accelerate bold innovation.