Senior AI Data Engineer

Scribd • San Francisco, CA

Posted 49 days ago • Hybrid

About The Position

Scribd, Inc. is looking for a Senior AI Data Engineer to lead AI engineering workstreams on the Data Platform team. This role involves building data infrastructure for AI use cases, supporting stakeholders in building data products with AI, and accelerating the team's development through AI tooling. The Data Platform team is responsible for the data infrastructure, governance, and enablement that powers the company, focusing on making trusted data accessible and driving AI adoption.

Requirements

5+ years of data engineering experience, with at least 1 year focused on AI/ML infrastructure or LLM-powered applications.
Strong proficiency in Python and SQL; comfort working across the full data stack from ingestion to serving.
Hands-on experience with Databricks and cloud data platforms (Unity Catalog experience is a strong plus).
Experience building or integrating NLP/LLM-based systems (RAG pipelines, semantic search, agent frameworks, or natural language interfaces).
Working knowledge of how modern LLMs are trained, aligned, and evaluated (RLHF, fine-tuning, prompt engineering, retrieval patterns) and the judgment to know when each approach is the right tool.
A solid understanding of data governance, access control, and building on top of trusted data.
A security-first mindset when building AI surfaces, including secret management, encryption, and responsible handling of sensitive and PII data.
The ability to work autonomously on ambiguous problems and drive them to production.
Strong communication skills; ability to explain complex systems to technical and non-technical audiences.

Nice To Haves

Familiarity with metrics-as-code frameworks or semantic layer tooling (e.g., Statsig, dbt Semantic Layer).
Prior experience in a platform or infrastructure team serving internal stakeholders.
Experience evaluating AI model outputs, including building eval harnesses, defining quality metrics, and catching regressions before they reach production.
Working knowledge of Terraform, or strong systems thinking to reason confidently about infrastructure changes without necessarily owning the code.

Responsibilities

Own the deployment path for Databricks Apps, creating infrastructure and guardrails for safe and consistent production deployment by non-technical users.
Build the AI layer on top of Scribd's Medallion Architecture and Semantic Layer, connecting AI agents to governed data and enabling self-service answers.
Build AI skills and agents on top of existing declarative tooling to help platform stakeholders ship pipelines faster.
Partner with teams to identify AI tools, frameworks, and agentic patterns that accelerate data product development and AI adoption.
Identify AI tools and embed them into the Data Platform team's engineering workflow to support AI-assisted development.
Establish guardrails to ensure AI-generated code, queries, and pipelines are correct, consistent, and production-ready.
Help define and evolve data modeling and metadata patterns for AI use cases (e.g., context, documentation, discoverability).
Mentor other engineers and help define AI data engineering standards at Scribd.

Benefits

Scribd Flex (flexible work model)
Comprehensive health, dental, and vision coverage
Mental health support and disability coverage
Generous paid time off, including vacation, sick time, holidays, winter break, volunteer time, and sabbaticals
Paid parental leave and family support benefits
Retirement matching and employee equity
Learning and development programs and professional growth opportunities
Wellness and home office stipends
Complimentary access to the Scribd, Inc. suite of products
Enterprise access to leading AI tools