About The Position

We are hiring a Senior Lead Data Engineer to build and scale the data foundations that power Paramount’s next-generation personalization systems across Home, Search/Browse, Notifications, and Artwork. This role sits at the core of the Content Engineering vertical, partnering closely with the Applied ML, ML Platform, and Causal Science teams to deliver highly reliable, ML-ready data at global scale. You will design and operate pipelines that process billions of daily events, petabyte-scale feature stores, and real-time engagement streams supporting ranking and recommendations. This is a high-impact role for an engineer who thrives on distributed systems, large-scale ETL and streaming, and delivering production-grade infrastructure for cutting-edge personalization.

Paramount is investing heavily in a unified personalization operating model. In this role, you will directly shape:

  • The Data Backbone: Building the core of our personalization ecosystem.
  • The User Experience: Defining the feature sets that determine what millions of users see.
  • Innovation Velocity: Enabling ML teams to innovate quickly and safely through high-quality experimentation data.

Requirements

  • 7+ years of experience in large-scale data or software engineering.
  • Hands-on Expertise: Deep experience with Spark (PySpark/Scala), Databricks, Airflow, and Kafka.
  • ML Data Modeling: Proficiency in feature pipelines, temporal joins, and mitigating training-serving skew.
  • Cloud Ecosystems: Experience with AWS/Azure/GCP and high-performance engines like Snowflake or Redshift.
  • Technical Foundations: Proficient programming skills in Python and SQL with a focus on performance optimization.
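To make the ML data-modeling requirement above concrete — temporal joins with point-in-time correctness, so training features never leak future information (a common cause of training-serving skew) — here is a minimal sketch in plain Python. Names and data are hypothetical; a production pipeline would express the same as-of join in Spark:

```python
from bisect import bisect_right

def point_in_time_join(label_events, feature_history):
    """For each (user, label_time, label) row, attach the latest feature
    value observed at or before label_time -- never a future value,
    which would leak information into training."""
    # Index feature history per user, sorted by timestamp.
    by_user = {}
    for user, ts, value in sorted(feature_history, key=lambda r: r[1]):
        times, values = by_user.setdefault(user, ([], []))
        times.append(ts)
        values.append(value)

    rows = []
    for user, label_ts, label in label_events:
        times, values = by_user.get(user, ([], []))
        i = bisect_right(times, label_ts)  # excludes features strictly after label_ts
        feature = values[i - 1] if i > 0 else None
        rows.append((user, label_ts, feature, label))
    return rows

# Hypothetical data: a watch-count feature updated over time.
features = [("u1", 10, 3), ("u1", 20, 7), ("u2", 5, 1)]
labels = [("u1", 15, 1), ("u1", 25, 0), ("u2", 4, 1)]
print(point_in_time_join(labels, features))
# -> [('u1', 15, 3, 1), ('u1', 25, 7, 0), ('u2', 4, None, 1)]
```

Note that the label at time 15 sees the feature value 3 (written at time 10), not 7 (written at time 20): the same cutoff the serving path would observe.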

Nice To Haves

  • Experience in personalization domains (search, ranking, or recommender systems).
  • Experience supporting petabyte-scale data lakehouses or feature stores.
  • Familiarity with GenAI/RAG systems, multimodal content, or Delta Live Tables.
  • Knowledge of Causal Inference, experimentation signals, or ML evaluation workflows.
  • Experience with Terraform for governed, repeatable deployments.

Responsibilities

  • Build & Operate Large-Scale Feature Pipelines: Design and maintain batch/streaming pipelines (Spark, Flink, Databricks, Airflow) producing ML features for ranking models.
  • Ensure Point-in-Time Correctness: Develop feature sets that enable unbiased offline training and credible online inference.
  • Develop Embedding & Content Pipelines: Build scalable workflows for metadata, imagery, and multimodal representations; partner with Science teams to operationalize new models.
  • Architect Data Foundations: Design Delta/Parquet data models and medallion layers, optimizing storage layout and partitioning for latency and cost.
  • Real-Time Engineering: Build Kafka-based systems for real-time features and user-activity aggregations, ensuring robust handling of out-of-order events and exactly-once semantics.
  • Governance & Leadership: Define data quality rules and schema evolution processes while collaborating across ML pods to translate model needs into infrastructure.
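As a rough illustration of the real-time engineering responsibility above — tolerating out-of-order events and counting each event exactly once despite at-least-once delivery — here is a minimal plain-Python sketch of a watermark-based windowed counter with idempotent deduplication. It is illustrative only; in production this logic would live in Kafka consumers with Flink or Spark Structured Streaming:

```python
class WindowedCounter:
    """Counts events per (window, user), tolerating out-of-order arrival
    up to `allowed_lateness` seconds and deduplicating by event id so
    redelivered events are counted exactly once."""

    def __init__(self, window_size=60, allowed_lateness=30):
        self.window_size = window_size
        self.allowed_lateness = allowed_lateness
        self.max_event_time = 0   # highest event time seen; drives the watermark
        self.seen_ids = set()     # idempotency: exactly-once counting
        self.counts = {}          # (window_start, user) -> count, open windows
        self.closed = {}          # finalized windows

    def watermark(self):
        return self.max_event_time - self.allowed_lateness

    def process(self, event_id, user, event_time):
        if event_id in self.seen_ids:
            return  # duplicate delivery -- already counted
        if event_time < self.watermark():
            return  # too late: its window has been finalized
        self.seen_ids.add(event_id)
        self.max_event_time = max(self.max_event_time, event_time)
        window = event_time - event_time % self.window_size
        self.counts[window, user] = self.counts.get((window, user), 0) + 1
        # Finalize any window that ends at or before the new watermark.
        for key in [k for k in self.counts
                    if k[0] + self.window_size <= self.watermark()]:
            self.closed[key] = self.counts.pop(key)

c = WindowedCounter()
c.process("e1", "u1", 10)   # window [0, 60)
c.process("e2", "u1", 70)   # window [60, 120); watermark advances to 40
c.process("e1", "u1", 10)   # duplicate id, ignored
c.process("e3", "u1", 45)   # out of order but within lateness, counted in [0, 60)
c.process("e4", "u1", 130)  # watermark 100 closes window [0, 60) with count 2
```

The design choice to check `seen_ids` before the watermark makes counting idempotent under redelivery, while the watermark bounds how long state for an open window must be retained.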

Benefits

  • Medical
  • Dental
  • Vision
  • 401(k) plan
  • Life insurance coverage
  • Disability benefits
  • Tuition assistance program
  • PTO
  • Bonus eligibility