Lead Data Engineer

Create Music Group
CA$120,000 - CA$150,000

About The Position

The Lead Data Engineer will play a central role in the buildout of CMG's next-generation data platform. This is a high-ownership role on a small, senior team, working directly with the SVP of Data & AI to design and implement a scalable lakehouse architecture on Google Cloud Storage (GCS) and Databricks, spanning bronze, silver, and gold layers. The role emphasizes domain-driven design, data contracts, and proactive communication with both internal stakeholders and external vendors.

Requirements

  • 4+ years of data engineering experience, with at least 1–2 years focused on data platform or lakehouse architecture
  • Hands-on experience with Databricks — including Delta Lake, PySpark, and ideally Unity Catalog
  • Experience with GCS or equivalent cloud object storage as a lakehouse foundation layer
  • Hands-on experience with domain-driven design applied to data modeling
  • Strong command of SQL and at least one transformation framework (dbt preferred)
  • Experience with medallion or lakehouse architectures (bronze/silver/gold or equivalent)
  • Familiarity with GCP-native tooling (Pub/Sub, Dataflow, or Dataplex) is a plus
  • Excellent written communication — able to write design docs non-engineers can understand and status updates executives can act on
  • Demonstrated ability to work independently in ambiguous environments
  • Track record of flagging risks early with proposed solutions

Nice To Haves

  • Experience in music/media/entertainment data
  • Familiarity with data contracts or schema validation (Unity Catalog, Great Expectations, dbt tests)
  • Experience with external dev vendors

Responsibilities

  • Lead the technical design and implementation of CMG's Medallion 2.0 lakehouse architecture — bronze ingestion, silver transformation, and gold domain layers — built on GCS and Databricks (Delta Lake), with clear data contracts at each boundary
  • Design and manage data pipelines using Astro (Airflow), PySpark, and Delta Live Tables, ensuring reliability and scalability across ingestion and transformation layers
  • Govern the lakehouse using Databricks Unity Catalog — managing access controls, data lineage, and schema enforcement across domains
  • Apply domain-driven design principles to partition and model data domains (e.g., royalty, asset, artist, distribution)
  • Collaborate with the analytics team to ensure the gold layer reflects real business needs, so analysts rely less on workarounds
  • Coordinate with external vendors (e.g., DataArt) and internal stakeholders across DevOps, product, and analytics
  • Proactively identify architectural risks, data quality issues, and dependency blockers, and propose resolutions for each
  • Maintain clear, impact-first documentation and status updates for both technical and non-technical stakeholders
  • Other duties as assigned