Staff Data Engineer

Crunchyroll, LLC
San Francisco, CA · Hybrid

About The Position

We are hiring a Staff Data Engineer to play a crucial role in our mission to establish a world-class Data Engineering team within the Center for Data and Insights (CDI). You will be a key contributor, advancing our data engineering capabilities in the AWS and GCP ecosystems. Your responsibilities include collaborating with partners, guiding and mentoring fellow data engineers, and working hands-on across data architecture, data lake infrastructure, and data and ML job orchestration.

Your contributions will ensure the consistency and reliability of data and insights, supporting our objective of enabling well-informed decision-making. You will bring an empathetic, service-oriented approach, fostering a thriving data and insights culture while enhancing and safeguarding our data infrastructure. You will have a unique opportunity to build and strengthen our data engineering platforms at a global level. If you are an experienced professional with a passion for impactful data engineering initiatives and a commitment to driving transformative change, we encourage you to explore this role.

In this role, you will report to the Senior Director, Data Engineering. We are considering applicants located in Los Angeles or San Francisco.

Requirements

  • You have 12+ years of hands-on experience in data engineering and/or software development
  • You are highly skilled in Python, Spark, and SQL
  • You are comfortable using BI tools like Tableau, Looker, Preset
  • You are proficient in utilizing event data collection tools such as Snowplow, Segment, Google Tag Manager, Tealium, mParticle, and more
  • You have comprehensive expertise across the entire lifecycle of compute and orchestration tools like Databricks, Airflow, Talend, and others
  • You are experienced in working with streaming OLAP engines like Druid, ClickHouse, and similar technologies
  • You have experience using AWS services including EMR Spark, Redshift, Kinesis, Lambda, Glue, S3, and Athena, among others (exposure to GCP services like BigQuery, Google Storage, Looker, Google Analytics, a plus)
  • You understand how to build real-time data systems, as well as AI/ML personalization products
  • You have experience with Customer Data Platforms (CDPs) and Data Management Platforms (DMPs), contributing to holistic data strategies
  • You have a familiarity with high-security environments like HIPAA, PCI, or similar contexts, highlighting a commitment to data privacy and security
  • You are accomplished in managing large-scale datasets, handling terabytes of data and billions of records effectively
  • You hold a Bachelor's Degree in Computer Science, Information Systems, or a related field

Responsibilities

  • Be a subject-matter expert on critical datasets: field stakeholder questions, explain lineage/assumptions, and help partners interpret data accurately
  • Own projects end-to-end: drive scoping, design, implementation, launch, and iteration of data products from raw sources to analytics-ready datasets (requirements → modeling → pipelines → metrics/reporting enablement)
  • Architect and build scalable pipelines on Databricks using Spark + SQL (batch and/or streaming as needed), focusing on correctness, performance, and cost
  • Design strong data models (lakehouse/warehouse-style) that serve analytics and self-service use cases; define entities, grains, dimensions, metrics, and contracts
  • Integrate diverse data sources (internal systems + vendor platforms), manage schema evolution, and produce clean, well-documented curated datasets for the Analytics team
  • Establish data quality and reliability standards: testing, reconciliation, anomaly detection, SLAs/SLOs, monitoring/alerting, and incident response; continuously improve time-to-detect and time-to-recover so high-quality data is delivered reliably at scale (a minimal sketch follows this list)
  • Performance tune Spark + SQL: optimize joins, partitioning, file layout, clustering/z-ordering, caching strategy, and job configuration; benchmark and remove bottlenecks (see the tuning sketch after this list)
  • Partner with stakeholders (Product, Engineering, Growth, Finance, Analytics) to translate ambiguous questions into concrete data deliverables; communicate tradeoffs and drive alignment
  • Raise the engineering bar: set patterns for pipeline templates, CI/CD, code reviews, and operational playbooks; mentor other engineers via technical leadership and examples (even without direct reports)
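
To make the pipeline and quality-standard bullets above concrete, here is a minimal, hypothetical sketch of a curated-dataset job on Databricks using PySpark and Delta Lake; the table names, columns, and the 99% reconciliation threshold are illustrative assumptions, not Crunchyroll's actual schemas or standards.

    # Hypothetical curated-dataset job: read raw events, apply basic quality
    # checks, and write a partitioned Delta table for the Analytics team.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    raw = spark.read.table("raw.playback_events")             # hypothetical raw source

    curated = (
        raw.filter(F.col("event_ts").isNotNull())             # drop malformed rows
           .withColumn("event_date", F.to_date("event_ts"))   # derive partition column
           .dropDuplicates(["event_id"])                       # basic de-duplication
    )

    # Simple reconciliation check: fail the run if too many rows were dropped,
    # so the issue is detected before downstream reports refresh.
    raw_count, curated_count = raw.count(), curated.count()
    if curated_count < 0.99 * raw_count:
        raise ValueError(f"Quality check failed: kept {curated_count} of {raw_count} rows")

    (curated.write
        .format("delta")
        .mode("overwrite")
        .partitionBy("event_date")                             # date partitioning for pruning
        .saveAsTable("analytics.playback_events_curated"))     # hypothetical curated table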
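
Similarly, a rough sketch of the kind of Spark + SQL tuning the role involves, again with hypothetical table and column names: compacting and z-ordering a Delta table so frequent filters prune well, and broadcasting a small dimension to avoid a shuffle-heavy join.

    # Hypothetical tuning pass on Databricks.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Compact small files and cluster the data by a commonly filtered column.
    spark.sql("OPTIMIZE analytics.playback_events_curated ZORDER BY (user_id)")

    facts = spark.read.table("analytics.playback_events_curated")
    dims = spark.read.table("analytics.dim_users")             # hypothetical small dimension

    # Broadcast the small dimension table to skip a sort-merge join shuffle,
    # then trigger the job to benchmark against the un-hinted baseline.
    joined = facts.join(F.broadcast(dims), on="user_id", how="left")
    print(joined.count())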

Benefits

  • Competitive compensation package including salary plus performance bonus earning potential, paid annually.
  • Flexible time off policies allowing you to take the time you need to be your whole self.
  • Generous medical, dental, vision, STD, LTD, and life insurance
  • Health Savings Account (HSA) program
  • Health care and dependent care FSA
  • 401(k) plan, with employer match
  • Employer paid commuter benefit
  • Support program for new parents
  • Pet insurance and some of our offices are pet friendly!