Senior Data Engineer

Arctiq
New York, NY (Remote)

About The Position

We are looking for a Senior Data Engineer to lead the development of scalable data pipelines within the Databricks ecosystem. You will architect robust ETL/ELT processes using a "configuration-as-code" approach, ensuring our data lakehouse is governed, performant, and production-ready. This is a 3-month contract at 25 hours per week.

Requirements

  • Expertise: Mastery of PySpark and advanced SQL.
  • Platform: Extensive experience in the Databricks environment (Workflows, Delta Lake).
  • Cloud: Familiarity with AWS infrastructure and cloud-native data patterns.

Responsibilities

  • Pipeline Architecture: Design and implement declarative data pipelines using Lakeflow and Databricks Asset Bundles (DABs) to ensure seamless CI/CD.
  • Data Ingestion: Build efficient, scalable ingestion patterns using Auto Loader and Change Data Capture (CDC) to handle high-volume data streams (a minimal sketch follows this list).
  • Governance & Security: Manage metadata, lineage, and access control through Unity Catalog.
  • Orchestration: Develop and maintain complex workflows using Databricks Jobs and orchestration tools.
  • Infrastructure as Code: Use Terraform to manage AWS resources (S3, EC2) and Databricks workspaces.
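
For illustration, the ingestion pattern above might look roughly like the following PySpark sketch. It assumes a JSON landing zone on S3, an existing Unity Catalog Delta table named main.bronze.orders, and an order_id merge key; every path and name here is an illustrative placeholder, not an actual project resource.

    # A minimal sketch, assuming a JSON drop zone on S3 and an existing
    # Unity Catalog target table (main.bronze.orders) whose schema matches
    # the incoming files; all paths and names are illustrative placeholders.
    from delta.tables import DeltaTable
    from pyspark.sql import DataFrame, SparkSession

    spark = SparkSession.builder.getOrCreate()

    LANDING_PATH = "s3://example-bucket/landing/orders/"          # assumed path
    SCHEMA_PATH = "s3://example-bucket/_schemas/orders/"          # assumed path
    CHECKPOINT_PATH = "s3://example-bucket/_checkpoints/orders/"  # assumed path
    TARGET_TABLE = "main.bronze.orders"                           # assumed UC table


    def upsert_batch(microbatch: DataFrame, batch_id: int) -> None:
        """Apply CDC-style upserts from each micro-batch into the Delta target."""
        target = DeltaTable.forName(spark, TARGET_TABLE)
        (
            target.alias("t")
            .merge(microbatch.alias("s"), "t.order_id = s.order_id")  # assumed key
            .whenMatchedUpdateAll()
            .whenNotMatchedInsertAll()
            .execute()
        )


    # Auto Loader incrementally discovers new files; the inferred schema is
    # tracked at SCHEMA_PATH so evolution does not require manual DDL.
    stream = (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", SCHEMA_PATH)
        .load(LANDING_PATH)
    )

    (
        stream.writeStream
        .foreachBatch(upsert_batch)
        .option("checkpointLocation", CHECKPOINT_PATH)
        .trigger(availableNow=True)  # run as an incremental batch job
        .start()
    )

Pairing Auto Loader's incremental file discovery with a Delta MERGE inside foreachBatch is one common way to apply CDC-style upserts while the checkpoint keeps processing exactly-once.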