Senior Data Engineer

Qode
Mahwah, NJ (Hybrid)

About The Position

We are looking for a hands-on Senior Data Engineer to build and operate data pipelines for MDM, an enterprise master data foundation that produces trusted “golden records,” supports stewardship workflows, and publishes curated master data to downstream consumers. In this role, you will engineer scalable ELT patterns using Azure, Snowflake, dbt, and Python, with a strong focus on data quality, reliability, observability, and audit-ready delivery.

Requirements

  • 4–7 years of data engineering experience with strong hands-on delivery ownership
  • Strong expertise in Snowflake (modeling, performance tuning, cost control)
  • Strong expertise in dbt (models, tests, macros, documentation, CI)
  • Proficient in Python for pipeline utilities, validations, automation, and troubleshooting
  • Experience implementing data quality and production monitoring practices
  • Strong SQL (advanced joins, window functions, profiling, reconciliation)
  • Experience with GenAI

Nice To Haves

  • Experience with MDM / Master Data concepts (golden record, dedupe, survivorship, stewardship workflows)
  • Experience with Azure data ecosystem tools (ADF/Synapse/Functions/Key Vault/Monitoring)
  • Experience with event-driven publishing or change-data outputs for downstream systems
  • Exposure to regulated/audit-heavy delivery environments (traceability, approvals, evidence)

Responsibilities

  • Design and implement ingestion patterns from source systems into Snowflake (batch and incremental/CDC where applicable).
  • Build scalable landing/staging/curated layers with clear lineage and reprocessing strategies.
  • Implement orchestration patterns on Azure (scheduling, parameterization, retries, idempotency).
  • Develop and maintain dbt models for:
      o canonical master entities and relationships
      o standardization/enrichment transformations
      o exception/validation views and stewardship-ready outputs
  • Implement dbt best practices: modular models, tests, exposures, documentation, and CI checks.
  • Implement data quality rules and automated tests (dbt tests + custom checks in Python where needed).
  • Create exception datasets/metrics to support stewardship queues and remediation.
  • Build reconciliation routines (source vs curated counts, duplicate metrics, completeness/consistency trends).
  • Support match/merge workflows by producing standardized inputs (keys, standardized attributes, dedupe candidates).
  • Implement survivorship logic where defined (source priority, recency, completeness) or enable it through curated datasets.
  • Produce “publish” datasets and change detection outputs for downstream consumption.
  • Optimize Snowflake workloads (warehouse sizing, clustering strategies where relevant, query tuning, cost governance).
  • Build robust operational patterns: backfills, re-runs, error handling, SLAs, and data freshness checks.
  • Implement monitoring for pipeline health, data freshness, DQ failures, and exception volumes.
  • Create runbooks and operational playbooks for production support and hypercare.
  • Participate in go-live cutover, hypercare triage, and transition to BAU support.
  • Work closely with Data Architect, BSA/Data Analyst, Backend/UI teams, QA, and DevOps.
  • Participate in sprint planning, code reviews, and architecture/design reviews.
  • Maintain high-quality documentation and version control (Git-based workflow).