Senior Data Engineer

Fracht USA • Houston, TX

About The Position

The Senior Data Engineer designs, builds, and owns the enterprise data platform that powers analytics and data-driven decision-making across the organization's global operations. This is a full-stack data-engineering role spanning the complete Microsoft Fabric medallion architecture (Bronze, Silver, Gold, and Warehouse), from source-system ingestion and Spark and PySpark transformation through dimensional modeling, the Power BI semantic layer, and Row-Level Security. The position requires deep, specialized technical expertise, the ability to work independently on highly complex projects with minimal supervision, sound architectural judgment on production systems, and a commitment to comprehensive documentation and governance. The role partners closely with business stakeholders, internal IT, and external delivery vendors, sets technical standards, and mentors other members of the analytics and engineering team.

Requirements

Bachelor's degree required in Computer Science, Data Engineering, Software Engineering, Information Systems, Data Analytics, or a closely related technical field.
Minimum of 5 years of progressive experience in data engineering, analytics engineering, or business intelligence; or a Master's degree in a related field plus a minimum of 3 years of relevant specialized experience.
Demonstrated experience building production data pipelines and dimensional data warehouses on cloud-based data platforms (Microsoft Fabric and/or Azure).
Proven experience developing complex Power BI semantic models and Row-Level Security in enterprise, multi-site environments.
Experience working with both cloud and on-premises source systems, and creating and maintaining technical documentation.
Advanced proficiency with Microsoft Fabric components: Workspaces, OneLake, Lakehouse, Data Warehouse, Data Pipelines, Notebooks, and Semantic Models.
Strong hands-on experience with Spark and PySpark for large-scale data transformation in notebooks.
Expert-level SQL and T-SQL for complex queries, stored procedures, optimization, and analysis.
Proficiency with dbt (data build tool) for SQL-based transformations, including model development, incremental materialization strategies, testing, documentation generation, and integration with version-controlled deployment workflows.
Advanced knowledge of DAX, including complex measures, time intelligence, and optimization; proficiency with Tabular Editor, ALM Toolkit, and DAX Studio.
Strong command of data-warehouse and dimensional-modeling concepts: medallion architecture; star schema; fact, dimension, and bridge design; slowly changing dimensions; surrogate keys; and incremental-load patterns.
Working knowledge of data-replication mechanisms (Change Data Capture and mirrored or current-state replication) and their analytical trade-offs.
Proficiency in Power Query (M) and Microsoft Excel for data preparation; familiarity with Dataflows Gen2 and/or Azure Data Factory.
Experience with version control (Git) and deployment and CI/CD practices for data and BI solutions, including environment promotion (Dev to UAT to Prod).
Familiarity with Azure and Microsoft Entra ID, workspace governance, and Fabric capacity and licensing concepts.
Excellent technical-writing skills for clear, comprehensive documentation, and strong analytical and troubleshooting ability on complex technical issues.
Exceptional attention to detail and a strong commitment to data accuracy.
Excellent written and verbal communication, with the ability to translate complex technical concepts for non-technical stakeholders and present to varied audiences.
Self-motivated, organized, and able to manage multiple priorities and deadlines, working independently with minimal supervision.
Collaborative team player with a positive, knowledge-sharing attitude and a strong service orientation.

Nice To Haves

Master's degree in Computer Science, Data Analytics, Business Analytics, Information Systems, or a related field is strongly preferred.
Relevant advanced certifications (Fabric Data Engineer Associate DP-700, Fabric Analytics Engineer Associate DP-600, Azure Data Engineer Associate DP-203) are a plus.
Experience supporting global or multi-site organizations preferred.
Experience with logistics, supply chain, or freight-forwarding systems (such as CargoWise) is a plus.

Responsibilities

Designs, builds, and maintains end-to-end data pipelines across the Microsoft Fabric medallion architecture (Bronze, Silver, Gold, and Warehouse) using Spark and PySpark notebooks orchestrated through Fabric Data Pipelines.
Develops and owns ingestion from multiple heterogeneous source systems, including transactional and TMS databases, Azure Cosmos DB document stores, and file-based feeds, applying appropriate replication strategies (Change Data Capture, mirrored or current-state replication, and watermark-based incremental loads).
Implements robust ELT and ETL patterns: incremental and full-load logic, idempotent MERGE operations, surrogate-key generation, deduplication, typed-NULL handling, and runtime parameterization of load identifiers and lakehouse targets.
Optimizes pipeline performance, compute and capacity usage, and cost, and ensures the platform scales reliably for a global user base.
Designs dimensional models (star schema, fact and dimension tables, bridge tables, and slowly changing dimensions) in the Fabric Data Warehouse to support analytical and reporting workloads.
Authors and maintains performant T-SQL stored procedures, views, and warehouse tables with explicit, well-documented contracts (explicit column lists, typed columns, and semantic ordering).
Defines data requirements and business logic for new data products and translates them into sound, maintainable warehouse structures.
Creates and optimizes Power BI semantic models over the warehouse, including complex DAX measures and calculated columns, time intelligence, and Direct Lake or Import configurations.
Designs, implements, and troubleshoots Row-Level Security (RLS) and user access controls across multiple organizational levels and tenants, including embedded scenarios.
Provides the engineering foundation for interactive dashboards and supports BI developers and analysts consuming the platform.
Performs comprehensive data reconciliation and validation, identifies and root-causes discrepancies and anomalies, and remediates defects across the pipeline.
Owns production incident response for data pipelines and semantic-model refreshes, diagnosing failures, restoring service, and implementing durable fixes.
Tests changes thoroughly before deployment to ensure accuracy, performance, and functionality across scenarios and environments.
Establishes and enforces deployment discipline across Dev, UAT, and Production, including source control (Git), branch protection, promotion rules, and release governance for internal and vendor contributors.
Creates and maintains comprehensive technical documentation covering data sources, lineage, relationships, DAX measures, business logic, runbooks, and known issues, accessible to both technical and non-technical audiences.
Configures monitoring, alerting, and validation routines on critical pipelines and contributes to the team's standards, patterns, and knowledge base.
Works directly with global business stakeholders to gather requirements and translate them into technical specifications and data products.
Coordinates with and provides technical oversight of external delivery vendors, and participates in and may lead sprint planning, grooming, and effort estimation.
Stays current with Microsoft Fabric, Power BI, and industry trends; evaluates and recommends new tools and approaches; builds reusable components and frameworks; and mentors junior team members.
Responds to stakeholder communications within the same business day.
Performs other related duties as assigned.