Fabric Data Engineer — Workplace Engineering

The Vanguard Group
Wayne, PA
Hybrid

About The Position

Vanguard is establishing Microsoft Fabric as the enterprise data and analytics foundation for its Workplace AI, Power BI, and cross-cloud analytics initiatives. This role is part of a collaboration with Microsoft on a CDAO-led Fabric Enablement engagement, utilizing F256 Reserved capacity and integrating with Vanguard's existing data, identity, and security infrastructure. The position involves direct, hands-on data engineering within OneLake, focusing on building and maintaining scalable data products such as lakehouses, warehouses, pipelines, notebooks, and Delta tables ready for semantic models. The engineer will be responsible for the entire lifecycle, governance, and operational health of the Fabric platform, working closely with an AI Engineer to ensure data is governed, monitored, and prepared for AI applications. This is a strategic role focused on engineering and implementation, not a support position, and requires close collaboration with various internal teams and the Microsoft CDAO Fabric Enablement team.

Requirements

  • 8+ years of professional software / data / platform engineering experience.
  • 5+ years building production data solutions on the Microsoft and / or Azure data stack.
  • Hands-on production experience with at least three of: Microsoft Fabric (Lakehouse, Warehouse, Pipelines, Notebooks, Real-Time Intelligence), Azure Synapse, Azure Data Factory, Databricks, Power BI semantic models, Azure SQL / SQL Server.
  • Strong skills in SQL, PySpark, and KQL.
  • Comfort moving between batch, streaming, and interactive analytics workloads.
  • Demonstrable experience designing and shipping CI/CD for data platforms: Git workflows, automated deployment, environment promotion, secret-less authentication, and infrastructure-as-code.
  • Working knowledge of Terraform (preferred) or Bicep for cloud platform automation, including provider versioning, state management, and policy-as-code patterns.
  • Experience implementing security and compliance controls in a regulated environment: Purview, Sentinel, Defender, Conditional Access, MIP, DLP, RBAC, RLS / CLS / OLS, dynamic data masking.
  • Identity fluency with Entra ID (Azure AD) and federated IdPs (Okta preferred); experience with service principals, managed identities, and Workload Identity Federation.
  • Experience working in financial services, healthcare, or another heavily regulated environment, or a credible plan to come up to speed quickly.
  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.

Nice To Haves

  • DP-700 (Microsoft Certified: Fabric Data Engineer Associate) required or in-progress within 6 months of hire; DP-600 (Fabric Analytics Engineer Associate) and AZ-305 (Azure Solutions Architect Expert) preferred.
  • Hands-on experience with the Microsoft fabric-cicd Python library and the microsoft/fabric Terraform provider.
  • Experience operating a Fabric Center of Excellence, Power BI CoE, or comparable data-platform CoE.
  • Experience with cross-cloud data integration patterns (OneLake ↔ AWS S3, mirroring, shortcuts) and BCDR for analytics platforms at enterprise scale.
  • Experience configuring Prep for AI on semantic models and partnering with AI / agent engineers on certified data-product handoffs.
  • Background contributing to internal communities of practice, champions networks, or developer enablement programs.
  • Prior experience as a hands-on engineer in a Fusion Team (engineers + product + data + analysts) or Data / AI Center of Excellence model.
  • Additional vendor certifications are welcome but not required: AZ-204, SC-100, DP-203 (retired March 2025, but still relevant background).

Responsibilities

  • Design and implement scalable data storage in OneLake using Lakehouses (Delta) and Warehouses (T-SQL), configuring SQL analytics endpoints, shortcuts, and OneLake security.
  • Build and maintain Spark notebooks (PySpark), Data Factory pipelines, Dataflows Gen2, Copy Jobs, and mirroring for batch and incremental ingestion at enterprise scale.
  • Build Real-Time Intelligence solutions including Eventstreams, Eventhouses / KQL databases, Activator reflexes, and Spark structured streaming for low-latency workloads.
  • Optimize Lakehouse tables (e.g., OPTIMIZE, V-Order, Z-Order, partitioning) and Direct Lake semantic-model-ready datasets for predictable performance.
  • Implement source control, branching, and CI/CD using native Fabric Git integration (Azure DevOps and GitHub), Fabric Deployment Pipelines, and the Microsoft fabric-cicd Python library.
  • Automate Dev / Test / Prod promotion against the Fabric REST API using service principals and Workload Identity Federation, codifying environment-aware bindings via Variable Libraries and parameter.yml.
  • Operate a Feature → Dev → UAT → Prod branching pattern with mandatory PR review, cherry-pick promotion, and one repo per team.
  • Own the lifecycle of Fabric data components from creation through retirement, ensuring reproducible environments from the GitHub pipeline.
  • Operate the Fabric F256 capacity, monitoring CU consumption, managing smoothing windows, diagnosing throttling, and right-sizing workloads.
  • Build telemetry using the Monitoring Hub, per-workspace Workspace Monitoring, Eventhouse monitoring, and the Admin Monitoring Workspace to track failures and health.
  • Define dashboards and alerts for ingestion, transformation, refresh, and capacity health, driving root-cause analysis and feeding lessons back into platform standards.
  • Define and operate the on-call model for production data pipelines and Fabric items in partnership with Tier 3 Engineering.
  • Define and enforce Fabric platform standards through Terraform-based IaC, workspace templates, naming and tagging conventions, and automated CI policy checks.
  • Manage tenant settings, domains, and capacity allocation in partnership with the Fabric Center of Excellence, aligning identity with Entra ID and Okta federation.
  • Implement RBAC patterns separating control-plane and data-plane roles, and operate RLS, CLS, OLS, dynamic data masking, and item-level sharing.
  • Integrate Microsoft Purview for sensitivity labels, DLP, metadata scanning, lineage, and impact analysis, managing endorsement for trusted datasets.
  • Build cross-cloud integration patterns using shortcuts and Mirrored Databases (e.g., Direct Lake over AWS S3 data in OneLake).
  • Publish governed, AI-ready data products with Prep for AI configured on semantic models.
  • Coordinate with Data, Cloud, Identity, and Security domain teams on data-sharing patterns, private link configuration, and on-prem data gateway operations.
  • Serve as Tier 3 escalation for complex Fabric, OneLake, pipeline, capacity, and Direct Lake issues.
  • Provide deep technical consultation to teams onboarding workloads to Fabric.
  • Build reusable patterns, reference implementations, and internal playbooks for ingestion, modeling, deployment, and capacity operations.
  • Lead proof-of-concept work for new Fabric capabilities.
  • Partner with the Microsoft CDAO Fabric Enablement engagement to bring product roadmap insights back into Vanguard's implementation.
  • Contribute to the Workplace AI and enterprise Data roadmap and operating model, and partner with champions and train-the-trainer initiatives.

Benefits

  • Hybrid working model
  • Enhanced flexibility
  • In-person learning, collaboration, and connection
  • Mission-driven and highly collaborative culture