Data Engineer (MS Fabric)

Capgemini · New York, NY

About The Position

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Job Description

Experience: 8-12+ years (with deep hands-on delivery)

Key Responsibilities

Data Platform & Fabric Engineering

  • Design and implement end-to-end data pipelines using Microsoft Fabric (Data Factory, Data Engineering, Lakehouse).
  • Build and maintain Fabric Lakehouse architectures (Bronze/Silver/Gold) optimized for marketing and BI use cases.
  • Implement incremental loads, CDC patterns, and data freshness strategies for large-scale analytical datasets.
  • Optimize storage formats (Delta/Parquet), partitioning, and performance tuning in Fabric.

Data Engineering & Transformation

  • Develop robust data transformation logic using PySpark, Spark SQL, and SQL-based transformations.
  • Perform data cleansing, standardization, enrichment, and deduplication across multiple marketing and customer data sources.
  • Implement data quality checks, validation rules, and anomaly detection within pipelines.
  • Maintain reusable transformation frameworks and shared data assets.

Marketing & Business Data Integration

  • Ingest and model data from marketing and customer platforms such as digital analytics (web, app, events), campaign platforms (email, paid media, CRM, CDPs), and internal business systems (sales, finance, operations).
  • Create conformed dimensions and fact tables for marketing performance, attribution, funnel analysis, and customer insights.
  • Enable cross-channel reporting and identity-aware analytics.

Power BI & Semantic Modeling

  • Design and optimize Power BI semantic models (datasets) for enterprise reporting.
  • Build star schemas, calculation groups, and optimized DAX measures.
  • Ensure report performance, scalability, and refresh reliability.
  • Support self-service BI while enforcing enterprise data governance standards.
  • Collaborate with analysts and business users on dashboard requirements and usability.

Governance, Security & Operations

  • Implement workspace strategies, environment separation (Dev/Test/Prod), and deployment pipelines in Fabric.
  • Enforce data access controls, row-level security (RLS), and sensitivity labels.
  • Establish monitoring, logging, and alerting for pipeline health and data reliability.
  • Document data models, pipelines, and operational runbooks.
  • Participate in on-call or production support rotations as needed.

Collaboration & Leadership

  • Influence data architecture decisions and analytics best practices.
  • Work closely with product managers, marketing leaders, and BI teams to prioritize and deliver high-impact data products.
  • Contribute to standards for data modeling, naming conventions, and pipeline design.
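The incremental-load and deduplication duties above can be sketched in plain Python, independent of Fabric or Spark. This is a rough illustration only; the `customer_id`/`updated_at` field names and the integer timestamps are assumptions for the example, not details from the posting:

```python
def incremental_dedup_load(batch, last_watermark, key="customer_id", ts="updated_at"):
    """Keep only records newer than the last watermark, then deduplicate
    by business key, retaining the latest version of each record
    (the shape a CDC-style upsert typically expects)."""
    fresh = [r for r in batch if r[ts] > last_watermark]
    latest = {}
    for r in sorted(fresh, key=lambda r: r[ts]):
        latest[r[key]] = r  # a later timestamp overwrites an earlier one
    new_watermark = max((r[ts] for r in fresh), default=last_watermark)
    return list(latest.values()), new_watermark
```

In a real Fabric pipeline the same pattern would be expressed as a Spark/Delta merge keyed on the business key, with the watermark persisted between runs.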

Requirements

  • 8+ years of experience in data engineering / analytics engineering roles.
  • Strong hands-on expertise with Microsoft Fabric: Lakehouse, Data Factory, Data Engineering (Spark), Workspaces and deployment pipelines
  • Advanced SQL skills and experience with large analytical datasets.
  • Strong experience with Power BI: Semantic models, DAX, Performance optimization
  • Proficiency in PySpark or Spark SQL.
  • Deep understanding of: Data modeling (star/snowflake schemas), ETL / ELT design patterns, Incremental processing and CDC, Data quality and validation frameworks
  • Experience operating data platforms in production environments.
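As a rough illustration of the "data quality and validation frameworks" requirement above, a minimal rule-runner might look like the following plain-Python sketch (the rule names and record fields are hypothetical, not taken from the posting):

```python
def run_checks(records, rules):
    """Apply named validation rules to each record and collect failures
    as (rule_name, record) pairs for quarantine or alerting."""
    failures = []
    for record in records:
        for name, check in rules.items():
            if not check(record):
                failures.append((name, record))
    return failures

# Hypothetical rules for a marketing-events feed
rules = {
    "non_null_id": lambda r: r.get("id") is not None,
    "non_negative_amount": lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
}
```

Production frameworks add severity levels, sampling, and metrics emission on top of this core loop.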

Nice To Haves

  • Experience supporting marketing analytics, customer analytics, or growth analytics.
  • Comfortable working in agile, fast-moving environments with evolving requirements.
  • Experience with enterprise marketing stacks (CDPs, CRM, campaign tools).
  • Familiarity with data governance frameworks and privacy-aware data design.
  • Experience migrating from legacy BI platforms to Microsoft Fabric.
  • Azure ecosystem experience beyond Fabric (Synapse, ADLS, etc.).


Benefits

  • Paid time off based on employee grade (A-F), defined by policy: vacation (12-25 days, depending on grade), company-paid holidays, personal days, and sick leave
  • Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
  • Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
  • Life and disability insurance
  • Employee assistance programs
  • Other benefits as provided by local policy and eligibility