Azure Data Engineer - Pyspark

Capgemini
Bridgewater, NJ
Posted 2d ago
$110,841 - $145,000
Hybrid

About The Position

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Job Location - Newark, NJ (Day One Onsite - Hybrid - 3 days a week)

Requirements

  • Hands-on expertise in PySpark (DataFrames, Spark SQL, performance tuning) on Azure Databricks; this is the primary skill focus.
  • Strong SQL and data modeling for analytical workloads (star/snowflake, lakehouse patterns).
  • Proven delivery with Azure Data Factory/Synapse for pipeline orchestration and scheduling.
  • Solid knowledge of Azure storage (ADLS Gen2, partitions, file formats such as Parquet/Delta).
  • Version control and CI/CD with Git/Azure DevOps; automated testing in data pipelines.
  • Experience operating pipelines in production (monitoring, alerting, reliability).

Nice To Haves

  • Microsoft Fabric exposure (not mandatory).
  • Data governance tools (e.g., Purview), Power BI integration, Delta Live Tables.
  • Python packaging best practices; basic PowerShell for automation.
  • Domain experience in financial services/asset management.

Responsibilities

  • Design and implement batch and streaming data pipelines on Azure Databricks (PySpark); author Spark SQL for transformations and analytics.
  • Orchestrate workflows with Azure Data Factory and/or Synapse pipelines; integrate with Azure Data Lake Storage.
  • Model and maintain lakehouse structures (e.g., Delta Lake), ensuring robust partitioning, schema evolution, and performance.
  • Implement data quality checks, observability, and SLAs across pipelines.
  • Optimize jobs for cost and performance (cluster sizing, caching, shuffle reduction, partition strategy).
  • Collaborate with architecture/platform teams on CI/CD (Azure DevOps), secrets management (Key Vault), and security (RBAC, PIM).
  • Contribute to governance and metadata practices; document lineage and technical design.
  • Support release cycles, incident triage, and production hardening; drive continuous improvements.

Benefits

  • Flexible work
  • Healthcare including dental, vision, mental health, and well-being programs
  • Financial well-being programs such as 401(k) and Employee Share Ownership Plan
  • Paid time off and paid holidays
  • Paid parental leave
  • Family building benefits like adoption assistance, surrogacy, and cryopreservation
  • Social well-being benefits like subsidized back-up child/elder care and tutoring
  • Mentoring, coaching and learning programs
  • Employee Resource Groups
  • Disaster Relief
  • Vacation: 12-25 days, depending on grade; company-paid holidays, personal days, and sick leave
  • Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
  • Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
  • Life and disability insurance
  • Employee assistance programs
  • Other benefits as provided by local policy and eligibility

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000 employees
