Data Engineer - Azure Databricks

Capgemini
Dallas, TX

About The Position

We are seeking a highly skilled Senior Data Engineer to design, build, and optimize scalable data solutions across our cloud and analytics ecosystem. The ideal candidate will have deep expertise in Microsoft Azure, Databricks, Python, SQL, Azure Data Factory (ADF), and Snowflake, with a strong ability to architect robust data pipelines that support analytics, reporting, and machine learning initiatives. Your role is detailed under Responsibilities below.

Requirements

  • 8+ years of experience as a Data Engineer or similar role.
  • Strong, hands-on expertise with:
      • Azure Cloud Services (Azure Data Lake, ADF, Azure Functions, Event Hub, etc.)
      • Databricks (Spark, Delta Lake, job orchestration)
      • Python for data engineering and ETL/ELT development
      • SQL (advanced query writing, performance tuning)
      • Snowflake (data loading, optimization, Snowpipe, warehouses)
  • Solid understanding of distributed processing, data modeling techniques, and modern data architecture principles.
  • Experience working in CI/CD, Git-based development workflows, and DevOps practices.
  • Ability to design highly scalable, resilient, and maintainable data systems.

Nice To Haves

  • Professional-level certification in Azure, Databricks, or Snowflake.
  • Experience with Delta Live Tables, Databricks Workflows, or MLflow.
  • Knowledge of streaming data pipelines (Kafka, Event Hub, Spark Structured Streaming).
  • Strong problem-solving skills and the ability to work in dynamic, fast-paced environments.

Responsibilities

  • Design, develop, and maintain end-to-end data pipelines using Azure services, Databricks, and Snowflake.
  • Build scalable and reliable ETL/ELT workflows leveraging ADF and orchestrating data movement across multiple systems.
  • Develop high-performance data processing solutions using Python and SQL.
  • Optimize and tune Databricks notebooks, Delta Lake tables, and distributed data processing workloads.
  • Implement data quality checks, validation frameworks, and automated monitoring solutions.
  • Collaborate closely with data architects, analysts, and business stakeholders to translate requirements into technical solutions.
  • Manage and optimize data models, schemas, and Snowflake environments for performance and cost efficiency.
  • Ensure best practices in data governance, privacy, and security across the entire data platform.
  • Participate in code reviews, peer technical discussions, and continuous improvement of team standards.

Benefits

  • Paid time off based on employee grade (A-F), as defined by policy: vacation (12-25 days, depending on grade), company-paid holidays, personal days, and sick leave
  • Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
  • Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
  • Life and disability insurance
  • Employee assistance programs
  • Other benefits as provided by local policy and eligibility