Senior Data Engineer

Capgemini, Vancouver, BC
$76,200 - $176,590

About The Position

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Job Summary

We are seeking an experienced Data Engineer with strong expertise in Databricks, Apache Airflow, Python, and PySpark to design, build, and maintain scalable, high-performance data solutions. The ideal candidate will be responsible for developing efficient data pipelines, orchestrating workflows, and ensuring the reliability and quality of data systems that support analytics and business operations.

Requirements

  • Strong hands-on experience in Python programming
  • Expertise in PySpark and Apache Spark for large-scale data processing
  • Experience with Apache Airflow for workflow scheduling and orchestration
  • Practical experience with the Databricks platform (notebooks, jobs, clusters, Delta Lake)
  • Solid understanding of ETL/ELT concepts and data pipeline architecture
  • Proficiency in SQL and working with relational and non-relational databases
  • Experience working with cloud platforms (Azure preferred; AWS/GCP acceptable)
  • Strong debugging and root cause analysis skills
  • Familiarity with data modeling concepts
  • Understanding of data security and governance best practices

Nice To Haves

  • Experience with Azure Data Factory, Azure Data Lake, or similar services
  • Knowledge of CI/CD pipelines and DevOps practices
  • Exposure to streaming technologies (Kafka, Spark Streaming, etc.)
  • Experience with version control tools (Git)
  • Familiarity with monitoring tools and logging frameworks

Responsibilities

  • Design, develop, and maintain scalable ETL/ELT data pipelines using Python and PySpark
  • Build and manage workflow orchestration using Apache Airflow
  • Develop and optimize data processing solutions on Databricks, leveraging Spark and Delta Lake
  • Perform data ingestion from multiple sources (databases, APIs, files, cloud systems)
  • Implement data transformations, cleansing, and aggregation to support downstream analytics
  • Monitor, troubleshoot, and resolve job failures and performance issues
  • Optimize jobs for performance, scalability, and cost efficiency
  • Ensure data quality, consistency, and governance across all pipelines
  • Collaborate with data analysts, data scientists, and business stakeholders to deliver data solutions
  • Maintain proper documentation for workflows, pipelines, and data models
  • Support deployment processes and ensure smooth release of data solutions

Benefits

  • Vacation: 12-25 days, depending on grade
  • Company-paid holidays
  • Personal days
  • Sick leave
  • Medical, dental, and vision coverage
  • Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
  • Life and disability insurance
  • Employee assistance programs