Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
Job Description
• Contribute to the design and implementation of scalable data architectures, such as a Lakehouse, using Delta Lake and Unity Catalog.
• Manage and maintain the underlying data infrastructure, which typically runs on a major cloud platform such as AWS, Azure, or GCP.
• Implement data governance practices, including data lineage, metadata management, and access controls, integrating with third-party products such as Immuta and Protegrity (for tokenization).
• Adhere to software engineering best practices, including participating in code reviews and CI/CD (Continuous Integration/Continuous Deployment) automation.
• Stay up to date with the latest trends and technologies in data engineering and the Databricks ecosystem.
• Design and build robust ETL (Extract, Transform, Load) and ELT workflows to ingest, transform, and load structured and unstructured data from various sources.
• Utilize Databricks features like Delta Live Tables and Databricks Workflows to orchestrate and manage complex data processes.
• Optimize and tune Apache Spark jobs for performance and cost efficiency on large datasets.
• Implement and enforce data security, access control, and compliance policies in Databricks and Azure.
• Hands-on experience with streaming technologies (Kafka, Event Hubs, Kinesis).
• Experience architecting machine learning platforms and advanced analytics workloads, and developing and deploying MLOps on Databricks.
• Expertise with enterprise security models, networking, and cost governance.
• Experience working with DevOps teams to establish deployment practices using Terraform or similar.
• Optimize Databricks performance (auto-scaling, caching, Delta optimization, job tuning, cost optimization).
• Prior experience leading enterprise Azure Databricks implementations.
Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed
Number of Employees
5,001-10,000 employees