This role focuses on designing and implementing scalable pipelines that ingest and transform data on Databricks, using PySpark notebooks, Spark SQL, and Python. The work includes developing ETL processes that extract, transform, and load data from diverse sources into a Databricks lakehouse architecture. A key part of the role is analyzing the existing integration landscape, which includes Teradata (TPT, BTEQ), Talend, and IBM Sterling. The Big Data Lead will define the ingestion and integration strategy for Databricks, ensure seamless data flow from source systems to the lakehouse, lead integration design, and oversee pipeline migration. Optimizing data processing workflows for performance and efficiency with Databricks capabilities is crucial. The role also requires ensuring data security and compliance with data privacy regulations, delivering high-quality data products, collaborating with stakeholders to understand requirements and design technical solutions on the Microsoft Azure stack, and producing comprehensive documentation.
Job Type: Full-time
Career Level: Senior
Education Level: No Education Listed