We are seeking a highly skilled and experienced Data Engineer to join our data engineering team. The ideal candidate will have deep expertise in building scalable data pipelines, optimizing big data workflows, and integrating Databricks with AWS services. You will play a key role in designing and implementing cloud-native data solutions that drive business insights and innovation.

Key Responsibilities:
- Write and tune complex ad-hoc Databricks SQL queries against large-scale lakehouse datasets, enabling rapid data exploration, trend identification, and actionable insights for business decision-making.
- Design and implement scalable data transformation workflows in Databricks notebooks, combining PySpark and SQL to cleanse, enrich, and prepare large datasets for downstream analytics and reporting (a sketch of this pattern follows this list).
- Design, develop, and maintain scalable data pipelines using Apache Spark on Databricks.
- Architect and implement ETL/ELT workflows leveraging AWS services such as S3, Glue, Lambda, Redshift, and EMR.
- Optimize Spark jobs for performance and cost-efficiency in a cloud environment.
- Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver robust solutions.
- Implement CI/CD pipelines for Databricks notebooks and jobs using tools like GitHub Actions, Azure DevOps, or Jenkins.
- Ensure data quality, security, and governance using tools like Unity Catalog, Delta Lake, and AWS Lake Formation.
- Monitor and troubleshoot production data pipelines and jobs.
- Mentor junior engineers and contribute to best practices and standards.
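For illustration, here is a minimal PySpark sketch of the kind of notebook transformation described above: reading raw data from S3, cleansing and enriching it, and writing a partitioned Delta table for downstream analytics. The bucket paths and the `orders` columns (`order_id`, `order_total`, `order_ts`) are hypothetical placeholders, and the snippet assumes a Databricks runtime (or a local Spark with the delta-spark package); it is a sketch of the pattern, not a prescribed implementation.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession named `spark` already exists; this builder
# call keeps the sketch runnable outside a notebook as well.
spark = SparkSession.builder.appName("orders-cleanse").getOrCreate()

# Hypothetical S3 locations -- substitute your own buckets and schemas.
RAW_PATH = "s3://example-raw-bucket/orders/"
CURATED_PATH = "s3://example-curated-bucket/orders_delta/"

# Read raw JSON landed in S3, then cleanse and enrich it.
raw = spark.read.json(RAW_PATH)

cleansed = (
    raw
    .dropDuplicates(["order_id"])                      # de-duplicate on the business key
    .filter(F.col("order_total").isNotNull())          # drop rows missing a required field
    .withColumn("order_date", F.to_date("order_ts"))   # derive a partition-friendly date
    .withColumn("ingested_at", F.current_timestamp())  # audit column
)

# Write the curated Delta table, partitioned by date for downstream queries.
(
    cleansed.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save(CURATED_PATH)
)
```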
Job Type: Full-time
Career Level: Mid Level
Number of Employees: 5,001-10,000