The position involves designing and building scalable data pipelines using technologies such as PySpark, SQL, and Hadoop. Key responsibilities include:

- Developing and implementing data quality rules, validation checks, and monitoring dashboards to ensure data integrity.
- Collaborating with data architects, analysts, and quality engineering (QE) engineers to maintain end-to-end data integrity.
- Establishing coding standards, reusable components, and version control practices for data engineering workflows.
- Optimizing the performance of ETL/ELT processes and troubleshooting data issues in production environments.
- Supporting regulatory compliance and data governance by integrating data lineage, metadata, and audit capabilities.
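To illustrate the kind of data-quality validation work described above, here is a minimal sketch in plain Python (used instead of PySpark so it is self-contained); the rule names and sample records are hypothetical, not taken from the posting:

```python
# Minimal data-quality rule sketch. Rules and records are hypothetical
# examples; in practice these checks would run over PySpark DataFrames.

RULES = [
    ("order_id_not_null", lambda r: r.get("order_id") is not None),
    ("amount_non_negative", lambda r: r.get("amount", 0) >= 0),
    ("currency_known", lambda r: r.get("currency") in {"USD", "EUR", "GBP"}),
]

def validate(records):
    """Return a list of (record_index, failed_rule_name) violations."""
    violations = []
    for i, record in enumerate(records):
        for name, check in RULES:
            if not check(record):
                violations.append((i, name))
    return violations

sample = [
    {"order_id": 1, "amount": 10.0, "currency": "USD"},
    {"order_id": None, "amount": -5.0, "currency": "JPY"},
]
print(validate(sample))
# → [(1, 'order_id_not_null'), (1, 'amount_non_negative'), (1, 'currency_known')]
```

Violations collected this way can feed the monitoring dashboards the role describes, e.g. by aggregating failure counts per rule.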
Number of Employees
5,001-10,000 employees