This role involves designing, developing, and maintaining scalable, efficient data pipelines using Python, PySpark, and Databricks. It requires integrating data from diverse sources while ensuring data quality, consistency, and reliability.

A key responsibility is formulating and implementing data architecture strategies, including data modeling, schema design, and data storage solutions, and optimizing data processing workflows for performance, scalability, and cost-efficiency. Collaboration with data scientists, analysts, and stakeholders is essential to understand data requirements and deliver tailored data solutions. The role also includes identifying and resolving data-related issues, supporting data infrastructure, and maintaining detailed documentation of all data pipelines, architecture, and processes.

Additionally, the position drives the design and implementation of data models that support business decision-making: generating insights from internal and external data assets, defining data requirements, and mining and validating large-scale structured and unstructured datasets using cloud-based tools. The role supports both standard and customized data analyses, develops robust mechanisms for data ingestion, analysis, validation, normalization, and cleaning, and upholds best practices in data engineering while contributing to advanced data analytics and visualization initiatives.
Job Type: Full-time
Career Level: Senior