Design, build, and maintain data pipelines that extract data from sources such as databases (PostgreSQL, Cassandra, Iceberg, and Hadoop), APIs, data lakes, cloud storage, and log files, collecting and consolidating data into a central data warehouse for reporting, analytics, and business intelligence. Understand data sources, configure data extraction processes, manage data ingestion using PySpark or Python, and automate pipelines using Airflow to power data sources for analytics platforms like Tableau. Collaborate with machine learning engineers, data scientists, analysts, software engineers, and managers to understand their data requirements and deliver reliable, distributed data pipelines that feed data analytics and data visualization platforms, allowing Apple's stakeholders to easily leverage data in a self-service manner. Perform data transformation tasks, including data cleaning, normalization, aggregation, and enrichment, to prepare data for analytics and reporting pipelines. Use SQL, scripting languages (Python), and ETL (Extract, Transform, Load) tools to manipulate and prepare data for predictive, statistical, and trend analysis. Develop new and creative methodologies, such as self-optimizing data pipelines and a unified data pipeline that integrates and harmonizes data streams from multiple sources in real time, to evaluate test coverage and test pass rate and continually improve Siri by notifying and delivering feedback to engineering partners. Optimize existing data pipelines and database queries to improve performance and minimize latency of Tableau dashboards. Identify and resolve bottlenecks, optimize data transformation processes, and implement indexing strategies to improve data retrieval performance in databases.
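As an illustrative sketch (not Apple's actual code) of the transformation step described above, the following minimal Python example performs the three core operations named in the posting: cleaning (dropping incomplete records), normalization (canonicalizing a key field), and aggregation (summing per group). Field names and sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw records from a source extract; fields are illustrative.
raw_rows = [
    {"region": " West ", "revenue": "100.0"},
    {"region": "west",   "revenue": "50.5"},
    {"region": "East",   "revenue": None},   # dirty row: missing value
    {"region": "EAST",   "revenue": "200.0"},
]

def clean_and_aggregate(rows):
    """Drop incomplete rows, normalize the region key, and sum revenue per region."""
    totals = defaultdict(float)
    for row in rows:
        if row["revenue"] is None:               # cleaning: skip incomplete records
            continue
        region = row["region"].strip().lower()   # normalization: canonical key
        totals[region] += float(row["revenue"])  # aggregation: per-group sum
    return dict(totals)

print(clean_and_aggregate(raw_rows))  # {'west': 150.5, 'east': 200.0}
```

In a production pipeline this logic would typically live in a PySpark job or SQL transformation scheduled by Airflow, but the shape of the work is the same.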
Job Type
Full-time
Career Level
Mid Level
Number of Employees
5,001-10,000 employees