Data Engineer

Apple, Cupertino, CA

About The Position

Apple's Media, Graphics, and Compute Technologies Group (MGC) is looking for a talented and dedicated big data engineer to join our Data Engineering team. The Data Engineering team within the MGC organization plays a critical role in supporting data-driven analytics by providing data collection, warehousing, and analytics at big data scale. Our team provides the infrastructure to power numerous trend and operational dashboards as well as other ad-hoc use cases in support of services like Apple TV, Apple Music, and FaceTime. We are leveraging Generative AI and Machine Learning technologies to provide best-in-class data analytics and monitoring.

This role offers the opportunity to help design, enhance, and develop our very-high-volume processing pipeline. You'll work with talented engineers within our team as well as cross-functional teams in an agile and dynamic environment that values engineering excellence, creativity, and innovation, and you will be a key contributor to our next generation of processing pipeline and data analytics platform.

Our team leverages modern Data Engineering, Generative AI, and Machine Learning technologies to deliver actionable insights.

Requirements

  • Bachelor's degree in Computer Science or equivalent professional experience
  • Experience building large-scale distributed systems in Java, Python, or similar languages
  • Proficiency in SQL
  • Experience with data warehouse architectures and dimensional modeling
  • Demonstrated ability to conduct performance analysis and troubleshoot large-scale distributed systems
  • Strong collaboration skills, with the ability to understand complex architectures and work effectively across teams
  • Hands-on experience with Docker and Kubernetes

Nice To Haves

  • Production experience with Apache Kafka, Spark, or Flink
  • Working knowledge of Trino or similar distributed query engines
  • Experience building multi-agent AI systems or agentic workflows
  • Familiarity with Retrieval Augmented Generation (RAG) techniques working in conjunction with LLMs
  • Experience with creating and consuming Model Context Protocol (MCP) services

Responsibilities

  • Collaborating with data scientists across functional teams to define and enhance performance metrics that provide valuable insights for stakeholders
  • Building and maintaining ingestion pipelines for real-time data processing
  • Building and maintaining real-time applications driving operational monitoring
  • Building and maintaining batch ETL/ELT applications populating our data warehouse
  • Applying Generative AI and Retrieval Augmented Generation (RAG) techniques to enhance data analytics capabilities
  • Applying Machine Learning technologies for anomaly detection