About The Position

The Apple Services Engineering team is a key part of Apple's integration of art and technology, powering major platforms like the App Store, Apple TV, Apple Music, Apple Podcasts, and Apple Books. This team operates at a massive scale, delivering entertainment in over 35 languages to more than 150 countries, with a strong commitment to Apple's privacy policy. Despite the scale, teams remain small, flexible, and multi-functional.

This role is within the Commerce & Growth Intelligence team at Apple Services Engineering, which focuses on cutting-edge innovations across the user journey: account creation, marketing, personalized offers, subscription ranking, churn modeling, and lifetime value optimization. The team tackles problems of unprecedented scale and complexity using advanced machine learning and AI, including large language models (LLMs), in a dynamic and collaborative environment.

In this position, you will be responsible for operationalizing machine learning models, which includes building real-time and batch inference pipelines and optimizing system performance, reliability, and experimentation velocity. You will bridge the gap between research and production by developing the infrastructure, tooling, and monitoring needed to deploy ML-driven features safely and efficiently. This is an opportunity for an engineer who enjoys scaling ML solutions, building production-grade services, and driving experimentation for billions of users worldwide.

Requirements

  • MS or PhD in Computer Science, Software Engineering, or a related field, or equivalent industry experience.
  • 2+ years of experience building production machine learning systems, ideally for personalization or recommendations.
  • Proficiency in object-oriented programming languages such as Java, Scala, or C++.
  • Experience building and maintaining large-scale distributed systems for ML workloads.
  • Deep understanding of ML model deployment pipelines, runtime optimization, and system integration.
  • Familiarity with A/B testing frameworks, experimental design, and online evaluation.
  • Experience with big data and stream processing frameworks like Spark, Flink, or Kafka.
  • Strong focus on system reliability, latency, and observability in production environments.

Nice To Haves

  • Experience in batch and real-time inference serving, including autoscaling and traffic management.
  • Background in content recommendation systems, search ranking, or user engagement optimization.
  • Experience with CI/CD workflows for ML systems, including safe model rollouts and shadow testing.
  • Exposure to containerized deployments and orchestration (Kubernetes, Docker).
  • Prior experience working on consumer-scale media products (apps, games, books, music, or video).

Responsibilities

  • Partner with ML researchers and product teams to transition models into production, ensuring reliability, scalability, and low latency.
  • Design and implement robust inference services using object-oriented languages (e.g., Java, Scala, C++) that operate at scale across Apple platforms.
  • Build and manage data pipelines and model execution frameworks to support both batch and streaming use cases.
  • Develop tooling and infrastructure for model deployment, versioning, rollback, and online evaluation.
  • Lead A/B testing efforts, including integration, metric tracking, experiment validation, and performance analysis.
  • Collaborate with infrastructure teams to improve observability, alerting, and model health monitoring.
  • Drive continuous improvement in latency, throughput, fault tolerance, and overall system reliability.