ML Engineer, Apple Foundation Models

Apple • Cupertino, CA

About The Position

Join the team shaping the data foundation and intelligence for Apple's frontier foundation models. We believe that breakthrough AI capabilities are driven not only by model architecture and scale, but also by the quality, diversity, and intelligence of the data used to train them. As part of Apple's Foundation Models team, you will help define how next-generation foundation models learn, reason, plan, and interact with the world, powering intelligent experiences used by billions of people. This is a rare opportunity to work at the intersection of cutting-edge AI research, large-scale training and data systems, and impactful consumer products.

In this role, you will develop the data strategies, pipelines, and methodologies that drive model capability across the full training lifecycle, including pre-training, mid-training, and post-training. You will work closely with researchers, engineers, and product teams to identify capability gaps, design data-centric solutions, and create high-quality training signals for reasoning, agentic behavior, multimodal understanding, tool use, and alignment. Your work may span large-scale data curation, synthetic data generation, data recipe development, model ablation, benchmark-driven optimization, reward modeling, evaluation systems, and data flywheels that continuously improve model performance. Every dataset, evaluation, and insight you contribute will directly influence the capabilities of the foundation models powering Apple's next generation of intelligent experiences.

Requirements

Demonstrated expertise in LLMs or multimodal LLMs, with a publication record at relevant conferences (e.g., NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, KDD, ACL, ICASSP, Interspeech) or a track record of applying deep learning techniques to products.
Strong programming skills in Python.
Proficiency in a deep learning toolkit such as JAX, PyTorch, or TensorFlow.
Ability to work in a collaborative environment.
Ph.D. in Computer Science, Machine Learning, Artificial Intelligence, or a related technical field, or equivalent practical experience.

Nice To Haves

Experience developing data-centric solutions for foundation models, especially large-scale data flywheels.
Experience improving foundation models using user interaction data, private data, or other real-world feedback signals while maintaining strong privacy and data governance standards.
Experience building agentic systems, tool-use capabilities, and reasoning models.
Experience with model self-improvement techniques.
Experience developing or improving multimodal foundation models across text, vision, audio, and video.

Responsibilities

Develop data strategies, pipelines, and methodologies that drive model capability across the full training lifecycle (pre-training, mid-training, and post-training).
Work closely with researchers, engineers, and product teams to identify capability gaps.
Design data-centric solutions.
Create high-quality training signals for reasoning, agentic behavior, multimodal understanding, tool use, and alignment.
Engage in large-scale data curation.
Perform synthetic data generation.
Develop data recipes.
Conduct model ablation studies.
Optimize models using benchmark-driven approaches.
Develop reward models.
Build evaluation systems.
Create data flywheels that continuously improve model performance.