Apple’s Platform Acceleration & Compute Efficiency (PACE) is a high-leverage team operating at the critical intersection of our ML organizations, underlying compute infrastructure, and core platform tooling. Our mission is to empower Apple’s software engineering teams with efficient, scalable compute. By driving out operational friction and optimizing the broader machine learning ecosystem, we directly accelerate the pace of development across the company.

As foundation models become increasingly central to Apple’s user experiences, maximizing the efficiency of our ML compute is paramount. In this role, you will focus relentlessly on compute efficiency, ensuring that Apple’s models run as fast, reliably, and cost-effectively as possible. You will tackle massive optimization challenges, from maximizing hardware utilization across GPUs, TPUs, and custom Apple Silicon, to shaping workload scheduling and capacity allocation for large model serving.

We are seeking a Senior Architect with deep expertise in ML infrastructure to act as a linchpin for Apple’s foundational inference strategy. You will be instrumental in defining, establishing, and monitoring compute efficiency metrics across the software engineering organization. By partnering closely with model developers and infrastructure providers, your work will directly reduce serving costs, shape core engineering decisions, and enable the highly efficient, scalable inference required to power Apple Intelligence for hundreds of millions of users.
Job Type
Full-time
Career Level
Senior
Education Level
Ph.D. or professional degree
Number of Employees
5,001-10,000 employees