About The Position

Do you want to shape the platform that enables the next generation of intelligent experiences on Apple products and services? Apple’s Machine Learning Platform Technology & Infra team has built the platform that Apple uses to develop machine learning, artificial intelligence, and computer vision applications. Our team spans a variety of technical backgrounds, from machine learning PhDs to builders of large-scale production systems. In this role, you will optimize the end-to-end system performance of distributed machine learning workloads. This is a highly collaborative role, and you will work with key partners across the company.

Requirements

  • Experience working with large-scale parallel and distributed accelerator-based systems
  • Experience optimizing the performance of AI workloads at scale
  • Experience developing code in one or more training frameworks (such as PyTorch, TensorFlow, or JAX)
  • Strong communicator with ability to analyze complex and ambiguous problems
  • Programming and software design skills (proficiency in C/C++ and/or Python)
  • Experience working in a highly collaborative environment and promoting a teamwork mentality
  • Bachelor's degree in Computer Science and 7+ years of work experience

Nice To Haves

  • Deep understanding of computer systems and the interactions between HW and SW
  • Experience in performance analysis and optimization on cloud accelerators
  • Advanced degree in CS

Responsibilities

  • Engage with ML researchers to optimize end-to-end performance of large-scale distributed ML workloads
  • Analyze workload metrics to identify sources of inefficiency, and work with users to understand and optimize ML workloads
  • Conduct workload analysis by benchmarking key workloads on deployed systems
  • Improve the resiliency of large-scale training by optimizing applications and frameworks for better recovery from failures and preemptions
  • Influence the architecture, design, development, and operations of next-generation ML accelerator systems based on workload insights

What This Job Offers

  • Career Level: Senior
  • Education Level: Bachelor's degree
  • Number of Employees: 5,001-10,000 employees
