The Speech Team within the Siri organization drives major speech recognition, synthesis, and speech-to-speech model changes for features deeply embedded throughout Apple's ecosystem. Our mission is to build cutting-edge infrastructure, datasets, and models that empower Siri conversational AI, dictation, and other speech-enabled Apple Intelligence features with powerful capabilities across natural language understanding, dialog generation, speech recognition, and multi-modal interaction. We apply these technologies to create engaging, intelligent, and personalized conversational experiences for millions of Apple users.

We believe that the most impactful breakthroughs in deep learning emerge when we address real-world problems at scale. We develop speech-to-speech experiences and the underlying multimodal foundation model technology for current and future speech-enabled features across Apple's software, hardware, and services ecosystem. This allows for cutting-edge applied research anchored in Apple-specific production needs, while improving speech interaction experiences for Apple's customers around the world.

You will work alongside a fast-growing team of world-class engineers and scientists to tackle core problems in dialog systems and foundation models, ranging from natural language understanding and multi-turn context tracking to the integration of speech, text, and other modalities. You will develop and deploy novel deep learning technologies that make Siri more intelligent, natural, and useful. You'll help us advance the state of the art in natural language processing, speech and audio modeling, and multi-modal learning, with a strong focus on bringing your innovations into production. Your ideas will directly impact the daily lives of billions of users through Siri.
Job Type: Full-time
Career Level: Mid Level
Education Level: Ph.D. or professional degree