Deep Learning Engineer - Perception Algorithms

Apple • Sunnyvale, CA

About The Position

Do you have a passion for deep learning and computer vision problems? We are looking for someone who thrives on collaboration and wants to push the boundaries of what is possible today! Join our team of committed deep learning engineers in the Video Computer Vision group!

We are a centralized applied research and engineering organization responsible for developing real-time on-device Computer Vision, Machine Perception, and Generative technologies across Apple products. Our shipped technologies power features in ARKit, MeasureApp, RoomScan, Accessibility, and multiple VisionPro features. As a member of the Video Computer Vision group, you will develop new scene understanding technologies for Apple’s next-generation products.

Description

We are looking for a skilled Deep Learning Engineer for our team. In this role, you will perform research and development work to design algorithms for challenging real-world problems in the domain of scene understanding.

Requirements

BS in Computer Science or related field with a minimum of 3 years of relevant industry experience.
Experience in designing and training deep learning networks for image understanding tasks, e.g. image classification, object detection, semantic segmentation, panoptic segmentation, etc.
Experience in developing downstream perception algorithms with vision-language models, e.g. CLIP.
Strong coding skills in Python (with PyTorch) and C/C++.
Solid mathematical foundation of machine learning and deep learning techniques.

Nice To Haves

PhD with a focus on machine learning, computer vision, or robotics, or an MS with 3+ years of comparable industry experience.
Consistent track record of researching, inventing and/or shipping advanced machine learning algorithms.
Experience in language-guided image understanding tasks, e.g. open-vocabulary image classification, language-guided visual grounding, open-vocabulary semantic segmentation, etc.
Experience designing and training with pipelines that consume large (billion-scale) datasets to train efficient vision-language models for edge devices. This includes data curation for training vision-language models, writing efficient data loading pipelines, and using distributed GPU training frameworks.
Experience with advanced task-specific quality optimization techniques (few-shot learning, meta-learning, domain adaptation, knowledge distillation, fine-grained learning) for improving network performance and handling specific failure cases (long-tailed distributions, under-represented classes) in downstream tasks.
Experience in designing and optimizing networks for inference efficiency.
Strong coding skills in Objective-C.
Excellent communication and collaboration skills.
