Research Scientist

Meta, Redmond, WA
$184,000 - $257,000

About The Position

Reality Labs at Meta is seeking a Research Scientist with expertise in multi-modal understanding to advance AI-powered interactions. We're building next-generation capabilities that integrate vision, language, audio, and sensor modalities. This is a unique opportunity to conduct cutting-edge multi-modal research with direct product impact.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Currently has, or is in the process of obtaining, a PhD in Computer Science, Machine Learning, Computer Vision, or a related technical field. Degree must be completed prior to joining Meta
  • Demonstrated expertise in multi-modal learning, including architecture design, training, and cross-modal alignment techniques
  • Programming experience in Python and hands-on experience with deep learning frameworks such as PyTorch
  • Experience developing machine learning models at scale from inception to impact
  • 5+ years of research experience working autonomously on ML problems involving multiple modalities (vision, language, audio, or sensor data)

Nice To Haves

  • Deep expertise in vision-language models, cross-modal attention mechanisms, or contrastive learning approaches
  • First-authored publications at peer-reviewed AI conferences (e.g., CVPR, NeurIPS, ICML, ICLR, ACL, ECCV)
  • Experience with on-device or edge multi-modal model optimization (quantization, sparsity, distillation)
  • Demonstrated software engineering experience via internship, work experience, or widely used contributions in open source repositories
  • Experience bringing multi-modal AI products from research to production
  • Proven track record of developing multi-modal models that fuse vision, language, and/or audio for real-world applications

Responsibilities

  • Lead the design, development, and optimization of multi-modal models that integrate vision, language, audio, and sensor inputs
  • Set technical direction for multi-modal research projects
  • Conduct research and experiments to improve cross-modal alignment and fusion strategies
  • Collaborate with cross-functional teams (engineering, HCI, product) to transition multi-modal research into production
  • Explore and adopt novel model optimization, quantization, and efficiency techniques
  • Stay current with state-of-the-art advances in multi-modal learning, vision-language models, and related fields

Benefits

  • Bonus
  • Equity
  • Benefits