As a Research Engineer focused on Multi-Modal Understanding, you will develop advanced algorithms that integrate computer vision with other modalities such as language, audio, and sensor data. You will also drive the curation of multi-modal datasets and ground truth annotation pipelines to support model training and evaluation. You will work closely with our research team to bring innovative multi-modal solutions to production, bridging the gap between visual perception and holistic contextual understanding for immersive applications.
Stand Out From the Crowd
Upload your resume and get instant feedback on how well it matches this job.
Job Type
Full-time
Career Level
Mid Level