Multimodal AI PhD Intern (Spring 2026)

Reality Defender
New York, NY (Remote)

About The Position

Reality Defender is an award-winning cybersecurity company helping enterprises and governments detect deepfakes and AI-generated media. Utilizing a patented multi-model approach, Reality Defender is robust against the bleeding edge of generative platforms producing video, audio, imagery, and text. Its API-first deepfake detection platform empowers teams and developers alike to identify fraud, disinformation campaigns, and harmful deepfakes in real time. Backed by world-class investors including DCVC, Illuminate Financial, Y Combinator, Booz Allen Hamilton, IBM, Accenture, Rackhouse, and Argon VC, Reality Defender works with leading enterprise clients, financial institutions, and governments to ensure AI-generated media is not used for malicious purposes.

This 4-month internship is designed for current PhD students and candidates to partner with Reality Defender's AI team, conduct cutting-edge research, and publish peer-reviewed papers. Your primary collaborators will be Surya Koppisetti and Yi Zhu, who will guide and advise your work in multi-modal deepfake detection. The internship can be performed remotely, although you're welcome to work from our HQ in New York City.

Requirements

  • PhD student in a relevant technical field, preferably three or more years into the program.
  • Experience in multi-modal learning, such as audio-visual classification and audio-language reasoning.
  • Proficient in Python and in building deep learning models with PyTorch.
  • Published peer-reviewed research papers in reputable AI and speech venues, e.g., CVPR, NeurIPS, ACL, or Interspeech.
  • Excited about Reality Defender's mission to build a best-in-class and comprehensive deepfake and AI-generated content detection platform.
  • Available to start in Spring 2026, for a minimum duration of 4 months.

Responsibilities

  • Investigate and propose new methods for detecting generative multi-modal content, spanning audio and vision modalities.
  • Perform research on multi-modal deepfake detection and reasoning tasks.
  • Collaborate with researchers on the team.
  • Write up results of research for internal reports and submission to academic journals/workshops.
  • Independently implement and evaluate ideas on a modern deep learning stack: Python, PyTorch, and GPU-enabled cloud compute (e.g., AWS/GCP).