Multimodal AI Researcher, Audio

Dolby Laboratories, Inc. • Atlanta, GA

104d • $130,700 - $163,000

About The Position

Join the leader in entertainment innovation and help us design the future. At Dolby, science meets art, and high tech means more than computer code. As a member of the Dolby team, you’ll see and hear the results of your work everywhere, from movie theaters to smartphones. We continue to revolutionize how people create, deliver, and enjoy entertainment worldwide. To do that, we need the absolute best talent. We’re big enough to give you all the resources you need, and small enough so you can make a real difference and earn recognition for your work. We offer a collegial culture, challenging projects, and excellent compensation and benefits, not to mention a Flex Work approach that is truly flexible to support where, when, and how you do your best work.

The Advanced Technology Group (ATG) is the research division of the company. ATG’s mission is to look ahead, deliver insights, and innovate technological solutions that will fuel Dolby’s continued growth. Our researchers have a broad range of expertise related to computer science and electrical engineering, such as AI/ML, algorithms, digital signal processing, audio engineering, image processing, computer vision, data science & analytics, distributed systems, cloud, edge & mobile computing, computer networking, and IoT.

Dolby is looking for a talented Multimodal AI Researcher, Audio to join Dolby’s research efforts and drive innovation in multimodal AI for audio applications, multimodal representations, and generative modeling for audio, speech, and music. You will join the Machine Reasoning and Perception team, a group of top-tier researchers working on challenging problems in multimodal AI for entertainment applications. You will focus on the creation and implementation of multimodal and audio AI technologies, from the underlying theoretical concepts to the development of prototypes and demonstrations, with the goal of creating new experiences. You will drive key innovations for Dolby’s core business that allow Dolby and its customers to build products that push the boundaries of sound and multimedia experiences.

Requirements

Ph.D. in Computer Science or similar field.
A strong background in deep learning, with both conceptual understanding and practical experience.
Technical knowledge of audio fundamentals.
Deep passion for audio, music, and multimedia applications.
Deep knowledge of current machine learning literature.
Strong publication record; publications in major machine learning conferences (e.g., NeurIPS, ICLR, ICML) or top domain-specific conferences (e.g., ACL, CVPR, ICASSP, Interspeech) are desirable.
Highly skilled in Python and one or more popular deep learning frameworks (TensorFlow or PyTorch).
Ability to envision new technologies and turn them into innovative products.
Good communication and collaboration skills.

Nice To Haves

Experience with generative modeling for audio applications (diffusion models, autoregressive models, masked generative transformers).
Experience in multimodal semantic understanding and multimodal reasoning.
Experience with multimodal representations (audio-video, audio-text, audio-video-text).
Experience with multimodal AI architectures, with a focus on generating audio, music, and speech (text-to-audio, video-to-audio, image-to-audio).
Experience with self-supervised and semi-supervised learning.
Experience in AI-driven audio enhancement, processing, and generation (for speech and music), such as speech enhancement and analysis, source separation, text-to-speech, text-to-music, music information retrieval, and audio classification.
Experience with LLMs for audio applications.

Responsibilities

Partner closely with other domain experts to refine and execute Dolby’s technical strategy in artificial intelligence and machine learning.
Use deep learning to create new solutions (including foundation models) and enhance existing applications.
Push the state of the art and develop intellectual property.
Transfer technology to product groups.
Establish research collaborations with external university partners.
Mentor interns on novel research problems.
Publish papers in top-tier conferences and journals.
Advise internal leaders on recent deep learning advancements in industry and academia to inform research direction and business decisions.

Benefits

Competitive salary range of $130,700-$163,000 plus bonus and benefits.
Flex Work approach that supports where, when, and how you do your best work.

What This Job Offers

Job Type: Full-time
Education Level: Ph.D. or professional degree
Number of Employees: 1,001-5,000 employees
