Research Scientist, Sound Team

DeepMind
Mountain View, CA
$141,000 - $202,000

About The Position

The Sound Team is a group of researchers working on audio understanding, editing, and generation. We are part of Frontier AI, the unit responsible for building and scaling the next generation of our core models. Research includes, but is not limited to, sound understanding, joint audio-video generation, audio-visual editing, and long-context modeling. Work with us to create a future where speech, music, and general audio are central to AI understanding, generation, and modification.

About Us

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

Research Scientists at Google DeepMind lead our efforts in developing novel algorithmic architectures towards the end goal of solving and building Artificial General Intelligence. We seek individuals who are passionate about audio and about developing novel architectures to push the state of the art. In this role, you will make key contributions to advancing research in sound understanding, joint audio-video generation, and audio editing.

Requirements

  • PhD in Computer Science or a related Machine Learning field.
  • Audio understanding and/or generation experience.
  • A proven track record of research and publications in some of the following areas: audio generation, video generation, LLMs.

Nice To Haves

  • Experience working with LLMs.
  • A real passion for Audio and Sound!

Responsibilities

  • Data: Unlocking new audio capabilities within the model, both in pre-training and post-training.
  • Models: Improving quality of models for understanding and generation. This includes research to improve our tokenizers, better techniques for generation quality, and looking at joint audio and visual representations.
  • Evals: Better evaluation methods (human, auto raters, automated metrics) to measure quality of open-ended tasks.