Senior Applied Researcher, Audio Generation

Cartesia, San Francisco, CA

About The Position

We are seeking a Senior Applied Researcher to contribute to the development of our next-generation speech models. You will be responsible for designing, training, and deploying novel generative models for tasks like multilingual text-to-speech (TTS), voice conversion, music generation, and sound effect synthesis. The challenge is no longer just creating high-fidelity audio; it is generating that audio with near-zero latency and giving users precise creative control. We aim to set new standards for accuracy, speed, and usability in production systems.

Requirements

  • Proven experience in developing and training novel generative models, preferably for audio or speech.
  • Clear understanding of the architectural trade-offs between model quality, inference speed, and memory footprint.
  • Hands-on experience with model conditioning and control mechanisms.

Responsibilities

  • Develop and optimize speech and audio models for production.
  • Work with engineering to ship and scale your models across our target platforms: cloud, on-premise, and on-device.
  • Develop model architectures and inference strategies specifically for low-latency, real-time performance on consumer hardware.
  • Implement and refine mechanisms for fine-grained controllability, allowing for the manipulation of attributes like speaker identity, emotion, prosody, and acoustic style.
  • Pioneer research on new architectures for generative modeling.

Benefits

  • Lunch, dinner, and snacks at the office.
  • Fully covered medical, dental, and vision insurance for employees.
  • 401(k).
  • Relocation and immigration support.
  • Your own personal Yoshi.