Machine Learning Engineer, TTS Systems

Bland
San Francisco, CA (Remote)
$160,000 - $250,000

About The Position

At Bland.com, we empower enterprises to build and scale AI phone agents. As a fast-growing team in San Francisco, our mission is to advance customer interactions with businesses through natural, reliable, and highly human-like voice technologies. We are backed by $65M in funding from leading Silicon Valley investors, including Emergence Capital, Scale Venture Partners, Y Combinator, and founders of Twilio, Affirm, and ElevenLabs.

As an ML Engineer focused on Text-to-Speech (TTS), you will own the deployment, optimization, and maintenance of our production TTS systems. Your work will turn advanced research models into performant, scalable, and robust production systems serving millions of real-time voice interactions daily. You will collaborate with research and engineering teams to implement inference-optimized TTS models, streamline deployment processes, and monitor live systems to ensure best-in-class performance for enterprise clients.

Requirements

  • Hands-on experience deploying large-scale neural TTS models in cloud or on-prem production settings.
  • Deep expertise in TTS inference optimization (e.g., quantization, kernel optimization, batching strategies) and familiarity with post-training methods such as GRPO.
  • Strong understanding of real-time, low-latency audio processing pipelines and their challenges.
  • Working knowledge of distributed systems, GPU acceleration, and scalable production infrastructure.
  • Ability to diagnose and resolve quality, performance, and reliability issues in deployed voice systems.
  • Comfortable working in fast-paced startup environments and taking full ownership from deployment through system maintenance.

Nice To Haves

  • Contributions to open-source TTS systems or production audio frameworks.
  • Prior work in telephony, streaming, or live enterprise communication environments.

Responsibilities

  • Deploy and optimize large-scale TTS models into production environments for reliable, low-latency inference.
  • Implement and refine post-training techniques (such as DPO, GRPO, and RLHF) and modern inference techniques to maximize throughput and audio quality.
  • Collaborate with cross-functional teams to ensure seamless rollout, A/B testing, and iterative improvement of production models.
  • Maintain high availability and scalable infrastructure for multi-speaker, expressive, and controllable TTS use cases.
  • Design and document best practices for efficient TTS inference and system reliability.

Benefits

  • Healthcare, dental, vision
  • Meaningful equity in a fast-growing company
  • Every tool you need to succeed
  • Beautiful office in Jackson Square, SF with rooftop views
  • Competitive salary: $160,000 to $250,000