Senior AI Engineer, Voice Platform

ClickUp
$200,000 - $250,000

About The Position

At ClickUp, we’re not just building software. We’re architecting the future of work! In a world overwhelmed by work sprawl, we saw a better way. That’s why we created the first truly converged AI workspace, unifying tasks, docs, chat, calendar, and enterprise search, all supercharged by context-driven AI, empowering millions of teams to break free from silos, reclaim their time, and unlock new levels of productivity. At ClickUp, you’ll have the opportunity to learn, use, and pioneer AI in ways that shape not only our product, but the future of work itself. Join us and be part of a bold, innovative team that’s redefining what’s possible! 🚀

Role Overview

You'll own and evolve the AI systems behind ClickUp's voice platform: real-time streaming transcription, intelligent reformatting, context-aware mention detection, and voice-to-action pipelines. This is a high-impact, hands-on role where you'll push the boundaries of what voice interfaces can do inside a productivity tool used by millions.

Requirements

  • Ambition, grit, and a passion for improving the way people work.
  • Potential impact they can have.
  • The best people for the job and support each person’s journey to build their boldest career.

Responsibilities

  • Design, build, and optimize real-time speech-to-text pipelines (streaming ASR, VAD, audio processing)
  • Improve transcription accuracy through context injection (user names, teams, custom vocabulary, language detection)
  • Develop and maintain LLM-powered post-processing (grammar correction, filler removal, mention resolution, formatting)
  • Build voice-to-action systems that parse natural language into structured workspace commands
  • Evaluate, benchmark, and integrate ASR models (Whisper, AssemblyAI, Fireworks, etc.) for cost, latency, and accuracy
  • Collaborate with product and platform teams to ship voice features across MAX Desktop, Mobile, Web, and Browser Extension
  • Explore multimodal AI capabilities (screen + voice + text) for next-gen assistant experiences