Machine Learning Engineer, Data

Rime Labs
San Francisco, CA (Remote)

About The Position

Rime builds voice AI for enterprises running customer experiences at scale. Our text-to-speech models are purpose-built for high-volume conversational deployments, engineered for the pronunciation accuracy, latency, and deployment flexibility that production environments actually demand. We started from a different premise than the rest of the field: voice AI isn't bottlenecked by model architecture. It's bottlenecked by data. So before we trained a single model, we built our own corpus: full-duplex, studio-quality conversational speech, recorded and annotated by PhD linguists. That's our moat. It's also why enterprises pick Rime when pilots need to convert into production. We're backed by top-tier investors including Unusual Ventures, and we've built a team at the intersection of product, research, and craft. Building voice models is an art. We intend to master it.

Requirements

  • Strong software engineering fundamentals: Python, distributed systems, comfort across the stack.
  • Database design fluency: you reach for the right schema and have operated Postgres or similar in production.
  • Production data pipelines on cloud-native infrastructure (GCP preferred). Our data stack is currently GCP-dominant.
  • Operational comfort: containers, CI/CD, IAM, cost-aware infrastructure choices, etc.
  • Strong attention to detail on data quality.
  • Comfort being out of your depth at the boundary. You'll sometimes debug code you didn't write in tools you don't use daily. You should find this energizing, not threatening.
  • Bias toward building the abstractions so the modeling team doesn't stay stuck doing data work by hand.

Nice To Haves

  • Multilingual data pipeline experience.
  • Audio DSP, signal processing, or speech recognition background.
  • Large-scale training infra (FSDP, DeepSpeed, Ray).
  • Annotation tooling and human-in-the-loop systems.
  • Comfort working close to research teams.

Responsibilities

  • End-to-end audio annotation pipeline: some stages exist today as prototypes; productionizing and rebuilding them is in-flight work you would own.
  • Quality systems: Automated tooling to catch annotation errors, alignment drift, and silent regressions before training runs.
  • Dataset versioning and experimenter tooling: the model team will want to subset the vetted pool ("speakers X/Y/Z, duration 3–12s, quality > 0.8") into reproducible training manifests. The query interface, manifest format, and lineage tracking are all yours.
  • Linguist- and annotation-team-facing tooling: annotation UI, project-management workflows, QC dashboards.
  • Pipelines for full- and half-duplex training data.
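The dataset-versioning responsibility above might look roughly like the following: a hypothetical sketch (field names, schema, and the hashing scheme are illustrative assumptions, not Rime's actual tooling) of turning a filter query like "speakers X/Y/Z, duration 3–12s, quality > 0.8" into a reproducible, content-addressed training manifest.

```python
import hashlib
import json

# Hypothetical utterance records from a vetted pool.
# Field names are illustrative, not Rime's actual schema.
POOL = [
    {"id": "utt-001", "speaker": "X", "duration_s": 5.2, "quality": 0.91},
    {"id": "utt-002", "speaker": "Y", "duration_s": 2.1, "quality": 0.95},
    {"id": "utt-003", "speaker": "Z", "duration_s": 8.7, "quality": 0.64},
    {"id": "utt-004", "speaker": "X", "duration_s": 11.3, "quality": 0.88},
    {"id": "utt-005", "speaker": "Q", "duration_s": 6.0, "quality": 0.99},
]

def build_manifest(pool, speakers, min_dur, max_dur, min_quality):
    """Filter the pool and return (manifest_lines, manifest_id).

    Sorting the rows and hashing their serialized form gives a stable
    identifier for lineage tracking: the same query over the same pool
    always yields the same manifest ID.
    """
    rows = sorted(
        (r for r in pool
         if r["speaker"] in speakers
         and min_dur <= r["duration_s"] <= max_dur
         and r["quality"] > min_quality),
        key=lambda r: r["id"],
    )
    lines = [json.dumps(r, sort_keys=True) for r in rows]
    manifest_id = hashlib.sha256("\n".join(lines).encode()).hexdigest()[:12]
    return lines, manifest_id

# Example query: speakers X/Y/Z, duration 3-12s, quality > 0.8.
lines, manifest_id = build_manifest(POOL, {"X", "Y", "Z"}, 3.0, 12.0, 0.8)
```

Writing the resulting JSONL lines to a manifest file keyed by `manifest_id` makes any training run reproducible from the query alone.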

Benefits

  • Competitive base salary + meaningful early-stage equity.
  • Remote-friendly.
  • Visa sponsorship available.
  • Access to a proprietary, full-duplex, studio-quality conversational speech corpus.
  • Compute and tooling to do the work.