Audio Solutions Architect

Innodata Inc.

6d • $150,000 - $230,000 • Hybrid

About The Position

Innodata builds the high-quality voice and audio datasets that power the world's leading speech AI — text-to-speech, speech recognition, and the new generation of speech-to-speech and conversational voice models. We're hiring an Audio Solutions Architect to be both the technical partner to our customers in presales and the external technical voice of our audio practice. This is a hybrid role with two equally weighted halves. In presales, you sit with a frontier lab or enterprise team, understand what they're trying to train, and shape the data collection program that gets them there. In thought leadership, you keep us at the frontier of speech AI — producing go-to-market research and content, speaking at conferences, and establishing Innodata as the most technically credible audio data partner in the market. The two reinforce each other.

Requirements

Deep working knowledge of speech/audio AI: how TTS, ASR, and speech-to-speech systems are trained and evaluated, and what data they require.
Experience in a solutions engineering, solutions architect, technical presales, or applied/forward-deployed role — or a technical audio/speech background plus strong commercial instincts.
Demonstrated ability (and appetite) to produce public-facing technical content and represent a company externally — writing, speaking, or community engagement.
Ability to shape ambiguous requirements into precise specs and communicate them to both researchers and business stakeholders.
Strong presence and persuasion; comfortable being the technical authority in a sales conversation and on a conference stage.
Familiarity with audio technical specifications (sample rates, LUFS, formats), transcript/metadata schemas, and quality metrics (WER, DER).
A public body of work in speech/audio: talks, papers, blog posts, benchmarks.
Hands-on experience with speech datasets, annotation, or audio production.
Background working with or at a frontier AI lab or voice-AI product company.
Multilingual / localization exposure.

Responsibilities

Partner with customers in presales to understand their model objectives, current data gaps, and technical constraints.
Shape requirements: define acoustic specs, language/accent coverage, speaker demographics, emotional/paralinguistic range, transcript and metadata schema, and QA targets (WER/DER, LUFS, etc.).
Translate requirements into scoped execution plans — volumes, timelines, methodology, pricing inputs — in partnership with delivery.
Serve as the credible technical voice in the room: explain tradeoffs (studio vs. real-world vs. telephonic, scripted vs. spontaneous, single vs. multi-speaker) and defend methodology choices.
Build reusable solutioning assets: scoping frameworks, spec templates, reference architectures for common audio data use cases.
Stay at the forefront of speech-AI developments (TTS, ASR, speech-to-speech) and anticipate what data the next generation of models will need.
Produce go-to-market material: technical blog posts, white papers, benchmark reports, and reference content that demonstrates Innodata's depth.
Represent Innodata externally: speak at and work conferences (Interspeech, ICASSP, industry events), engage the speech-AI community, and build our public technical profile.
Feed market intelligence back into strategy — advise on emerging data categories and where to invest ahead of demand.

What This Job Offers

Job Type

Full-time

Career Level

Senior

Education Level

No Education Listed
