Applied Data Scientist

Vouch Insurance

About The Position

We’re looking for an Applied Data Scientist who is excited about using data and modern AI – especially large language models (LLMs) – to build and iterate on product features. You should genuinely enjoy working with messy, imperfect, real-world data – the kind that never quite fits the schema, arrives late, has surprises hidden inside it, and reflects actual user behavior. You should find energy in tracking down anomalies, debugging unexpected patterns, and getting to the root cause of data issues that affect product decisions and AI features.

This role also requires a high-ownership mindset: you don’t just answer questions – you help define which behaviors matter. You proactively identify data quality issues, measurement gaps, and opportunities for product improvement, and you drive these changes across the organization with persistence and clarity.

You’ll work with real-world transactional data (both SQL and NoSQL), design and ship LLM-powered experiences, and own the product analytics that measure their impact. You’ll help define what to build, how to measure it, and what to do next based on the data. This is neither a research scientist role nor a software engineering role: strong SQL, Python, proof-of-concept development, and product analytics skills, as well as experience working with production data systems, are what matter most.

Requirements

A track record of high ownership: taking responsibility for problems end-to-end, improving systems rather than just describing them, and pushing initiatives across product, engineering, and data.
A genuine love for messy, real-world data, and the curiosity to dig into anomalies until you understand what's really happening.
Hands-on experience with real-world transactional data in production environments, including messy, incomplete, or biased data.
Demonstrated experience improving data quality in production environments.
Demonstrated experience shipping LLM-based product features, such as:
Using hosted LLM APIs or in-house models
Designing prompts and workflows
Evaluating and iterating on LLM behavior using real user data
Experience in product analytics, including:
Defining and tracking product KPIs and feature-level metrics
Building and interpreting funnels, cohorts, and retention/engagement analyses
Influencing product decisions and roadmaps with data-driven insights
Experience measuring and improving data quality, and working with engineering to fix upstream issues.
Strong communication skills: ability to work cross-functionally and explain technical decisions and trade-offs to non-technical partners.
Strong SQL skills: complex joins, window functions, CTEs, and performance-aware querying.
Solid Python skills for data and AI work (e.g., pandas, NumPy, scikit-learn; OpenAI, Anthropic, and Gemini LLM libraries/frameworks).
Formal education in machine learning concepts, such as:
Supervised/unsupervised learning
Model selection and regularization
Evaluation methodologies (train/validation/test splits, cross-validation, experiment design)

Nice To Haves

Experience with LLM tooling and patterns (e.g., RAG, vector databases, prompt/tool orchestration frameworks).
Familiarity with experimentation platforms and A/B testing frameworks.
Exposure to MLOps / LLMOps: model versioning, monitoring model & LLM feature performance, feedback loops.
Experience with cloud data platforms (AWS, GCP, or Azure) and tools like Snowflake, dbt, or Dagster.

Responsibilities

Build and iterate on LLM & AI-powered product features
Design, prototype, and ship features that use LLMs (e.g., content generation, summarization, classification, semantic search, assistants, recommendations).
Work with engineers to integrate LLMs into the product via APIs or internal services (RAG, tool/function calling, workflows, pipelines).
Define evaluation strategies for LLM features (e.g., human-in-the-loop evaluation, rubrics, prompt experiments, offline/online metrics).
Continuously refine prompts, data pipelines, and system design based on user behavior, quality metrics, and product goals.
Own product analytics for data- & AI-powered features
Partner with product managers and designers to define success metrics (e.g., adoption, engagement, conversion, retention, quality, time-to-value).
Instrument new features: define events, ensure proper logging, and validate that data is correct and trustworthy.
Analyze funnels, cohorts, user journeys, and experiment results to understand drivers of behavior and outcomes.
Translate insights into clear recommendations that influence roadmaps, prioritization, and feature iteration.
Work with real-world transactional data (SQL & NoSQL)
Explore, clean, and transform data from transactional (OLTP), analytical (OLAP), and event-based systems.
Work across SQL (e.g., Postgres, Snowflake) and NoSQL (e.g., Redis, document/key-value stores).
Design data assets and features that are usable by both analytics workflows and LLM/ML systems.
Data quality, measurement, and monitoring
Define and track data quality metrics (completeness, consistency, timeliness, drift, schema changes).
Build checks, monitors, and alerts to detect data issues that can affect analytics or AI/LLM performance.
Work with data and engineering teams to diagnose root causes and drive changes that improve data quality over time.
Applied ML fundamentals
Use core ML concepts (feature design, model evaluation, bias/variance, generalization) to reason about LLM and non-LLM approaches.
When appropriate, build and evaluate lighter-weight or traditional models (e.g., scoring, ranking, classification) to complement or replace LLM solutions.

Benefits

Competitive compensation and equity packages
Health, dental, and vision insurance
Parental leave
Flexible vacation time
Wellness allowance
Technology allowance
Company-sponsored personal and professional development
L&D: Partnerships with Ethena and monthly Lunch & Learns
Wellbeing: access to wellbeing perks including Peloton, Fetch, OneMedical, and Headspace care+.
Caregiver Support: company seed into the dependent care FSA and a company-sponsored Care.com membership.
Regular performance reviews: Vouch conducts regular performance discussions with all team members, offering goal setting and check-ins, development discussions, and promotion opportunities.