Data Scientist / ML Engineer

Hilbert's AI•San Francisco, CA

Onsite

About The Position

Hilbert is a scalable, data science-first growth engine that gives B2C teams predictive clarity into user behavior, revenue drivers, and the actions that drive sustainable growth. Fully agentic by design, Hilbert shrinks months-long decision cycles to minutes. From Fortune 10 enterprises to beloved brands like FreshDirect, Blank Street, and Levain Bakery, operators run their growth on Hilbert. We're also co-building alongside leading AI companies.

We're looking for a Data Scientist who understands B2C business problems deeply, builds models that work with real-world data, and delivers analyses that drive real growth outcomes, all with the ownership and urgency of a founder. This is not a "receive a ticket, train a model, hand off a notebook" role. You'll own problems end-to-end, from framing through modeling through impact, for enterprise customers where the stakes are real and the feedback loop is tight. If you understand why churn analysis matters differently for a grocery retailer versus a fashion marketplace, can build a recommendation system that works with sparse data, and can walk a customer through your causal analysis with clarity and conviction, we want to meet you.

The Role

You'll work directly with the founding team and alongside engineering, product, and GTM to build and improve the data science systems at the heart of Hilbert. You'll be hands-on every day: building models, running experiments, interrogating data, and delivering analyses that change decisions. B2C is our world. The problems we solve (demand prediction, customer lifecycle, personalization, activation) require someone who understands these domains and can translate business context into modeling choices. The environment is high-autonomy and high-ambiguity. Data is often messy, incomplete, or limited. You thrive in exactly those conditions.

Requirements

You have strong B2C business knowledge. You understand the problems consumer businesses actually face: customer acquisition vs. retention economics, lifecycle dynamics, basket composition, churn drivers, promotional cannibalization, channel attribution, demand elasticity. This knowledge informs how you frame problems and design models.
You're a systems thinker. You understand how models, data flows, customer behavior, and business outcomes connect. You don't optimize one metric in a vacuum; you consider second-order effects and how your work fits the bigger picture.
You've built recommendation, search, and/or customer-based ML models: collaborative filtering, content-based methods, ranking systems, segmentation, propensity modeling. You understand when each applies and why.
You know how to build for configurability. You've worked on or contributed to model architectures and pipelines that flex across multiple customers, segments, or contexts, rather than rigid, single-purpose implementations.
You create value from limited data. You make pragmatic modeling choices when data is sparse, noisy, or cold-start. You know when a simpler approach beats a complex one and aren't seduced by unnecessary sophistication.
You're rigorous about causality. You understand causal inference methods (difference-in-differences, instrumental variables, propensity scoring, synthetic controls) and know when to apply them. You design A/B tests properly and understand their limitations.
You communicate with clarity and conviction. You can present an analysis to a non-technical audience and make it land. You can write a one-pager that changes a decision. You explain your reasoning, not just your results. Communication is not a nice-to-have here; it's core to the role.
You take ownership. You don't wait for someone to define the problem perfectly. You dig in, frame it, propose an approach, and ship it. If something breaks or underperforms, you treat it as your problem.
You thrive in ambiguity. Problem definitions shift. Data availability surprises you. Requirements evolve. You're energized by figuring it out, not paralyzed by incomplete specs.
You move at startup speed. You understand what it means to be available, responsive, and biased toward action in a fast-moving, early-stage environment.

Nice To Haves

Strong Python proficiency — you write production-quality code, not just notebook prototypes
Experience with experimentation platforms and A/B testing infrastructure
Exposure to retail, e-commerce, CPG, or marketplace data environments
Experience at early-stage startups or high-growth companies where you wore multiple hats
Familiarity with modern data and ML infrastructure — feature stores, orchestration, model serving, monitoring
Background in economics, econometrics, or quantitative social science that informs your causal thinking

Responsibilities

Build ML models that power core product capabilities: recommendation systems, search relevance, customer segmentation, demand forecasting, and activation optimization
Contribute to configurable, multi-tenant model architectures that adapt across different customer contexts, data availability, and business requirements — not bespoke rebuilds for every use case
Create meaningful models with the data that's actually available, not the data you wish you had. You extract signal from limited, noisy, or sparse datasets and reach for the right level of complexity.
Design and run rigorous A/B tests — including understanding when A/B testing is insufficient and causal inference methods are required
Apply causal reasoning rigorously — you know the difference between correlation and causation, you surface true drivers, and you flag when others confuse the two
Deliver analyses that drive decisions — you connect model outputs to business outcomes and communicate them with clarity to founders, teammates, and customers
Think in systems. You don't build isolated models; you understand how recommendation, segmentation, scoring, and activation interact with each other and design your work to fit within the broader system.
Collaborate closely with engineering to take models from prototype to production
Move fast — prototype, validate, ship, iterate. You're comfortable with imperfect information and evolving requirements.