Search - Agent Builder - Senior Data Scientist

Elastic

About The Position

Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale, unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data, while securing and protecting private information more effectively, Elastic’s complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI.

What is The Role

The Search Conversational Experiences team builds Elastic’s new conversational (agentic) platform that lets customers chat with their own data in Elasticsearch. We own the quality layer for RAG, agents and tools, retrieval/citations, streaming, memory, and, crucially, the evaluation signals that turn open-ended questions into grounded, reliable answers.

As a Senior Data Scientist, you’ll be part of a cross-functional team (backend, DS, PM, UX) driving chat quality end-to-end: designing and running evaluation pipelines, improving prompts and tool behaviors, and turning measurements into product decisions that customers can feel. You’ll help tackle frontier problems: folding RAG and vector search into an agent’s knowledge base, dynamically enriching model context to boost groundedness, shaping agent routing and tool selection policies, lighting up agent-driven visualizations on top of Elasticsearch data, and exploring multimodality and reasoning strategies where they truly move the needle. This is an applied role: you will prototype, evaluate, and partner with engineers to ship.

Requirements

5–8 years in applied DS/ML with strong IR/NLP experience (RAG, dense/sparse retrieval, re-ranking, vector search).
Proficiency in Python, PyTorch/Transformers, Pandas; reproducible experiments (e.g., MLflow), versioned datasets, and clean, reviewable code.
Hands-on evaluation expertise: offline metrics (nDCG/MRR/Recall@k), LLM-as-judge calibration, groundedness/citation scoring, and online A/B testing.
Experience turning experimental results into clear product calls (models, routing, tools) and communicating them crisply to cross-functional partners.
Practical Elasticsearch experience (or similar); ES|QL familiarity is a plus.
Comfort working in a distributed, async-first environment; strong written communication; low-ego collaboration.

Responsibilities

Design and maintain offline/online evaluation pipelines for conversational search: golden sets, rubric/LLM-as-judge calibration, groundedness/citation checks, and A/B tests.
Build and compare retrieval & re-ranking baselines (sparse + dense), query understanding, and semantic rewrites; land improvements with clear metrics.
Use results to drive product decisions: model selection, efficient agent routing, tool gating, and agent customization for Elastic use cases in search and beyond.
Instrument dashboards and telemetry so helpfulness, faithfulness, latency, and cost trade-offs are visible and trustworthy; guard against regressions in CI.
Collaborate tightly with backend engineers on contracts (ES|QL, citations, telemetry), and with PM/UX to translate findings into shipped features.
Share outcomes clearly (docs, notebooks, PRs) and mentor peers in experiment design and evaluation craft.

Benefits

Competitive pay based on the work you do here and not your previous salary
Health coverage for you and your family in many locations
Ability to craft your calendar with flexible locations and schedules for many roles
Generous number of vacation days each year
Increase your impact: we match up to $2,000 (or local currency equivalent) for financial donations and service
Up to 40 hours each year to use toward volunteer projects you love
Embracing parenthood with a minimum of 16 weeks of parental leave
