Data Scientist

Scrunch
New York, NY
Hybrid

About The Position

We’re looking for a Data Scientist to help us build, measure, and improve AI-augmented information retrieval and web visibility, spanning retrieval and ranking, NLP, experimentation, and knowledge graph / semantic web systems. You’ll partner closely with Engineering, Product, and Marketing to turn ambiguous questions into measurable work and shippable features.

Location: NYC | Hybrid | 3x/week

Requirements

  • Strong foundations in statistics, experimental design, and model evaluation.
  • Hands-on experience with information retrieval / ranking / search relevance, ideally in AI-augmented contexts.
  • Experience building and evaluating NLP models in production or research settings.
  • Proficiency in Python and SQL.
  • Strong communication: you can explain what you did, why it matters, and how confident you are, without hand-waving.
  • Deep comfort with the web as a system (crawl/index realities, domains, content structure, measurement constraints).
  • Experience working with non-representative data and making results more trustworthy through sampling-aware analysis (bias checks, adjustments, uncertainty).
  • Experience working with noisy or partial labels, long-tailed queries, drifting content, and evaluation that mixes offline and online signals.

Nice To Haves

  • Reinforcement learning / control theory / optimal control applied to ranking, allocation, or policy optimization.
  • Semantic Web / Knowledge Graph tooling (RDF/OWL concepts, graph DBs, SPARQL, entity resolution).
  • SEO + AEO familiarity and the ability to connect technical visibility drivers to business outcomes.
  • Marketing analytics experience (attribution-adjacent thinking, funnel measurement, incrementality).
  • Publications in top-tier venues (e.g., SIGIR and related IR/NLP conferences) or equivalent demonstrated research depth.

Responsibilities

  • Design, prototype, and productionize models/algorithms for retrieval, ranking, and relevance quality across web and AI-assisted surfaces.
  • Build NLP pipelines (classification, entity extraction, topic modeling, sentiment analysis) and validate them with clear offline + online metrics.
  • Own measurement and experimentation: hypotheses, experiment design, guardrails, and readouts that drive decisions.
  • Develop simulation / modeling frameworks to forecast outcomes, test policies, and stress-test system behavior under different assumptions.
  • Contribute to knowledge graph / semantic web work: schema design, entity resolution, and downstream ML / GenAI applications.
  • Translate technical work into crisp narratives for stakeholders (product tradeoffs, confidence, limitations, and next steps).
  • Contribute to external thought leadership where it makes sense (blog posts, talks, papers).