Senior Data Scientist, Identity Graph

Impact.com, Columbus, OH

About The Position

We're looking for a Senior Data Scientist with deep expertise in identity graph construction and resolution to lead a critical modernization of Impact's identity graph infrastructure. This is a technically demanding individual contributor role — you'll own the diagnostic, architectural, and implementation work required to understand our current identity graph end-to-end, identify its weaknesses, and drive meaningful, measurable improvements. You'll start by mapping and deeply understanding the existing pipeline, then move quickly to identify and execute near-term wins while simultaneously building the research and testing infrastructure needed to evaluate next-generation identity resolution approaches. A significant part of the role involves detecting, filtering, and eliminating bad data that degrades graph quality — requiring both investigative rigor and strong engineering capability. You'll also serve as Impact's technical counterpart with external identity and measurement partners — designing and running structured POCs, evaluating vendor solutions rigorously, and producing data-driven recommendations on build-vs-buy decisions that shape the long-term identity strategy.

Requirements

  • 5+ years in data science, ML engineering, or applied research, with significant hands-on experience in identity graph construction, entity resolution, or large-scale identity matching in a production environment.
  • Deep, firsthand knowledge of how identity graphs are built, maintained, and evaluated — including deterministic and probabilistic linking, entity deduplication, graph schema design, and resolution at scale.
  • Demonstrated experience diagnosing and remediating data quality issues in large, complex datasets — identifying corrupted records, erroneous links, and systemic pipeline failures.
  • Proven ability to build and deploy production-grade pipelines independently; strong Python and SQL; solid software engineering fundamentals (testing, version control, observability).
  • Ability to design reproducible experiments, define meaningful evaluation metrics, and validate results before recommending adoption — in both internal and vendor evaluation contexts.
  • Ability to translate complex technical findings into clear strategic recommendations; experience presenting to cross-functional and leadership audiences.
  • Bachelor's in a quantitative field (CS, Statistics, Math, Engineering, or similar); Master's/PhD preferred.

Nice To Haves

  • Experience with graph databases and graph analytics (Neo4j, NetworkX, or similar) applied to identity or entity data.
  • Familiarity with ML-based entity resolution approaches: blocking strategies, embedding-based similarity, probabilistic matching, and hierarchical clustering.
  • Experience evaluating third-party identity resolution or data enrichment vendors; familiarity with the external identity and measurement partner landscape.
  • Exposure to affiliate marketing, ad tech, or performance marketing data environments and the identity challenges specific to those ecosystems.
  • Experience with real-time or near-real-time identity resolution and low-latency graph querying.
  • Familiarity with GCP tools (BigQuery, Vertex AI, Dataflow, Cloud Run) and/or Databricks/Spark for large-scale data processing.
  • Knowledge of privacy-preserving identity techniques (differential privacy, hashing, clean rooms) and their implications for graph construction.
  • Experience with device fingerprinting, cross-device identity, or cookie-less identity resolution approaches.

Responsibilities

  • Conduct a thorough, end-to-end mapping of Impact's current identity graph construction pipeline — data sources, entity linking logic, resolution rules, graph schema, and downstream consumers.
  • Document the architecture clearly and completely, creating a shared foundation of understanding for Data Science, Engineering, and Product stakeholders.
  • Identify structural weaknesses, coverage gaps, resolution failures, and data quality issues that degrade graph accuracy and completeness.
  • Produce a prioritized inventory of problems and opportunities, distinguishing quick wins from longer-horizon architectural improvements.
  • Identify and execute high-leverage, near-term improvements to the identity graph — targeting resolution accuracy, coverage, and data freshness without requiring full architectural overhaul.
  • Implement incremental enhancements to existing matching and linking logic; measure impact rigorously and communicate results to stakeholders.
  • Build the habit of continuous improvement into the graph pipeline: monitoring, alerting, and iterative refinement as ongoing practice rather than one-time effort.
  • Research, design, and implement methods to systematically identify, filter, and eliminate bad data in the identity graph — including corrupted identifiers, ghost entities, erroneous links, and stale or conflicting records.
  • Build detection pipelines that surface data quality issues proactively; define quality thresholds and SLOs for graph health.
  • Establish feedback loops that catch new bad data patterns as they emerge, preventing quality degradation over time.
  • Develop a robust testing and experimentation environment for evaluating cutting-edge identity resolution techniques — including probabilistic matching, deterministic linking, graph-based entity resolution, and ML-based approaches.
  • Design evaluation frameworks with clear, reproducible metrics for resolution precision, recall, coverage, and graph coherence.
  • Research and prototype emerging methods from the academic and industry literature; validate results rigorously before recommending adoption.
  • Maintain a living research agenda that tracks the state of the art in identity resolution and surfaces relevant advances to the team.
  • Engage directly with external identity and measurement partners to scope and execute structured POCs — evaluating vendor technologies against Impact's specific identity graph requirements.
  • Design rigorous, data-driven evaluation frameworks for vendor assessments; produce clear, evidence-based build-vs-buy recommendations for leadership.
  • Serve as Impact's technical counterpart in partner conversations — understanding vendor architectures deeply, asking sharp questions, and stress-testing vendor claims against real data.
  • Stay current on the external landscape for identity resolution, data enrichment, and measurement vendors; bring relevant developments to the team proactively.
  • Take improvements and new capabilities from research and POC to production, independently or in close partnership with Data Engineering — owning deployment, testing, monitoring, and iteration.
  • Write clean, well-tested, production-grade code; build pipelines that are maintainable and observable by the broader team.
  • Collaborate with MLOps and Platform Engineering to ensure production readiness: reliability, scalability, latency, and drift monitoring.
  • Translate complex identity graph findings — audits, quality analyses, vendor evaluations — into clear, actionable narratives for technical and non-technical audiences.
  • Present findings, tradeoffs, and recommendations in planning and leadership forums; communicate uncertainty and risk honestly.
  • Contribute to documentation and knowledge sharing that makes the identity graph understandable and trustworthy across the organization.

Benefits

  • Medical, Dental, and Vision insurance
  • Office-only catered lunch every Thursday, a healthy snack bar, and great coffee to keep you fueled
  • Flexible spending accounts
  • 401(k)
  • Responsible PTO policy
  • Mental health and wellness benefit, including up to 12 fully covered therapy/coaching sessions per year, with additional dependent coverage
  • Monthly gym reimbursement policy
  • Restricted Stock Units (RSUs)
  • Free Coursera subscription
  • PXA courses
  • 26 weeks of fully paid leave for the primary caregiver
  • 13 weeks of fully paid leave for the secondary caregiver
  • Technology stipend to help you set up your home office
  • Monthly allowance to cover your internet expenses