Data Scientist with Python expertise in New York

Capgemini, New York, NY
$100,000 - $130,000, Onsite

About The Position

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

This position is onsite in New York. The key responsibilities fall into three areas: attribution and measurement modeling, audience intelligence, and AI integration and insight generation.

Requirements

  • 5+ years applied data science experience
  • Expert Python proficiency: scikit-learn, XGBoost or LightGBM, SHAP, pandas, statsmodels, and at least one deep learning framework for production model development
  • Deep expertise in multi-touch attribution methodologies: MTA, media mix modeling (MMM), incrementality testing, and controlled experiment design
  • Experience building LTV, propensity, and CAC models on financial transaction or behavioral data at segment and sub-segment resolution
  • Comfort operating inside data clean rooms - designing models that run on privacy-preserving aggregates rather than individual-level raw data
  • Strong statistical foundations: causal inference, Bayesian methods, survival analysis, and experiment design
  • Fluent SQL across cloud data warehouses (Snowflake, BigQuery, Redshift, or equivalent) and experience working with ML platforms such as Vertex AI, SageMaker, or Databricks MLflow
  • Ability to translate complex model outputs into business narratives for VP- and C-level marketing stakeholders

Nice To Haves

  • Experience designing AI-augmented analytics workflows - using LLM APIs for structured output generation, signal summarization, or compliance pre-screening alongside traditional models
  • Familiarity with walled garden measurement environments: Google ADH, Meta Analytics API, Amazon Attribution
  • Graph-based modeling experience - using Neo4j, Amazon Neptune, or similar for identity linkage, co-purchase signals, or household relationship modeling
  • Demonstrated expertise in identity resolution, household modeling, or cross-device attribution at scale

Responsibilities

  • Build and maintain multi-touch attribution (MTA) models - touch-order aware, channel-weighted, with incremental lift quantification across owned, paid, and clean room channels
  • Develop cohort-level LTV/CAC scoring models using transaction signals, behavioral features (SHAP-ranked), and propensity scores - deployed at segment and micro-cohort resolution
  • Design holdout and matched-market test frameworks for measuring incrementality across CTV, display, paid search, and social channels
  • Build probabilistic identity linkage models for household graph construction and cross-device resolution where deterministic signals are absent
  • Develop SHAP-based feature importance pipelines for audience signal ranking - surfacing top predictive signals per segment for AI-generated audience briefs
  • Build behavioral micro-cohort clustering using unsupervised and semi-supervised methods on transaction and lifestyle features - producing 10+ interpretable sub-cohorts per major audience segment
  • Design suppression, exclusion, and lookalike model pipelines that feed into DSP activation and clean room audience delivery
  • Collaborate with engineering to design system prompts, structured output schemas, and evaluation frameworks for AI-powered audience authoring, measurement intelligence, and campaign brief generation
  • Build model evaluation pipelines comparing AI-generated audience segments against held-out conversion actuals, benchmarking performance vs. deterministic baselines
  • Develop geo-level DMA performance models: LTV/CAC opportunity mapping, state-vs-DMA benchmarking, and priority zone classification for campaign planning
  • Author AI-assisted insight narratives - translating model outputs into plain-language recommendations surfaced to client marketing teams through the platform UI

Benefits

  • Paid time off based on employee grade (A-F), as defined by policy: 12-25 vacation days depending on grade, company-paid holidays, personal days, and sick leave
  • Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
  • Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
  • Life and disability insurance
  • Employee assistance programs
  • Other benefits as provided by local policy and eligibility

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000

© 2024 Teal Labs, Inc