Lead Data Scientist

Middesk · San Francisco, CA
$210,000 - $250,000 · Hybrid

About The Position

Middesk is seeking a hands-on applied ML expert to build the technical foundation for AI-driven applications that streamline customer workflows, with a focus on business onboarding. The role involves shipping external-facing models in the risk/fraud space, where data is imbalanced, labels are scarce, and behavior shifts over time. This is a highly technical role with significant influence on how ML is designed, built, and scaled at Middesk. The company follows a hybrid work model that requires 2 days per week in the SF/NYC office, so candidates need to be within commuting distance. Middesk is a Y Combinator graduate, backed by Sequoia Capital and Accel Partners, and recognized on the Forbes Fintech 50 List.

Requirements

  • 5+ years of production ML experience in one or more of the following areas:
  • Building production ML for risk, fraud, credit, or trust & safety: Track record of shipping external-facing ML applications in one or more of these domains.
  • Knowledge graph applications: Hands-on experience building, querying, or extracting signals from knowledge graphs—ideally over business entity networks (companies, persons, addresses, relationships) to support identity verification, fraud detection, or risk decisioning.
  • Entity resolution for business or individual identities: Experience disambiguating and linking records across noisy, incomplete, or conflicting data sources—particularly in KYB, KYC, AML, or identity verification contexts where the same real-world entity may appear under different names, addresses, or tax IDs.
  • Expertise in classification under real-world ML challenges, for example: imbalanced labels, sparse signals, cold start, and production version management.
  • Hands-on ML infrastructure experience: feature stores, model management, ML training/serving pipelines.
  • Comfort as a senior IC: setting technical direction, mentoring peers, and establishing best practices.

Nice To Haves

  • B2B SaaS experience, ideally building ML products for enterprise customers.
  • ML pipeline and automation engineering: Experience building end-to-end training harnesses that automate feature engineering, data validation, and model training.
  • Experience scaling ML across multiple products or risk domains.

Responsibilities

  • Build risk & fraud ML applications: Deliver production ML models in fraud, trust & safety, KYB, and compliance domains, with measurable impact on customer workflows.
  • Tackle hard data problems: Work on classification problems with extreme class imbalance, sparse signals, and “cold start” label challenges.
  • Innovate in feature engineering & labeling: Use graph-based techniques, weak supervision, LLMs, and AI agents to improve signal extraction and automate the labeling process.
  • Establish ML infrastructure foundations: Partner with the ML infra team to design feature services, model training pipelines, model serving standards, and orchestration to scale multiple ML use cases.
  • Design and implement knowledge graph solutions: Leverage LLMs for graph construction, querying, and retrieval to enhance entity resolution and business identity use cases.