About The Position

The Tech Lead Data Scientist, AI Evaluation & Monitoring, is the principal technical expert for how Geisinger evaluates, monitors, and optimizes AI systems in production. This is a hands-on technical leadership role. The Tech Lead sets the technical direction for AI evaluation across a large and growing portfolio, provides technical leadership to a team of data analysts who execute evaluation work, and partners directly with AI program teams to raise the quality of how AI is validated, monitored, and improved over time.

The role exists because AI at Geisinger has scaled past the point where oversight can be a document-review exercise. We need a technical leader who can guide program teams toward better-designed evaluations up front, instrument meaningful production monitoring, and continually advance the methods we use, from LLM-as-Judge frameworks to simulation-based testing to pragmatic experiment design that actually scales in healthcare.

Requirements

  • 6+ years in data science, statistics, ML engineering, or applied quantitative research, with demonstrated experience as the senior technical voice on cross-functional projects
  • Strong foundation in experimental design and causal inference — and judgment about which method fits which situation
  • Hands-on experience designing and running model evaluation studies in real production settings
  • Experience evaluating LLM or generative AI systems, or comparable experience evaluating complex ML systems where ground truth is messy
  • Proven ability to translate ambiguous failure modes into concrete, defensible evaluation designs and monitoring metrics
  • Strong fluency in Python and SQL; working comfort with modern ML tooling and cloud-native data environments
  • Experience with fairness and equity evaluation for ML systems
  • Track record of providing technical leadership and mentorship without formal people-management authority
  • Clear written communication — the role produces evaluation memos and specifications that non-technical decision-makers rely on
  • Bachelor's degree in a related field of study (required)
  • Minimum of 6 years of relevant experience (required)

Nice To Haves

  • Healthcare, clinical, or regulated-industry experience strongly preferred
  • MS or PhD in a quantitative field preferred; equivalent experience accepted

Responsibilities

  • The technical evaluation methodology applied to AI programs across the enterprise: pre-production validation, production monitoring, and ongoing optimization
  • Hands-on guidance to program teams as they design validation studies, equity audits, monitoring plans, and escalation playbooks for their AI systems
  • Instrumentation of production monitoring: translating program-specific failure modes into concrete, measurable metrics
  • The evaluation toolkit: LLM-as-Judge frameworks, golden sets, simulation harnesses, experimental study designs, drift detection, subgroup fairness analysis
  • Reusable evaluation playbooks and templates that let each new program move faster than the last
  • Technical direction, design review, and mentorship for a team of data analysts supporting the evaluation function

Benefits

  • Healthcare benefits for full-time and part-time positions from day one, including vision, dental, and domestic partner coverage

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Number of Employees: 5,001-10,000
