Director I, Data Science, People, Purpose & Brand

Liberty Mutual Insurance•Boston, MA

$137,000 - $257,000•Hybrid

About The Position

The Data Science and Assessments (DS&A) team within People, Purpose & Brand (PP&B) is hiring a Director I, Data Scientist (STP) to serve as a hands-on technical leader for GenAI evaluation, enablement, and responsible AI adoption. PP&B develops programs across talent acquisition, workforce planning, performance & rewards, employer branding, and wellbeing to build a diverse, future-ready workforce. In this role, you will shape how DS&A generates, evaluates, and scales ML and GenAI-based programs and tooling to improve productivity, decision-making, and employee experience across PP&B.

Core responsibilities span three areas:

GenAI Enablement: Design, build, and maintain evaluation frameworks and pipelines that accelerate the creation, iteration, and safe expansion of GenAI capabilities, including agentic workflows, data readiness, and post-deployment monitoring.
Vendor & AI/ML Risk Management: Define and execute a vendor-AI evaluation vision enabling repeatable vendor selection, value validation, quality monitoring, and responsible model risk management.
Broader Data Science: Support classic ML, automation, and technical consulting to empower PP&B colleagues and drive enterprise-wide outcomes.

This individual contributor role reports through the Office of Data & Data Science. The ideal candidate is proactive, highly technical, and collaborative, able to translate complex tradeoffs into clear, actionable recommendations that drive meaningful impact.

Requirements

Strong foundation in Data Science principles (Probability, Statistics, AI/ML).
Experience with LLMs, embeddings, and generative/agentic systems.
Proficient in Python; comfortable writing production-quality code.
Strong SQL skills for querying, validation, and data exploration.
Experience with APIs and integrating external model or vendor services.
Familiarity with cloud computing concepts and services.
Experience evaluating models for fairness, bias, privacy, explainability, or responsible AI.
Comfortable with Git-based version control and collaborative code review.

Nice To Haves

A collaborative, customer-focused mindset oriented toward pragmatic, high-impact solutions and enabling others.
Strong project management skills: building clear plans, coordinating across stakeholders, and driving accountability.
Clear communication of technical tradeoffs, evaluation results, and recommendations to both technical and non-technical audiences.
Product-minded engineering: building systems that are maintainable, scalable, and easy to adopt.
A bias toward action combined with rigorous attention to evaluation and safety.

Responsibilities

Architect, develop, and maintain tooling and pipelines to support GenAI model development, evaluation, deployment, and monitoring across PP&B programs.
Design and operationalize scalable evaluation frameworks and metrics for GenAI systems (including automated and human-in-the-loop evaluations) to ensure quality, safety, and organizational alignment.
Lead the vendor AI evaluation program: define criteria, run benchmarks and pilots, synthesize results, and provide clear recommendations for vendor selection and integration.
Build reusable components leveraging APIs and templates that enable rapid iteration and reliable deployment of GenAI features.
Partner with stakeholders across PP&B to translate business needs into technical designs, evaluation plans, and implementation roadmaps.
Promote strong engineering hygiene across projects (CI/CD, version control, testing, documentation, reproducible pipelines).
Provide informal mentorship for data science and analytics colleagues on tools and evaluation processes.