Data Scientist 2

BioMarin Pharmaceutical Inc., Novato, CA
About The Position

The Data Scientist in Technical Operations (TOPS) plays a critical role in advancing BioMarin’s end-to-end product lifecycle by delivering high-value Data/AI solutions across Technical Development, Manufacturing, Engineering, Quality, and Supply Chain functions. Data Scientists in TOPS help own, develop, and execute the organization’s Integrated Technical Data Strategy, applying advanced analytics, machine learning, and AI to complex datasets originating from Manufacturing, Quality, and Supply Chain systems. They help transform fragmented data into actionable intelligence, extract insights that would otherwise stay hidden, identify gaps, and drive the data maturity roadmap. This role blends advanced technical skills in data science (statistics, modeling, AI/ML) with deep domain expertise in the highly regulated biotech industry. These should be complemented by soft skills, including collaboration, clear communication, presentation skills, and the ability to translate business requirements into solutions. Data Scientists are expected to collaborate across departments, partner with business SMEs, other Data Scientists/Analysts/Engineers, and IT, and lead initiatives that promote a culture focused on decision science, with the end goal of helping TOPS streamline operations, boost data reliability, and speed up decision-making.

Requirements

  • Master’s (minimum) in Data Science, Computer Science, Statistics, or related field; 5+ years of hands-on experience delivering Data/AI solutions in an industry setting.
  • Advanced SQL and Python for data wrangling, feature engineering, modeling, and automation.
  • Experience developing Python-based web applications using frameworks such as Dash, Flask, or Streamlit. Familiarity with HTML/CSS and TypeScript frameworks (e.g., React) is a plus.
  • Strong experience working with Databases (Postgres, SQL Server) and Data Platforms (Azure Databricks).
  • Proven record of managing data analysis projects end to end, from problem and requirements definition to data validation and results presentation.
  • Proficiency with one or more enterprise Business Intelligence technologies (Power BI, Tableau, Spotfire).
  • Solid understanding of data modeling principles and design patterns.
  • Proven experience building and operationalizing GenAI pipelines (Chunking, RAG, Vector index) on Databricks (Delta, Unity Catalog, MLflow, Jobs/Workflows, Spark, Lakeflow).
  • Working knowledge of Microsoft Azure (storage, compute, identity/governance, Azure OpenAI).
  • High-level understanding of data engineering pipelines and data quality practices.
  • Experience extracting/structuring data from unstructured sources (SOPs, reports, PDFs, ELN entries) using NLP or GenAI.
  • Demonstrated experience in biotech/biopharma operations and partnering with SMEs across technical development, manufacturing, quality, or supply.
  • Familiarity with Computer System Validation (CSV) documentation practices in regulated environments.
  • Strong communication skills supporting collaboration across Technical Development, Manufacturing, Quality, and Supply Chain.

Responsibilities

  • Identify and frame AI opportunities across Technical Development, Manufacturing, Quality, and Supply Chain; translate ambiguous problems into tractable use cases with measurable outcomes.
  • Maintain the TOPS Data Science portfolio of projects. Participate in portfolio prioritization, planning, solution design, development, and deployment.
  • Lead projects from start to finish, working closely with stakeholders, leadership, and the project team. Author business case, design, development, and project implementation documents.
  • Advance the Integrated Technical Data Strategy by defining roadmaps, value hypotheses, and success metrics that strengthen process robustness, speed, and cost/value realization.
  • Acquire and prepare multi-source technical data (e.g., MES, LIMS, QMS, ELN, SAP, PI), ensuring quality, lineage, and context for AI development at scale.
  • Engineer domain-aware features and reusable data assets that accelerate experimentation for manufacturing, quality, and supply analytics.
  • Build and validate ML/AI models for use cases such as process monitoring, anomaly/root-cause analysis, yield and cycle-time optimization, and intelligent document processing.
  • Develop GenAI solutions (e.g., RAG for SOPs/reports, semantic search, Q&A assistants over technical data, workflow copilots) using approved enterprise platforms.
  • Operationalize models (MLOps) with reproducible pipelines (data ingestion, training, evaluation, versioning, deployment), working closely with the Data Engineering team, and monitor drift, performance, and data quality for continuous improvement.
  • Collaborate with IT/Engineering to ensure scalable, secure, and supportable AI services aligned to TOPS environments and platform standards.
  • Drive data visualization and decision support with clear narratives and dashboards that communicate model insights to engineers, operators, quality leads, and executives.
  • Champion data integrity and documentation (e.g., model cards, validation records) consistent with TOPS quality expectations and regulated biotech practices.
  • Educate and enable partners through demos, playbooks, and training that raise data/AI literacy and adoption across TOPS functions.
  • Quantify and report value realization (e.g., cost avoidance, OEE improvements, cycle-time reduction, quality signal detection) and maintain a transparent backlog of AI initiatives.
  • Promote “build-first” evaluations: assess internal platforms before third-party tools, and prefer them when they meet requirements with better agility and cost efficiency.
  • Contribute to TOPS AI standards (feature stores, evaluation frameworks, prompt/agent guidelines) and mentor peers to strengthen the data science community of practice.
  • Stay current on AI advances (foundation models, time-series, causal inference, simulation/digital twins) and assess applicability to manufacturing, quality, and supply use cases.