Senior R&D Data Scientist

PDI
Woodcliff Lake, NJ
$140,000 - $150,000
Hybrid

About The Position

PDI is seeking a Senior Data Scientist to serve as the technical foundation of its R&D digital transformation. This individual will design and operate the systems, pipelines, and analytical capabilities that collect, structure, process, and use experimental data from R&D labs, moving PDI's R&D from siloed, manual data environments to a connected, AI-ready platform ecosystem. The goal is faster, better-informed formulation decisions, shorter development cycles, less manual effort, and compliance by design across EPA, FDA OTC/NDA, Medical Device, and Cosmetic regulatory domains. The role bridges lab instrumentation, data science and engineering, and machine learning workflows to create integrated, compliant, and scalable digital capabilities. The position partners closely with R&D scientists, Quality, IT, and external vendors to ensure that instrumentation is connected, data flows are automated, ML workflows are production-ready, and digital tools support execution across the R&D lifecycle.

Requirements

  • PhD or MS in a quantitative field strongly preferred, such as Computational Chemistry/Physics, Data Science, Bioinformatics, Applied Mathematics, Statistics, Chemistry, or Chemical Engineering
  • Demonstrated strong grounding in statistical analysis, modeling, and data interpretation (e.g., DOE, Arrhenius modeling, chemometrics)
  • Ability to evaluate data quality, transformation logic, and analytical assumptions
  • Familiarity with common lab data sources (e.g., stability, analytical testing, formulation data)
  • Working knowledge of data tools (e.g., Python, R, SQL, JMP) for analysis and validation
  • Understanding of multi-regulatory environments, particularly where EPA, FDA drug, device, and cosmetic frameworks intersect or diverge
  • 6+ years of combined experience, including hands-on work in at least two of the following capability areas:
  • Hands-on experience generating advanced insights and predictive capability from statistical models, including (but not limited to) multivariate linear regression, clustering algorithms, decision trees, logistic regression, Principal Component Analysis (PCA), Partial Least Squares (PLS) modeling, time series, survival analysis, machine learning (supervised and unsupervised), Bayesian methods, and neural networks
  • Using SQL and Python or R for data manipulation, aggregation, and optimization
  • Strong hands-on experience with laboratory instrumentation and specialized scientific software (Empower, LabWare, OMNIC, etc.)
  • Comfort operating at the interface of scientists, informatics teams, and quality stakeholders
  • Strong written and verbal communication skills; capable of presenting complex concepts, digital strategies, and performance insights to technical and non-technical audiences
  • Ability to structure complex scientific workflows into digital systems

Nice To Haves

  • Working knowledge of LIMS or ELNs
  • Knowledge of regulatory frameworks (21 CFR Part 11, audit trails, data integrity)
  • Exposure to AI/ML concepts, tools, and models
  • AI/ML platform engineering or ML workflow deployment
  • Experience in Infection Prevention, Interventional Care, or regulated product development processes

Responsibilities

  • Define and enforce scientifically meaningful data standards across lab workflows, including raw, processed, and reduced data
  • Ensure data transformations preserve scientific validity and traceability from raw instrument output through to interpreted results
  • Partner with scientists to validate analytical assumptions, calculations, and interpretation logic
  • Establish minimum viable metadata standards to enable reuse and traceability without overengineering
  • Define and maintain data dictionaries and controlled vocabularies for key experimental parameters, in support of ALCOA+ and 21 CFR Part 11 compliance
  • Lead structured assessment of historical R&D data, including formulation, stability, analytical, and process development records, to identify high-value, recoverable knowledge assets
  • Evaluate AI‑assisted extraction methods (e.g., semantic search, pattern mining) with a focus on scientific validity
  • Enable discoverability of prior experiments and learnings to reduce redundant work and speed decision‑making
  • Quantify redundancy and rework attributable to inaccessible historical data; translate findings into a prioritized data recovery roadmap
  • Design data science solutions by gathering and translating business requirements
  • Translate high‑value scientific decision points into analytical and statistical models
  • Apply appropriate mathematical, statistical, or experimental design techniques to evaluate hypotheses and trends (e.g., DOE, Arrhenius modeling, chemometrics)
  • Partner with stakeholders to ensure outputs are explainable, interpretable, and trusted
  • Support development of predictive or comparative models where scientifically justified
  • Develop and maintain ML-ready datasets and reusable feature layers that support R&D modeling, advanced analytics, and automation
  • Apply Design of Experiments (DOE) principles to help R&D teams structure studies that generate AI-ready, analyzable datasets from the outset
  • Work alongside scientists, IT, and external vendors to integrate lab instrumentation (e.g., chromatography, spectroscopy, automated systems) with digital data environments by defining structured data capture schemas, metadata requirements, and audit trail specifications at the instrument level
  • Define structured data capture, metadata schemas, and workflow models at the instrument level to ensure compliance and traceability
  • Collaborate with internal lab teams to map experimental workflows and translate them into digital processes
  • Ensure analytical processes and data interpretations align with regulated R&D expectations (ALCOA+, data integrity principles)
  • Support inspection readiness by ensuring traceability between data, analysis, and decisions
  • Help define governance models for analytics and future AI use within GxP and non‑GxP contexts
  • Act as the primary technical interface between R&D and IT for data architecture decisions, ensuring R&D data needs are accurately represented in enterprise platform and governance discussions
  • Train scientists on digital tools, data capture best practices, and ML-enabled analysis workflows
  • Translate scientific, regulatory, and business requirements into practical, scalable technical solutions
  • Improve data literacy and analytical confidence within R&D teams
  • Serve as a trusted scientific counterpart to informatics and external vendors
  • Help shift the organization from intuition‑driven to evidence‑driven decision‑making

Benefits

  • Medical, behavioral & prescription drug coverage
  • Health Savings Account (HSA)
  • Dental
  • Vision
  • 401(k) savings plan with company match and profit sharing
  • Basic and supplemental Life and AD&D insurance
  • Flexible Spending Accounts (FSAs)
  • Short & long-term disability
  • Employee Assistance Program (EAP)
  • Health Advocacy Program
  • Legal services
  • Critical illness
  • Hospital indemnity
  • Accident coverage
  • ID theft and fraud protection
  • Pet insurance
  • Employee discounts
  • Paid time off programs
  • Sick & safe leave
  • Vacation
  • Company & floating holidays
  • Paid parental leave
  • Summer hours
  • Flex place/flex time options