Principal Data Scientist – R&D DSDH - Therapeutics Discovery (TD)

Johnson & Johnson Innovative Medicine, Ambler, PA

About The Position

Johnson & Johnson Innovative Medicine is seeking a highly skilled R&D Data Scientist to support our Therapeutics Discovery (TD) organization. This role sits within the R&D Data Science group and will focus on building and applying advanced Machine Learning (ML) and Data Engineering solutions that accelerate scientific innovation across the drug discovery lifecycle. The ideal candidate brings strong computational expertise and a solid scientific understanding of early R&D, including areas such as Target Identification & Assessment, Lead Identification & Optimization, Mechanistic / Mode of Action studies, and Lab Automation & high‑throughput experimentation. The Data Scientist will collaborate closely with discovery scientists, automation engineers, computational biologists, and platform technology teams to transform complex, multimodal R&D data into actionable insights that drive therapeutic innovation.

Requirements

  • Master’s or Ph.D. in Computational Biology, Bioinformatics, Data Science, Chemistry, Chemical Biology, Biomedical Engineering, Computer Science, or related field.
  • Experience applying ML/AI in scientific domains (drug discovery, biology, chemistry, systems biology, imaging, or related areas).
  • Strong programming skills in Python (preferred) and experience with scientific/ML libraries (PyTorch, TensorFlow, scikit‑learn, RDKit, etc.).
  • Practical experience with data engineering, including data modeling, workflow orchestration, ETL/ELT pipelines, and cloud computing environments (AWS, GCP, or Azure).
  • Ability to work directly with experimental scientists to solve real R&D challenges.

Nice To Haves

  • Experience in pharma or biotech discovery, including target assessment, phenotypic screening, medicinal chemistry workflows, and lab automation.
  • Familiarity with omics, high‑content imaging, chemical structure data, or biological assay data.
  • Knowledge of data standards (e.g., FAIR, ontologies, controlled vocabularies) and working within regulated or quality‑governed environments.
  • Strong communication skills and ability to thrive in a matrixed, multidisciplinary environment.

Responsibilities

Machine Learning & Modeling

  • Develop ML/AI models that support discovery workflows, including target prioritization, multi‑omics integration, and mechanistic inference.
  • Apply modern ML approaches (e.g., deep learning, graph learning, foundation models, generative models) to chemical, biological, imaging, and assay datasets.
  • Build and optimize models for real‑world R&D use cases, ensuring scalability, interpretability, and scientific rigor.

Data Engineering & Pipeline Development

  • Design, build, and maintain robust data pipelines that curate, standardize, and integrate diverse R&D datasets (chemical, biological, multi‑omics, imaging, biophysical, automation logs, etc.).
  • Partner with platform teams to implement best‑practice MLOps/DevOps workflows and deploy ML models into production R&D environments.
  • Develop tooling that accelerates dataset preparation, feature engineering, and model lifecycle management across TD.

Scientific Partnership

  • Work hand‑in‑hand with TD scientists to understand key biological and chemical questions and shape computational strategy accordingly.
  • Translate sparse, heterogeneous experimental datasets into insights that guide decision‑making in hit discovery, mechanism studies, perturbation experiments, and compound optimization.
  • Participate in design, interpretation, and iterative refinement of discovery experiments.

Innovation & Collaboration

  • Partner with cross-functional teams in R&D Data Science, IT, platform engineering, and therapeutic area groups to drive AI/ML adoption.
  • Contribute to evaluating new analytical methods, automation technologies, and data platforms supporting next‑generation discovery science.
  • Champion high standards for data quality, documentation, governance, and reproducibility.

Benefits

  • Vacation – 120 hours per calendar year
  • Sick Time – 40 hours per calendar year; for employees who reside in the State of Colorado – 48 hours per calendar year; for employees who reside in the State of Washington – 56 hours per calendar year
  • Holiday Pay, including Floating Holidays – 13 days per calendar year
  • Work, Personal and Family Time – up to 40 hours per calendar year
  • Parental Leave – 480 hours within one year of the birth/adoption/foster care of a child
  • Bereavement Leave – 240 hours for an immediate family member; 40 hours for an extended family member per calendar year
  • Caregiver Leave – 80 hours (10 days) in a 52-week rolling period
  • Volunteer Leave – 32 hours per calendar year
  • Military Spouse Time-Off – 80 hours per calendar year

What This Job Offers

Job Type

Full-time

Career Level

Principal

Education Level

Ph.D. or professional degree

Number of Employees

5,001-10,000 employees
