Principal Data Scientist

Veracyte
San Diego, CA
Hybrid

About The Position

We are seeking a Principal Data Scientist to lead the research and development of multimodal AI (MMAI) models and workflows that integrate genomic, transcriptomic, imaging, and clinical data for oncology applications. This role is critical to advancing the science and technology of MMAI, driving innovation in predictive modeling to improve patient outcomes, and supporting product strategy through rigorous, hypothesis-driven research. The Principal Data Scientist will report to the Senior Director of Computational Biology and collaborate closely with cross-functional teams including Discovery, Bioinformatics and Data Science, Cloud Ops/Engineering, Pathology, Medical, and Clinical Affairs. Based in the R&D division, this role supports our mission to discover, develop, and deliver the best diagnostic, prognostic, and predictive tests to transform cancer care for patients all over the world. The position can be remote (within the USA or Canada), on-site (South San Francisco or San Diego), or hybrid.

Requirements

  • Ph.D. in bioinformatics, computational biology, genomics, biostatistics, computer science, or a related field applying quantitative computational methodologies to biological/clinical problems.
  • Minimum 8 years of relevant experience, with at least 5 years in an industry setting (biotech, diagnostics, or healthcare preferred).
  • Demonstrated expertise in multimodal data integration, machine learning, and model development for NGS-based clinical diagnostics.
  • Strong programming skills in Python, R, and SQL.
  • Strong experience with cloud computing environments (AWS preferred).
  • Deep knowledge of genomics, transcriptomics, digital pathology, and clinical data analysis.
  • Proven track record of technical leadership, project ownership, and successful delivery of high-impact R&D projects.
  • Excellent communication skills and ability to mentor and lead interdisciplinary teams.
  • Strong publication record in peer-reviewed journals, including first and senior authorship.

Nice To Haves

  • Experience with advanced ML architectures (transformers, multimodal fusion, attention mechanisms).
  • Familiarity with regulatory requirements (HIPAA, GDPR) and data governance in clinical research.
  • Experience with medical imaging analysis and cytopathology.
  • Knowledge of cancer biology, immunology, and clinical trial design.

Responsibilities

  • Lead research into novel MMAI models while closely collaborating with other machine learning experts across the computational team on strategy, study design, cohort selection, data acquisition, and data generation.
  • Architect, train, and validate MMAI models integrating modalities including genomics, transcriptomics, whole-slide imaging (e.g., H&E tumor tissue slides), and clinical features for cancer prognosis, risk stratification, diagnosis, and therapy selection.
  • Drive proof-of-concept and feasibility projects from definition through model development, benchmarking, interpretation, and dissemination of results.
  • Design and implement pipelines for ingesting, harmonizing, and integrating diverse data modalities (including whole-slide images, RNA-seq, WGS, and clinical metadata).
  • Work closely with wet lab scientists, bioinformatics/data science teams, medical/clinical/pathology teams, software/data/cloud engineers, and other cross-functional teams to ensure models are biologically interpretable and clinically applicable.
  • Prepare and present findings to technical and non-technical audiences, including conference abstracts and presentations, scientific publications, and internal reports.