MSD-posted 11 days ago
Full-time • Mid Level
Hybrid • Cambridge, MA
5,001-10,000 employees

AI Research Scientist, Foundation Models

Our Artificial Intelligence and Machine Learning (AI/ML) capabilities are critical accelerators to our mission of inventing new medicines that save and improve lives. Core to the Data, AI, and Genome Sciences (DAGS) function is an AI/ML-first approach to improving target and biomarker discovery, validation, and selection, and to elucidating complex disease mechanisms. As a senior AI scientist, you will be responsible for pre-training and fine-tuning biological foundation models on text, multi-omics, and imaging data; analyzing pre-trained models post hoc; building rigorous benchmarks for evaluating foundation models; and serving foundation models trained in-house. Your work will advance our understanding of complex diseases and support the development of innovative therapeutic strategies. You will be part of a cross-functional team of computational biologists, bioinformaticians, data scientists, software engineers, and machine learning researchers who strive to identify therapeutic targets and biomarkers.

  • Collaborate with cross-functional teams to identify research questions and data requirements and develop appropriate solutions.
  • Develop and train large foundation models for -omics data augmented with text and images.
  • Interpret pre-trained models and critically analyze them post hoc.
  • Rigorously benchmark and evaluate the performance of both in-house and publicly available models.
  • Host and serve in-house state-of-the-art models and make them accessible to scientists across our company.
  • Stay up to date with the latest advancements in machine learning and statistics and apply relevant advancements to improve existing methodologies and models.
  • Publish research findings in relevant conferences and journals and actively contribute to the scientific community through knowledge sharing and collaborations.
  • PhD, MS, or BS in Computer Science, Engineering, Data Science, AI/ML, Bioinformatics, Computational Biology, Genetics & Genomics, Mathematics, Statistics, Physics, Pharmaceutical Science, or related STEM field and 0+ years of full-time experience (with PhD), 4+ years of experience (with MS), or 7+ years of experience (with BS).
  • Experience training large models in multi-node, multi-GPU environments.
  • Experience designing novel architectures for multi-modal foundation models.
  • Deep expertise in post-training foundation models, including parameter-efficient fine-tuning, post-hoc interpretability, and preference optimization.
  • Demonstrated expertise in classical machine learning, statistical models, and in training, evaluating, and debugging models and data at scale.
  • Excellent software design and development skills and strong proficiency in Python.
  • Experience with standard deep learning frameworks, such as PyTorch and its ecosystem, for working with large foundation models.
  • Excellent communication skills and ability to work collaboratively in a multi-disciplinary team.
  • Interest in life sciences problems and disease biology, and willingness to learn from and teach others.
  • Experience with pre-training biological foundation models (transcriptomic, protein language models, DNA transformers, or structure-to-function models) is a strong plus.
  • Familiarity with biological data and previous experience with protein language models and foundation models for omics are a strong plus.
  • Experience training and working with large discrete diffusion models.
  • Experience with reinforcement learning (RL) and using RL for training reasoning models.
  • Relevant publications in scientific journals and experience contributing to research communities, including NeurIPS, ICML, ICLR, etc.
  • medical, dental, vision healthcare and other insurance benefits (for employee and family)
  • retirement benefits, including 401(k)
  • paid holidays
  • vacation
  • compassionate and sick days