2026 Summer Intern - Regev Lab - Molecular Design

Genentech - Daly City, CA
2d - $50 - Onsite

About The Position

The Regev Lab in Genentech Research and Early Development is seeking an exceptional graduate student intern with a demonstrated research track record in developing computational models to represent and design biology at the molecular and genomic level, and the ability to independently execute on innovative ideas. This internship position is on-site in South San Francisco.

Programmable molecular design depends on learning models that can navigate exponentially large sequence spaces, which current experimental screens can only partially sample. This project asks how machine learning models can extract generalizable structure from novel, systematically perturbed experimental datasets and apply it to guide exploration of unseen molecular space.

In this role, you will design and train models that learn the latent “grammar” governing molecular function for our system of interest. Working with a large, novel experimental dataset, you will develop architectures that integrate sequence-level information with geometric and contextual representations. The focus is not merely on improving predictive accuracy, but on uncovering the inductive biases and representational strategies that enable extrapolation: when and why models succeed in new regimes, and when they fail. You will explore representation learning approaches for molecular design, investigate tradeoffs between foundation-scale models and structured task-specific models, and develop insights that inform both model construction and experimental strategy. There is also an opportunity for experimental validation of model-derived hypotheses, grounding model behavior in measurable biological outcomes.

This internship is intended for students who are excited to build models that reveal the structure underpinning biology and who are motivated to contribute work that advances the interface between machine learning and experimental biology.

Requirements

  • Must be currently enrolled in a PhD program.
  • Field of study: Computer Science, Biomedical Engineering/Bioengineering, Chemical Engineering, Biomedical Informatics, Computational Biology, Applied Physics, or a related technical field.
  • Strong foundation in machine learning or a related quantitative field, with experience training and evaluating models on real, noisy datasets.
  • Excellent programming skills in Python and common deep learning frameworks (e.g., PyTorch, TensorFlow, or equivalent).
  • Experience working with structured or high-dimensional data (e.g., sequences, graphs, or other relational representations).
  • Experience conducting rigorous analysis and interpreting model behavior, including the ability to reason about model generalization and extrapolation and to diagnose model limitations and failure modes.
  • Ability to work independently in a fast-paced research environment and clearly communicate technical results in writing and presentations; strong motivation, intellectual curiosity, and eagerness to learn through feedback and iteration.

Nice To Haves

  • Excellent communication, collaboration, and interpersonal skills.
  • Complements our culture and the standards that guide our daily behavior & decisions: Integrity, Courage, and Passion.
  • Experience with representation learning, including learning latent or compositional features from structured data.
  • Experience with sequence- and/or structure-based modeling (e.g., CNNs, transformers, protein language models).
  • Familiarity with combinatorial or design-oriented modeling problems, where generalization beyond observed data is critical.
  • Interest in or experience with molecular or biological data, such as protein sequences or molecular structures.
  • Prior research experience in machine learning, computational biology, or a related quantitative field.
  • Track record of publishing at ML and computational biology conferences (ICLR, ICML, NeurIPS, MLCB, etc.).
  • Interest in connecting modeling insights to experimentally testable hypotheses.

Responsibilities

  • Design and train models that learn the latent “grammar” governing molecular function for our system of interest.
  • Develop architectures that integrate sequence-level information with geometric and contextual representations.
  • Explore representation learning approaches for molecular design.
  • Investigate tradeoffs between foundation-scale models and structured task-specific models.
  • Develop insights that inform both model construction and experimental strategy.
  • Experimentally validate model-derived hypotheses where the opportunity arises, grounding model behavior in measurable biological outcomes.

Benefits

  • Intensive 12-week, full-time (40 hours per week) paid internship.
  • Program start dates are in May/June 2026.
  • A stipend, based on location, will be provided to help alleviate costs associated with the internship.
  • Ownership of challenging and impactful business-critical projects.
  • Work with some of the most talented people in the biotechnology industry.
  • Paid holiday time-off benefits.

What This Job Offers

Job Type: Full-time

Career Level: Intern

Education Level: Ph.D. or professional degree
