About The Position

This role is part of a project at the intersection of protein engineering, evolution, and machine learning, within HHMI’s AI for Science Initiative (ai.hhmi.org). Our goal is to integrate evolutionary biology principles into protein language models. This role will contribute to peer-reviewed publications and advance the state of the art. The project will have two homes: one at HHMI’s Janelia Research Campus and the other in the Matsen and Bloom labs at the Fred Hutchinson Cancer Center.

We are seeking a highly skilled AI Engineer to join our team and play a crucial role in advancing our AI-driven scientific initiatives. We have an existing codebase (https://github.com/matsengrp/netam), which we are currently rewriting to be more flexible and modular. In this position, you will work with us to develop new models in that framework and maintain computational infrastructure. This role requires deep knowledge of the underlying models as well as practical implementation skills to have the maximum biological impact. You will lead comparative studies, implement novel architectures, and ensure all work meets the highest standards of reproducible, open science.

In collaboration with the Bloom Lab, we will have the opportunity to design experiments to maximally inform and probe the boundaries of the resulting models. A data engineer will be on staff to assist with data curation, and you will interact with other contributors in the Matsen lab who are developing and refining various aspects of these models. You will be embedded in a team of about 30 software & AI engineers working on related projects at HHMI Janelia and the AI initiative. HHMI has a commitment to a robust open-source software ecosystem; see https://ossi.janelia.org/ for more details.

Requirements

  • Bachelor's degree in Computer Science, Data Science, Statistics, Applied Mathematics, Physics, or a related field. An equivalent combination of education and relevant experience will be considered.
  • A background in mathematics, statistics, or physics is strongly preferred.
  • Demonstrated record of impactful research through publications or preprints in a quantitative field is required.
  • Strong experimental design skills, including rigorous model comparisons, ablation studies, statistical significance testing, and commitment to reproducible research and open science.
  • Strong programming skills in Python and PyTorch, with the ability to reason about neural network behavior from first principles.
  • Strong analytical thinking about how architectural choices, regularization, and training procedures affect model behavior.
  • Strong experimental methodology for model evaluation, including proper train/validation/test protocols, understanding of data leakage, and statistical rigor.
  • Excellent technical documentation and communication skills, with the ability to explain model behavior and experimental design to both technical and non-technical audiences.
  • A detail-oriented, creative, and organized team player with a collaborative mindset.

Nice To Haves

  • Experience with biological sequence modeling is a plus.
  • Experience with workflow orchestration and data processing is valuable but secondary to scientific and analytical depth.

Responsibilities

  • Investigate and implement alternative transformer architectures for biological sequence modeling using PyTorch.
  • Expand models to be multimodal, using a diversity of inputs and outputs.
  • Design and execute rigorous comparative experiments between model architectures.
  • Contribute to scientific publications and present findings at conferences.
  • Apply software engineering best practices, ensuring a maintainable, extensible, and well-documented codebase that allows seamless reproduction and extension of research results.
  • Stay up to date with the latest advancements in AI research. Leverage agentic coding and develop practices that enable safe application of this technology.
  • Collaborate with interdisciplinary teams at HHMI Janelia and Fred Hutchinson Cancer Center to ensure seamless integration of computational and experimental aspects of the project. Mentor junior engineers, direct or assist in directing the work of others to meet project goals, and advise stakeholders on data strategies and best practices.

Benefits

  • A competitive compensation package, with comprehensive health and welfare benefits.
  • A supportive team environment that promotes collaboration and knowledge sharing.
  • The opportunity to engage with world-class researchers, software engineers and AI/ML experts, contribute to impactful science, and be part of a dynamic community committed to advancing humanity’s understanding of fundamental scientific questions.
  • Amenities that enhance work-life balance such as on-site childcare, free gyms, available on-campus housing, social and dining spaces, and convenient shuttle bus service to Janelia from the Washington D.C. metro area.