AI Engineer - Vision Foundation Model Pretraining

Howard Hughes Medical Institute
Ashburn, VA
Posted 17h ago · $96,326 - $299,737 · Onsite

About The Position

AI@HHMI: HHMI is investing $500 million over the next 10 years to support AI-driven projects and to embed AI systems throughout every stage of the scientific process in labs across HHMI. The Foundational Microscopy Image Analysis (MIA) project sits at the heart of AI@HHMI. Our ambition is big: to create one of the world’s most comprehensive, multimodal 3D/4D microscopy datasets and use it to power a vision foundation model capable of accelerating discovery across the life sciences.

We are seeking a highly skilled AI Research Engineer to join our team and advance our AI-driven scientific initiatives. You will develop and deploy a self-supervised pre-training pipeline for learning from a large-scale microscopy dataset. You will work with expert computational scientists, data engineers, and experimentalists to train models that learn foundational embeddings usable across a wide range of microscopy modalities and applications. In collaboration with other engineers and scientists, you will apply these models to scalable vision tasks: instance segmentation, tracking, classification, and more. You will use probabilistic models to produce uncertainty-aware predictions across scales.

This role requires deep knowledge of the underlying models and practical implementation skills to maximize biological impact. You will lead rigorous model evaluations, implement novel architectures, and ensure all work meets the highest standards of reproducible open science. Success in this role requires close collaboration with our microscopy experts, cellular biologists, neuroscientists, and computer scientists to ensure models can be deployed in real-world, large-data scenarios. Strong programming skills in Python, PyTorch, and/or JAX are required, along with the ability to reason about neural network behavior from first principles. The role also requires knowledge of microscopy data formats and tools such as Zarr and Neuroglancer.
We seek candidates who can think critically about model design, understand how architectural choices and regularization affect model behavior, and design rigorous experiments to evaluate models. Domain expertise in microscopy image analysis is not necessary, but will be highly valued. Because this is a team project, we value a clean shared codebase and git-based collaborative workflows. Familiarity with state-of-the-art vision frameworks such as DINOv3, SAM, Cellpose, or Vision Transformers is required. We are looking for candidates with experience in ML model deployment, workflow orchestration, and high-throughput data processing, as well as experience working with large biological datasets in scalable GPU-based computing environments.

Requirements

  • Master's or PhD degree in Computer Science, Applied Mathematics, Computational Neuroscience, or a related field—or an equivalent combination of education and relevant experience.
  • 3+ years of experience training and evaluating deep learning architectures such as Transformers or U-Nets, particularly on image or point cloud data.
  • Strong programming skills in Python with PyTorch and/or JAX.
  • Familiarity with computational tools for microscopy and connectomics data (Cellpose, CAVE, Flood-Filling Networks, Neuroglancer, Zarr).
  • Familiarity with state-of-the-art (self-supervised) computer vision algorithms (e.g., DINO, Masked Autoencoders, SAM).
  • Experience with ML model deployment, workflow orchestration, and high-throughput data processing and model training.
  • Keen interest in working in a truly interdisciplinary environment and learning about cellular/molecular biology (e.g., transcriptomics) and neuroscience.

Nice To Haves

  • Skills in JavaScript are a plus.
  • Domain expertise in microscopy image analysis.

Responsibilities

  • Research and explore the model design space for vision foundation models of multi-modal biological microscopy data.
  • Build a self-supervised pre-training pipeline on a large-scale foundational dataset of multi-modal biological microscopy data.
  • Design and execute rigorous experiments to evaluate model performance on a wide distribution of microscopy images and model architectures.
  • Collaborate with interdisciplinary teams, mentor junior engineers as needed, and direct (or help direct) the work of others to meet project goals, advising stakeholders on data strategies and best practices.
  • Deploy models both at Janelia and in the broader scientific community and ensure downstream usability.

Benefits

  • A competitive compensation package, with comprehensive health and welfare benefits.
  • A supportive team environment that promotes collaboration and knowledge sharing.
  • Access to a world-class computational infrastructure and high-quality datasets.
  • The opportunity to engage with world-class researchers, software engineers, and AI/ML experts, contribute to impactful science, and be part of a dynamic community committed to advancing humanity’s understanding of fundamental scientific questions.
  • Amenities that enhance work-life balance, such as on-site childcare, free gyms, available on-campus housing, social and dining spaces, and convenient shuttle bus service to Janelia from the Washington D.C. metro area.
  • Opportunity to partner with frontier AI labs on scientific applications of AI (see https://www.anthropic.com/news/anthropic-partners-with-allen-institute-and-howard-hughes-medical-institute ).