General DiffUSE Job Application

Astera Institute, Emeryville, CA
$100,000 - $300,000, Onsite

About The Position

DiffUSE is a project at the Astera Institute focused on developing open infrastructure for studying protein dynamics directly from experimental data. The project operates at the intersection of structural biology (crystallography, cryo-EM), modern machine learning, computational biophysics, and open scientific tooling. The team comprises computational biologists, ML researchers, software engineers, and program staff, collaborating across Astera, Radial, and partner institutions.

DiffUSE is expanding and is seeking individuals excited about contributing to its mission, even if a specific role isn't currently posted. Submissions are reviewed on a rolling basis, and candidates will be contacted if a suitable opportunity arises, now or in the future.

The project is actively building around several key areas:

  • Computational and data science: diffraction data processing, structural data pipelines, multiconformer and heterogeneity analysis, data standards (mmCIF), macromolecular ensemble metrics, and machine learning research for macromolecules and biophysics (representation learning, ML on raw experimental data, 3D vision, geometric deep learning)
  • Dataset generation and open release: designing and running campaigns for large structural datasets, partnering with external collaborators on data standardization, and bridging experimental facilities, data producers, and the open-science community
  • Software and infrastructure engineering: scientific data infrastructure, pipelines, and tooling, plus open-source release engineering, reproducibility, and developer experience
  • Program and operations: program management, scientific coordination, communications, and open-science publishing and community building

Requirements

  • Familiarity with protein biophysics and the experimental methods that generate structural data
  • High agency: identify what needs doing and move it forward without waiting for direction
  • Ability to drive projects and people, including collaborators outside your reporting line
  • Comfortable owning a problem end-to-end and unblocking collaborators
  • Strong commitment to open science and public-good infrastructure
  • Comfort working at the boundary between disciplines
  • Bias toward shipping, iteration, and rapid feedback
  • Clear written and verbal communication
  • Track record of independent work inside collaborative teams
  • Experience in computational and data science
  • Experience in diffraction data processing and structural data pipelines
  • Experience in multiconformer and heterogeneity analysis from large datasets
  • Experience with data standards work (mmCIF and related)
  • Experience with macromolecular ensemble metrics
  • Experience in machine learning research for macromolecules and biophysics
  • Experience in representation learning for protein dynamics
  • Experience with ML on raw experimental data rather than processed structures
  • Experience in 3D vision and geometric deep learning
  • Experience in designing and running campaigns to generate large structural datasets
  • Experience in partnering with external collaborators to open and standardize existing datasets
  • Experience in bridging experimental facilities, data producers, and the open-science community
  • Experience in scientific data infrastructure, pipelines, and tooling
  • Experience in open-source release engineering, reproducibility, and developer experience
  • Experience in program management, scientific coordination, or communications
  • Experience in open-science publishing and community building

Nice To Haves

  • Backgrounds in 3D vision and geometric deep learning are especially welcome

Responsibilities

  • Identify what needs doing and move it forward without waiting for direction
  • Drive projects and people, including collaborators outside your reporting line
  • Own a problem end-to-end and unblock collaborators
  • Ship, iterate, and gather rapid feedback
  • Contribute to open science and public-good infrastructure
  • Work at the boundary between disciplines
  • Generate large structural datasets
  • Partner with external collaborators to open and standardize existing datasets
  • Bridge experimental facilities, data producers, and the open-science community
  • Develop scientific data infrastructure, pipelines, and tooling
  • Engage in open-source release engineering, reproducibility, and developer experience
  • Manage programs and coordinate scientific efforts
  • Handle communications related to the project
  • Engage in open-science publishing and community building