Scientist - Structural Informatics Infrastructure

Astera Institute
Emeryville, CA
Onsite

About The Position

The diffUSE Project is seeking a Scientist to join a multidisciplinary team and help build the infrastructure needed to host and distribute dynamic structural biology data. The diffUSE Project is an ambitious initiative designed to advance our understanding of protein dynamics by building the experimental methods, computational models, and global infrastructure needed to capture molecular motion at scale. Our goal is to establish dynamic structural biology as a foundational pillar of modern science, as transformative and indispensable as static structures have been.

This role will lead the development of standards and platforms that enable the community to deposit, validate, search, and leverage dynamic structural information at scale. You will help architect the foundational infrastructure, including metrics and encoding, to make dynamic structural biology data as accessible, trustworthy, and impactful as the Protein Data Bank has been for static structures. You will work at the intersection of structural biology, data science, and community standards development to build an ensemble-aware, living database that evolves with algorithmic advances while maintaining scientific rigor and reproducibility.

This is a full-time, in-person position within the diffUSE Project at Radial, a division of the Astera Institute.

Requirements

  • Ph.D. in structural biology, biophysics, computational biology, or a related field
  • Demonstrated expertise in structural biology methods
  • Deep understanding of structural heterogeneity and dynamics in biomolecular systems
  • Experience with data standards, metadata frameworks, or scientific database development
  • Strong collaborative skills and ability to build consensus across diverse scientific communities

Nice To Haves

  • Experience with PDB, EMDB, BMRB, or other structural biology databases
  • Knowledge of validation methods for experimental and computational structural data
  • Familiarity with machine learning workflows and ML-ready data formats
  • Background in model uncertainty quantification or ensemble refinement methods
  • Understanding of software development practices and data engineering principles
  • Track record of working at the interface of methods development and infrastructure

Responsibilities

  • Help lead the conceptual design of an ensemble-aware database architecture that balances flexibility, scalability, and scientific integrity
  • Identify and prioritize technical challenges, from data representation to validation frameworks to query interfaces
  • Direct the design and implementation of infrastructure that enables continuous model improvement as algorithms advance, while preserving provenance, trust, and reproducibility
  • Oversee development of ensemble-aware validation frameworks that assess fit-to-data, physical realism, and uncertainty across diverse structural representations
  • Guide the creation of data deposition, search, and retrieval tools that allow users to interrogate and interpret structural heterogeneity at scale
  • Help coordinate with stakeholders to ensure interoperability and adoption
  • Work with software developers, data engineers, and user experience designers to translate scientific requirements into robust technical solutions