Bioinformatics Scientist

Revalia Bio•New Haven, CT

1d•$150,000 - $180,000•Remote

About The Position

Revalia is pioneering Human Data Trials: a new paradigm for accelerating translational medicine by linking donated human organs to cutting-edge research. Our platform provides biotech, pharma, and medtech innovators with unparalleled access to human biology at scale, reducing reliance on animal models and speeding the path to patients.

We are seeking a Data Scientist with a strong foundation in multi-omics data processing and a broader command of computational biology and bioinformatics. The ideal candidate brings both the versatility to integrate and analyse multi-omics datasets, spanning genomics, transcriptomics, and proteomics, and the technical depth to extract meaningful signal from complex, high-dimensional biological data.

In this role, you will design and maintain robust analytical pipelines that underpin our data platform, supporting the translation of large-scale biological datasets into actionable scientific insight. You will work at the intersection of research and engineering, contributing to infrastructure that scales with our programs and delivers analytical capabilities that directly shape our scientific and operational priorities.

We are looking for someone who combines advanced programming proficiency with a solid grounding in statistical and mathematical modelling, and who brings genuine intellectual curiosity to the problems they work on. The candidate must have an interest in (or prior exposure to) organ perfusion biology and ex vivo research models.

Requirements

Proficiency in analysis of at least two omics data types (e.g., bulk and single-cell RNA-seq, proteomics, metabolomics) with demonstrated experience integrating across modalities
Familiarity with standard bioinformatics toolkits (e.g., Seurat, Scanpy, DESeq2, limma, or equivalent) and awareness of when to use them
Experience building and maintaining reproducible pipelines (Nextflow, Snakemake, or similar)
Hands-on experience with multi-omics data integration frameworks (e.g., MOFA+, DIABLO, weighted correlation network analysis (WGCNA)) and an understanding of their assumptions and limitations
Familiarity with pathway enrichment and network-based analysis approaches (e.g., GSEA, ORA, STRING, or similar) to contextualize multi-omics findings biologically
Comfort working with reference databases and annotation resources relevant to the omics types in scope (e.g., Ensembl, UniProt, MSigDB, KEGG)
Strong proficiency in Python and/or R for data science and omics analysis
Solid knowledge of version control (Git and GitHub), including collaborative workflows
Familiarity with Linux environments, high-performance computing, and cloud-based workflows (AWS)
Master’s or Ph.D. in Computational Biology, Bioinformatics, Bioengineering, Computer Science, or a related field and 3+ years of experience in data science with knowledge of the biomedical industry
Also accepted: Master’s or Ph.D. in Electrical Engineering, Statistics, Applied Mathematics, or a related field and 5+ years of experience in data science with knowledge of the biomedical industry

Nice To Haves

Experience with dimensionality reduction and clustering methods as applied to high-dimensional biological data (e.g., PCA, UMAP) and the ability to critically interpret outputs
Experience with ETL processes and high-performance data pipelines, including relevant file and table formats (e.g., Parquet, Delta Lake, Iceberg)
Proficiency in processing and analyzing imaging data, including microscopy and/or CT imaging
Experience with imaging toolkits such as CellProfiler, FIJI/ImageJ, or Python-based equivalents (e.g., scikit-image, napari)

Responsibilities

Work with our Human Data Trials Operations, Innovation, and Engineering teams to answer scientific questions using available data and models
Design and execute pipelines for next-generation sequencing (NGS) data processing
Integrate omics data with other biological data types (e.g., clinical metadata, imaging, sensor data) for multi-modal analysis
Perform statistical and machine learning-based analyses to detect patterns, quantify phenotypes, and support hypothesis-driven research
Collaborate cross-functionally with scientists, software developers, and domain experts to define project goals and deliver actionable insights
Maintain and optimize analysis pipelines for scalability and reproducibility using version control and containerized environments (e.g., Docker)
Present findings through reports, visualizations, and scientific publications or presentations
Stay current with advances in omics analysis, AI/ML techniques, and biological imaging technologies