Senior/Staff Scientist, Data Science

Glyphic Biotechnologies, Berkeley, CA
$168,000 - $238,000, Hybrid

About The Position

At Glyphic Biotechnologies, we plan to create the protein revolution that scientists and researchers have been waiting for. We are developing a massively parallel, single-molecule proteome sequencing platform that will transform life science discovery and usher in a new era of insights into human biology and disease. To date, we have raised >$50M from venture partners and non-dilutive grant funding to achieve our vision of next-generation proteome sequencing.

Glyphic is seeking a highly motivated and experienced Senior/Staff Data Scientist to help advance this platform, which has the potential to transform how we understand biology and develop new medicines. We're looking for someone who's excited about solving complex, real-world problems with cutting-edge technology. You'll work directly with our CTO and a collaborative team of scientists, engineers, and bioinformaticians who are passionate about pushing the boundaries of what's possible.

This is a hybrid role, with the expectation that you spend up to ~20% of your time (on average) on-site with the team in Berkeley, CA, in service of a more complete understanding of Glyphic's technology and calibration with the on-site research team. The role will also require some flexibility for additional collaboration as projects require.

Requirements

  • PhD in Computer Science, Bioinformatics, Computational Biology, Biostatistics, or a related field, with 4+ (Senior) or 6+ (Staff) years of hands-on experience.
  • Proven ability to model and interpret high-dimensional datasets with numerous interacting variables, uncovering statistically robust patterns and causal relationships.
  • Competency in chemistry data science (e.g., interpreting LCMS data, utilizing deconvolution tools, understanding surface chemistry and molecule-target interactions).
  • Competency in next generation sequencing, including familiarity with multi-omics, error modeling, and basecalling.
  • Expertise in Python and/or R for biostatistical analysis, including data wrangling, statistical modeling, and visualization of high-dimensional experimental results.
  • Experience designing ML models for experimental data and deploying pipelines (Snakemake, Nextflow).
  • Familiarity with ML frameworks (PyTorch, TensorFlow) and data science libraries (pandas, numpy, scipy).
  • Experience building automated data pipelines and infrastructure for scalable analysis (cloud, Docker/Kubernetes).
  • Experience with cloud platforms (AWS, GCP, or Azure) and containerization tools (Docker, Kubernetes).
  • Proficiency with data visualization tools (matplotlib, seaborn, plotly) and Jupyter notebooks.
  • Familiarity with version control (git) and pipeline workflow systems (Snakemake, Nextflow, etc.).

Nice To Haves

  • Ability to work in performant languages (C++, Rust, Julia, or CUDA).
  • Ability to develop solutions that optimize the utilization of large-scale data storage, cloud processing infrastructure, and distributed computing.
  • Direct proteomics experience (mass spectrometry, multiplex assays, etc.).
  • Deep learning experience with time-series data, signal processing, or sequence modeling.
  • Ability to build and deploy scalable ML pipelines using PyTorch/TensorFlow for real-time protein sequence analysis.
  • Experience with MLOps tools and practices for model deployment and monitoring.
  • Experience building commercially successful life science tools that other scientists actually use and love.
  • Previous startup or fast-paced industry (e.g., skunkworks) experience.

Responsibilities

  • Data Analysis and Insight Generation:
      • Design and implement novel algorithms to analyze proteomics data that no one has ever seen before.
      • Develop machine learning models that can extract meaningful insights from complex, noisy biological signals.
      • Develop and optimize algorithms for analyzing high-dimensional chemistry and NGS data, including single-cell, spatial, and LCMS data outputs.
      • Build models that reveal how parameters and molecular interfaces drive outcomes, including surface interactions and molecule-target binding.
      • Design and execute biostatistical analyses using Python and/or R to uncover significant trends, model experimental outcomes, and inform data-driven decision-making.
      • Apply machine learning to guide experiment design, identify key parameters, and optimize workflows for efficiency and reproducibility.
      • Develop clear, insightful visualizations that make complex, high-dimensional results understandable and actionable for scientists and stakeholders.
      • Help define metrics and visualizations that clarify high-dimensional relationships for scientists and stakeholders.
      • Partner with wet lab, hardware, and software teams to translate experimental goals into computational strategies.
  • Pipelines and Automation:
      • Create ETL pipelines that clean, normalize, and integrate diverse datasets (sequencing reads, LCMS spectra, metadata) into analysis-ready formats.
      • Combine off-the-shelf pipelines (basecalling, variant calling, deconvolution) with custom scripts to deliver end-to-end solutions.
      • Continuously improve throughput and data quality by automating QC steps and integrating feedback from experiments.
      • Establish best practices for code quality, testing, and deployment that will scale with our growing team.

Benefits

  • Employee Stock Option Plan
  • 100% Health Plan Coverage for Employees & Dependents (Medical, Dental, & Vision)
  • Employer Retirement Contributions to 401(k)
  • Generous Paid Time Off
  • Paid Maternity and Paternity Leave
  • Health & Wellbeing Program
  • Office Snacks and Beverages
  • Regular Team Bonding Activities

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Education Level: Ph.D. or professional degree
Number of Employees: 11-50 employees
