Senior Data Engineer - (Durham, NC or Menlo Park, CA) - #4571

Grail•Durham, NC

1d•Hybrid

About The Position

Our mission is to detect cancer early, when it can be cured. We are working to change the trajectory of cancer mortality and bring stakeholders together to adopt innovative, safe, and effective technologies that can transform cancer care. We are a healthcare company, pioneering new technologies to advance early cancer detection. We have built a multi-disciplinary organization of scientists, engineers, and physicians, and we are using the power of next-generation sequencing (NGS), population-scale clinical studies, and state-of-the-art computer science and data science to overcome one of medicine’s greatest challenges. GRAIL is headquartered in the Bay Area of California, with locations in Washington, D.C., North Carolina, and the United Kingdom. It is supported by leading global investors and pharmaceutical, technology, and healthcare companies. For more information, please visit grail.com.
The Data Engineer will partner with scientists and statisticians to support efficient and accurate capture, transfer, and aggregation of sample information and analysis results through GRAIL’s analysis systems. As such, the Data Engineer will become proficient in GRAIL’s system architecture, primarily GRAIL’s LIMS for laboratory data; Electronic Data Capture (EDC) for clinical study data; the Bioinformatics Pipeline for sequence analysis and cancer classification data; and TidyData, the system that aggregates, packages, and serves datasets created from LIMS, EDC, and Pipeline outputs. The Data Engineer will also collaborate with software engineers and scientists to develop requirements and produce analysis-ready datasets for clinical research and product development. The Data Engineer will develop code and procedures to support dataset generation, perform QC, and troubleshoot issues that arise during dataset generation.
The Data Engineer will also collect requirements, develop prototypes, and collaborate on production implementations of new reporting, data visualization, and statistical analysis features as needed. The Data Engineer will learn analysis requirements by reading laboratory protocols, Statistical Analysis Plans, and other analysis planning documents, and by meeting with scientists, biostatisticians, and other stakeholders. They will demonstrate a consistent commitment to delivering on team goals with a sense of shared urgency.
This is a hybrid role based in either Menlo Park, CA (moving to Sunnyvale, CA in Fall 2026) or Durham, NC. Our current flexible work arrangement policy requires that a minimum of 60%, or 24 hours, of your total work week be on-site. Your specific schedule, determined in collaboration with your manager, will align with team and business needs and could exceed the 60% requirement for the site.

Requirements

BS with 5+ years of related experience or MS with 3+ years of related experience in a computational or scientific field (life science, computer science, engineering, mathematics, statistics, bioinformatics, etc.)
Strong proficiency in R or Python programming
Comfort working across the full data stack — from ingestion and transformation to orchestration and visualization
Excellent interpersonal communication (written and verbal) and organizational skills
Team player with demonstrated success in a cross-functional environment

Nice To Haves

Experience using a system-level programming language like Go, Java or C++
Understanding of basic concepts of molecular biology
Familiarity with AI-assisted coding or data workflows
Experience with Amazon Web Services
Proficiency in SQL development and data warehousing concepts

Responsibilities

Independently execute on sample selection strategies for studies according to analysis requirements
Collect user requirements, and develop and automate custom reporting features
Promote self-service data platform adoption through training, process improvements, and infrastructure updates
Reconcile multi-dimensional data across databases and systems to ensure data integrity
Develop and improve tools to enable dataset creation and update, data QC and reconciliation, data transfer, and sample selection
This list summarizes the role’s primary responsibilities and is not exhaustive. Responsibilities may change at the company’s discretion.