AstraZeneca is seeking Master’s and PhD students studying Biology, Computer Science, Chemistry, Physics, Engineering, Biomedical Science, Pharmacology, Data Science, Bioinformatics, or a related discipline for a 10-week internship role at our site in Waltham, MA, from June 1, 2026 to August 7, 2026. This internship sits at the intersection of data engineering, biomedical NLP, and translational science, enabling faster insight generation for R&D teams.

Position Description:
Build an end-to-end pipeline turning literature (papers, abstracts, patents) into a standardized knowledge graph with contextualized evidence.
Handle source selection, inclusion/exclusion criteria, updates, and data snapshots.
Develop NLP for entity recognition, relation extraction, assertion detection, and context tagging (drug, indication, resistance, biomarker, outcome).
Encode domain relations (e.g., Drug–mechanism→Gene/Pathway; Biomarker–modulates→Outcome; ADC–targets→Antigen).
Map entities to controlled vocabularies; manage synonyms, disambiguation, and canonical IDs.
Implement edge-level confidence scoring (source quality, claim type, co-occurrence, citations, model certainty) with full evidence provenance.
Build graph storage (property graph or RDF) and queryable APIs.
Deliver interactive visualization (UI or notebook) with filters, context toggles, and evidence drill-down.
Define metrics, run error analyses, and validate with scientific stakeholders.
Ensure reproducibility and documentation: version models/data; record architecture, assumptions, benchmarks; provide user guides.
Present outcomes to data science, oncology, and translational medicine teams.
Career Level
Intern
Number of Employees
5,001-10,000 employees