About The Position

Advances in AI, data, and computational sciences are transforming drug discovery and development. Roche’s Research and Early Development organisations at Genentech (gRED) and Pharma (pRED) have demonstrated how these technologies accelerate R&D, leveraging data and novel computational models to drive impact. Seamless data sharing and access to models across gRED and pRED are essential to maximising these opportunities.

The new Computational Sciences Center of Excellence (CS CoE) is a strategic, unified group whose goal is to harness the transformative power of data and Artificial Intelligence (AI) to help our scientists in both pRED and gRED deliver more innovative and transformative medicines for patients worldwide. The CS CoE brings together data, AI, and computational expertise to accelerate innovation across gRED and pRED. Within CS CoE, the Data and Digital Catalyst (DDC) organization leads the modernization of our data ecosystem, enabling scalable, data-driven science. The Data Capability organization within DDC is responsible for establishing foundational data capabilities, including data connectivity, data compliance, scientific content management, and data ingestion, curation, integration, and delivery. The team ensures that high-quality, well-structured datasets are available to power analytics, AI/ML, and scientific discovery across Research and Early Development.

We are seeking an Associate Data Delivery Specialist to support the delivery and operationalization of real-world data (RWD) and clinical-genomic datasets sourced from external partnerships and public/purchased data collections. In this entry-level role, you will contribute to the coordination, preparation, and delivery of multimodal, high-dimensional datasets, ensuring they are accessible, well-documented, and ready for use in research, analytics, and AI/ML workflows. You will also support interactions with external data providers and internal stakeholders to ensure efficient and compliant data usage. You will work within a cross-functional environment spanning data engineering, data science, and research teams, helping to enable data-driven discovery across Roche’s R&D ecosystem.

Requirements

  • PhD with 0-2 years of experience, Master’s degree with 3-5 years of experience, or Bachelor’s degree with 4-7 years of experience, in Data Science, Bioinformatics, Health Informatics, Biomedical Engineering, Computer Science, or a related field, plus experience working with real-world data, clinical data, or biomedical datasets
  • Basic understanding of RWD sources (e.g., EHR, claims, registries, clinical-genomic datasets)
  • Strong attention to detail and commitment to data quality and reliability
  • Strong organizational and communication skills, with the ability to support multiple stakeholders
  • Programming: Python (Pandas) or SQL; familiarity with Bash is a plus
  • Data Formats: Experience with structured data (CSV, JSON, Parquet); exposure to scientific formats is a plus
  • Data Platforms: Exposure to cloud environments (AWS S3, GCS, or Azure)
  • Tools: Familiarity with Jupyter notebooks, data portals, or workflow tools is beneficial

Nice To Haves

  • Exposure to clinical-genomic or multimodal datasets (e.g., Caris, FMI, or similar)
  • Familiarity with data governance and compliance in healthcare or life sciences
  • Exposure to AI/ML workflows or data preparation for analytics
  • Understanding of FAIR data principles and metadata standards
  • Interest in working with external data partnerships and large-scale data ecosystems

Responsibilities

  • Support the intake, tracking, and fulfillment of real-world data requests, including clinical-genomic and multimodal datasets.
  • Assist in preparing datasets for delivery, ensuring completeness, quality, and documentation.
  • Coordinate with external partners (e.g., Caris, FMI) to support data requests, query submissions, and data returns.
  • Assist in managing communications, timelines, and deliverables.
  • Assist in managing data access workflows, ensuring appropriate approvals, training, and compliance with data usage agreements.
  • Track data usage and maintain documentation.
  • Work with sequencing, imaging, and proteomics datasets, supporting standardized formatting, validation, and integration readiness.
  • Contribute to handling emerging multimodal data types and evolving standards.
  • Perform quality checks, metadata validation, and documentation to ensure datasets are analysis-ready.
  • Support troubleshooting of data delivery issues and escalate when necessary.
  • Contribute to early-stage efforts in AI-enabled data curation and harmonization, supporting improved scalability and efficiency in data delivery workflows.
  • Partner with internal teams (e.g., AIBT, CBM, gRED TM, pRED DTAs) to support data integration and delivery needs across diverse scientific use cases.

Benefits

  • A discretionary annual bonus may be available based on individual and Company performance.
  • This position also qualifies for the benefits detailed at the link provided below.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Entry Level
  • Education Level: Ph.D. or professional degree
