Senior Data Engineer, AI for Drug Discovery

Roche | New York, NY
$141,100 - $262,200 | Onsite

About The Position

Roche’s Research and Early Development organisations at Genentech (gRED) and Pharma (pRED) are leveraging advances in AI, data, and computational sciences to transform drug discovery and development. The new Computational Sciences Center of Excellence (CoE) is a strategic, unified group aiming to harness the power of data and Artificial Intelligence (AI) to assist scientists in delivering innovative medicines.

This role is part of the team building and maintaining the next-generation Therapeutic Molecule Registration (TMR) platform, a foundational component of the AI-driven drug discovery infrastructure, Lab-in-the-Loop. This platform will manage and integrate molecular data across the global research organization, handling billions of records to enable large-scale virtual molecule design and testing. The TMR platform needs to be a high-performance, cloud-native system supporting rapid iteration cycles between computational design and experimental validation.

The Senior Data Engineer will consolidate molecule registration systems into a single, harmonized environment to accelerate the development of life-changing therapies. The role involves implementing scalable solutions for molecular data management and contributing to the architecture of the cloud-native platform, working closely with machine learning teams, drug discovery teams, and other global partners.

Requirements

  • 7+ years of data engineering experience
  • Expert knowledge of PostgreSQL and experience with Oracle
  • Skilled with at least one modern data toolkit (Glue, dbt, Databricks, ...)
  • Experience with cloud platforms (preferably AWS)
  • Python programming skills
  • Strong testing practices and test automation
  • Understanding of CI/CD pipelines
  • Experience with agile development methodologies

Nice To Haves

  • Open source cheminformatics experience (e.g., RDKit, chemfp, Indigo, HELM toolkit)
  • Chemical database cartridge expertise
  • Familiarity with biological sequence alignment
  • Chemical & biological structure notation expertise
  • Familiarity with chemical structure canonicalization
  • Molecular structure searching algorithm expertise
  • Experience with scientific software development
  • Familiarity with Docker and Kubernetes
  • Experience with event-driven architectures
  • Knowledge of security best practices

Responsibilities

  • Design and implement features of our TMR data model
  • Oversee cloud data migration to TMR and production deployment
  • Contribute to technical design discussions and architecture decisions
  • Write high-quality, testable code for chemical registration workflows
  • Support and mentor junior team members
  • Collaborate with scientists and other engineers to implement business requirements

Benefits

  • Relocation benefits are available
  • A discretionary annual bonus may be available
  • Benefits detailed at the link provided below