Database Analyst I

Duke Careers | Durham, NC
$59,829 - $104,550 | Onsite

About The Position

The Department of Surgery is seeking a detail-oriented and highly motivated Database Analyst I to join our team. The Database Analyst I will play a critical role in advancing our computational medicine research by architecting efficient data pipelines and managing, cleaning, and preparing complex, large-scale clinical datasets. Working with both Protected Health Information (PHI) and de-identified (non-PHI) data, this individual will ensure that rigorous data governance and security protocols are met. This position is essential for building a robust, secure data infrastructure that maximizes computational efficiency and supports high-performance machine learning, AI-driven predictive modeling, and advanced clinical analytics.

Requirements

  • Work requires a bachelor's degree in mathematics, computer science, or a computer-related field, or equivalent coursework or technical training.

Responsibilities

  • Clean, manage, and consolidate both sensitive PHI and non-PHI datasets from diverse clinical sources (e.g., continuous physiological monitors, electronic medical records). Ensure strict adherence to institutional and federal data privacy regulations (e.g., HIPAA) while maintaining high data quality, accuracy, and structural integrity.
  • Map and transform disparate data types into unified formats. Incorporate harmonized data standards, specifically utilizing the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) and the Medical Event Data Standard (MEDS), to build automated, reproducible data processing pipelines that safely handle varying levels of data sensitivity.
  • Architect data workflows and optimize the lab's computational infrastructure to support high-throughput processing. Improve the efficiency of compute environments by optimizing resource allocation (including CPU/GPU utilization), parallelizing data pipelines, and resolving processing bottlenecks to accelerate large-scale machine learning tasks.
  • Structure and index large-scale harmonized datasets for highly efficient querying within distributed computing environments. Develop, test, and maintain robust codebases (e.g., Python, SQL) for ongoing analytical tasks, implement version control, and comprehensively document architectural decisions and data extraction procedures.

Benefits

  • Health insurance plans
  • Generous paid time off
  • Retirement programs with employer contributions
  • Tuition assistance for employees and their children