Senior Healthcare Data Engineer

Axle, Rockville, MD

About The Position

Axle is seeking a Senior Healthcare Data Engineer to join our vibrant team at the National Institutes of Health (NIH), supporting the National Center for Advancing Translational Sciences (NCATS) in Rockville, MD. Join the team at the forefront of revolutionizing medical research in the United States.

We are building and maintaining the foundational infrastructure of the National Clinical Cohort Collaborative (N3C), the nation's largest and most significant public repository of harmonized electronic health record (EHR) data. What began as a critical response to the pandemic has evolved into a multi-disease, terabyte-scale resource that empowers researchers to make discoveries faster than ever before.

This isn't just another data engineering job. This is a chance to leave your mark on a national-scale platform, solve complex data challenges that directly impact public health, and work with a passionate team dedicated to smarter science and better treatments for all.

We are looking for a visionary Senior Healthcare Data Engineer to be a lead architect of our data ingestion and harmonization ecosystem. You will be instrumental in N3C's next evolutionary step: the transition to a scalable, secure, and flexible "Dynamic Workspaces" model. You won't just be maintaining pipelines; you will be re-architecting, modernizing, and scaling the systems that ingest and harmonize a diverse torrent of data, from EHRs and CMS claims to cancer registries and geospatial data, and making that data research-ready for thousands of scientists.

If you are a builder who thrives on complex challenges and wants your work to have a tangible, lasting impact on science and medicine, we want to talk to you.

Requirements

  • A deep passion for using technology to solve meaningful problems in healthcare and medical research.
  • Bachelor's or Master's degree in Computer Science, Data Engineering, Bioinformatics, or a related field, with 8+ years of hands-on experience in data engineering (or 5+ years with a Master's).
  • Expert-level proficiency in Python and SQL, with a proven track record of building and maintaining complex, large-scale data pipelines and ETL processes.
  • Significant experience with healthcare data is essential. You must have deep, practical knowledge of common data models (CDMs), particularly OMOP and/or FHIR, and experience with clinical terminologies (e.g., ICD, SNOMED, RxNorm).
  • Strong experience with big data technologies (e.g., Apache Spark, Hadoop) and containerization using Docker for creating reproducible and scalable workflows.
  • Proficiency with version control (Git) and CI/CD practices for data infrastructure.
  • An architectural mindset with the ability to design for scalability, reliability, and security.

Nice To Haves

  • Experience designing and deploying data solutions on cloud platforms (AWS, GCP, Azure).
  • Proficiency with modern workflow management systems (e.g., Nextflow, Snakemake, Airflow).
  • Experience with privacy-preserving record linkage (PPRL) techniques and the challenges of working with de-identified patient data.
  • Familiarity with federated data systems and architectures.
  • Experience working in a regulated data environment (e.g., FISMA, HIPAA).

Responsibilities

  • Architect and Modernize National-Scale Data Pipelines: Design, develop, and optimize robust, disease-agnostic data acquisition and ingestion pipelines built to handle the complexity and scale of N3C.
  • Master Data Integration and Harmonization: Tackle the complex challenge of harmonizing heterogeneous clinical data from countless sources. You will maintain and enhance the OMOP harmonization pipeline, improve interoperability between common data models (e.g., OMOP, PCORnet, FHIR), and ensure consistency for critical data like medications and lab values.
  • Build the Future with Dynamic Workspaces: Be a key technical player in developing the infrastructure for N3C's new Dynamic Workspaces. You will help build the systems that provision secure, project-specific analytical environments, giving researchers access to the specific data they need while providing institutions granular control.
  • Champion Data Quality and Governance: Develop and implement sophisticated data quality frameworks, creating dashboards and feedback loops to ensure our data partners and researchers have transparent insight into data completeness, consistency, and quality.
  • Innovate with Advanced Technologies: Integrate critical new data sources, including national mortality data and CMS claims. You will link datasets and help build the processes for integrating novel data types like geospatial and environmental data.
  • Collaborate and Lead: Work alongside a world-class team of scientists, project managers, and engineers to translate scientific needs into technical solutions. You will provide technical leadership and mentorship, driving best practices in an agile, mission-focused environment.

Benefits

  • 100% Medical, Dental & Vision Coverage for Employees
  • Paid Time Off and Paid Holidays
  • 401(k) match up to 5%
  • Educational Benefits for Career Growth
  • Employee Referral Bonus
  • Flexible Spending Accounts: Healthcare (FSA)
  • Parking Reimbursement Account (PRK)
  • Dependent Care Assistance Program (DCAP)
  • Transportation Reimbursement Account (TRN)