Associate Cloud Data Engineer

ERC Pathlight
Denver, CO
Hybrid

About The Position

The Associate Cloud Data Engineer supports the design and development of scalable data pipelines that deliver reliable, high-quality data, including the build-out of the Enterprise Data Management and Analytics (EDMA) platform across ERC Pathlight. Working alongside senior engineers and in collaboration with clinical, operational, and analytics teams, this role helps turn raw data into actionable insights that improve patient care and decision-making. As a key contributor to the Enterprise Data Warehouse (EDW), the Associate Cloud Data Engineer gains hands-on experience advancing data infrastructure in support of evidence-based treatment for mood, anxiety, and eating disorders.

Requirements

  • 2+ years of experience designing and contributing to data models in transactional or analytical environments, preferably in healthcare or another regulated industry.
  • 2+ years of experience applying data normalization and denormalization techniques, relational database design, and performance tuning basics.
  • 2+ years of experience developing or supporting cloud-based data pipelines, data warehouses, or data lake implementations using ETL tools and database systems.
  • 2+ years of experience translating business or clinical requirements into data structures for analytics and reporting.
  • 2+ years of hands-on experience building or supporting data pipelines with Azure Data Factory, Azure Synapse Analytics, or Azure Databricks (or equivalent cloud platforms).
  • 2+ years of experience working with Azure SQL, Delta Lake, or Azure Data Lake Storage Gen2 to support AI/ML/analytics or BI use cases.
  • Familiarity with maintaining or troubleshooting ADF pipelines, data flows, and integration runtimes in a cloud data environment.
  • Basic experience with data governance concepts, including metadata management, data lineage, or business glossary documentation.
  • Experience or academic exposure to designing ETL/ELT workflows using Azure Data Factory to integrate data from SQL Server, REST APIs, Blob Storage, or third-party systems.

Nice To Haves

  • Exposure to or academic/project experience with standard healthcare data models including FHIR, HL7, OMOP, or CDISC (SDTM/ADaM) is a strong plus.
  • Proficiency in integrating data pipelines with the Azure AI stack (Azure ML, Synapse ML) and Databricks, with a developing interest in Agentic AI frameworks and the orchestration of AI Agents for automated data reasoning and processing.
  • Basic understanding of clinical data models and workflows (e.g., HL7, FHIR, or OMOP) and healthcare-specific schemas, to support the design of secure, HIPAA-compliant pipelines that handle complex patient records and ensure consistent data delivery and longitudinal integrity across transactional and analytical systems.

Responsibilities

  • Develop foundational knowledge in cloud-based storage patterns, including Data Warehousing, Data Lakes (ADLS/Blob), Data Marts, and Operational Data Stores, while assisting in the implementation of scalable architectures like Medallion Architecture.
  • Assist in designing and deploying end-to-end ETL/ELT workflows using enterprise tools such as Azure Data Factory (ADF), Informatica IICS, and AWS Glue to ingest data from diverse sources like Salesforce, REST APIs, and SQL databases.
  • Apply modeling principles to build conceptual, logical, and physical designs, supporting the creation of schemas optimized for different workloads, such as Star, Snowflake, or Data Vault patterns for analytical environments.
  • Work within the modern data stack to transform complex datasets using Databricks and Spark and learn to integrate data pipelines with the Azure AI stack (Azure ML, Synapse ML) to support downstream analytics.
  • Contribute to the development of data quality frameworks, including profiling, validation, and cleansing, and implementing the Audit, Balance, and Control (ABC) framework to ensure data accuracy and completeness.
  • Support the automation of the development lifecycle using Git for version control and CI/CD pipelines for automated deployment, while leveraging Azure Boards to track deliverables and project milestones.
  • Implement security-by-design through encryption, tokenization, and data masking, ensuring all data pipelines and infrastructure comply with strict regulatory standards such as HIPAA, GDPR, and SOC 2.
  • Monitor and troubleshoot cloud infrastructure to improve efficiency by performing cost tuning in ADF, optimizing SQL queries, and managing failure alerts to maintain high system reliability.
  • Assist in implementing data cataloging and lineage tools such as Microsoft Purview or Informatica EDC/Axon to support discoverability and traceability, ensuring that metadata practices meet auditability and compliance requirements.
  • Create high-level design documents and mapping specifications, and serve as a bridge between technical and business teams by presenting engineering progress at analytics roundtables and stakeholder showcases.
  • Develop familiarity with the Azure Data and AI stack by facilitating the integration of ADF pipelines with Azure ML and Databricks, ensuring high-quality feature engineering and automated data delivery for Machine Learning models.

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Generous Paid Time Off
  • Parental Leave benefits
  • Retirement benefits
  • Tuition reimbursement

What This Job Offers

Job Type: Full-time
Career Level: Entry Level
Number of Employees: 501-1,000 employees
