Regeneron Pharmaceuticals-posted 2 months ago
$101,800 - $194,500/Yr
Full-time • Senior
5,001-10,000 employees

The Principal, Data Engineer builds data infrastructure, leads technical initiatives, and mentors junior team members while driving data-driven solutions across the organization. As a Principal, Data Engineer, a typical day might include the following:

  • Design complex data engineering solutions and define standards
  • Mentor junior engineers and drive infrastructure innovation
  • Build scalable, secure data pipelines with robust monitoring
  • Optimize ETL/ELT workflows for large-scale data processing
  • Architect end-to-end data pipeline solutions from ingestion to consumption
  • Implement real-time and batch processing systems to handle diverse biotech data streams
  • Design fault-tolerant pipelines with appropriate error handling and recovery mechanisms
  • Establish CI/CD practices for data pipeline deployment and testing
  • Develop data transformation logic to support analytical and operational needs
  • Integrate disparate data sources including laboratory instruments, clinical systems, and external APIs
  • Implement data validation frameworks to ensure data integrity throughout the pipeline
  • Manage and organize large datasets
  • Ensure data quality and accessibility for data analysts
  • Implement data lake and data warehouse architectures
  • Monitor data pipeline performance and troubleshoot issues
  • Maintain efficiency and reliability of data systems
  • Implement observability solutions for pipeline monitoring
  • Develop automated alerting systems for pipeline failures or anomalies as needed
  • Build and leverage GenAI solutions to improve the performance, speed, and efficiency of the data engineering team
  • Document data processes and systems as required
  • Ensure compliance with data governance policies
  • Strong Python, Java, or Scala programming skills
  • Deep SQL expertise and relational database experience
  • NoSQL and big data technology experience (Hadoop, Spark, Kafka)
  • Proficiency in data modeling and schema design
  • Knowledge of data security and compliance requirements in regulated environments
  • Familiarity with Biotech Enterprise Systems (MES, LIMS, QMS)
  • Excellent communication skills
  • Knowledge of MCP and orchestration platforms related to AI/GenAI
  • Proficiency in star schemas and data modeling tools
  • Knowledge of industry regulatory requirements (CFR Part 11, GxP, CSA)
  • Stream processing experience (Kafka, Flink)
  • Cloud certifications
  • BA/BS in Computer Science, Bioinformatics, or related field
  • Principal: 8+ years meaningful experience or equivalent combination of education and experience
  • Staff: 10+ years relevant experience or equivalent combination of education and experience
  • Experience in biotech, pharmaceutical, or other life sciences industries preferred
  • Cloud platform experience (AWS, Azure) preferred
  • Experience with workflow orchestration tools (Airflow, Luigi, Prefect, or similar)
  • Experience with containerization technologies
  • Experience with scientific data management systems
  • Experience with using GenAI to enhance own work
  • Health and wellness programs
  • Fitness centers
  • Equity awards
  • Annual bonuses
  • Paid time off for eligible employees at all levels
© 2024 Teal Labs, Inc