Databricks Lead Data Engineer (FTE) - Hybrid in Minnesota

UnitedHealth Group
Minnetonka, MN
Hybrid

About The Position

Optum Tech is a global leader in health care innovation. Our teams develop cutting-edge solutions that help people live healthier lives and help make the health system work better for everyone. From advanced data analytics and AI to cybersecurity, we use innovative approaches to solve some of health care’s most complex challenges. Your contributions here have the potential to change lives. Ready to build the next breakthrough? Join us to start Caring. Connecting. Growing together.

The EL3 Databricks Data Engineer will design, build, and optimize scalable data pipelines supporting Claims Payment Integrity (PI) analytics across the Medicare & Retirement, Community & State, and Employer & Individual businesses. The role focuses on developing governed, lakehouse-based data assets, integrating claims and provider datasets, and ensuring high-quality data availability for PI, actuarial, audit, recovery, and financial analytics teams. This position follows a hybrid schedule with four in-office days per week.

Requirements

  • 12+ years of experience in Data Engineering
  • 8+ years of experience in Databricks
  • 5+ years of experience with PySpark, SQL, and Jobs/Workflows
  • 5+ years of Spark performance tuning experience
  • 4+ years of experience in Delta Lake
  • 3+ years of experience in healthcare
  • Ability to work a hybrid schedule of four in-office days per week in Minneapolis, MN

Nice To Haves

  • Experience with call center data (member and provider interactions), provider RCM datasets, and EHR/clinical data
  • Experience with DLT, CI/CD, and MLflow-integrated pipelines
  • Exposure to actuarial or PI forecasting workflows
  • Experience with healthcare Claims Payment Integrity across M&R, C&S, and E&I claims
  • Experience engineering data for claims, provider, and membership domains
  • Solid understanding of healthcare data models and adjudication flows

Responsibilities

  • Data Engineering & Lakehouse Development:
      • Build scalable ETL/ELT pipelines in Databricks using PySpark, Spark SQL, Delta Live Tables, and workflows
      • Engineer curated datasets across bronze/silver/gold layers for claims, pricing, provider, RCM, and member data
      • Implement Delta Lake best practices, including ACID transactions, schema evolution, CDC, and optimized storage formats
      • Automate ingestion and transformation of large datasets from claims systems, provider files, call center platforms, and EHR feeds
  • Data Quality & Governance:
      • Perform reconciliation and validation of claim-related financial datasets
      • Enforce PHI-compliant design patterns using Unity Catalog, governance guardrails, and cluster policies
      • Implement pipeline monitoring, logging, and Spark performance optimization
  • Platform & Collaboration:
      • Work with Data Analysts, Data Scientists, and PI SMEs to translate analytic requirements into production data assets
      • Support cluster optimization, table indexing (Z-ORDER), and cost-efficient lakehouse operations
      • Participate in Agile ceremonies and ensure timely delivery of engineering tasks
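
The responsibilities above reference several Databricks and Delta Lake techniques. As a rough orientation for candidates, the following is a minimal, illustrative sketch (not part of the job description) of a bronze-to-silver Delta Lake pipeline with a CDC-style merge and Z-ORDER optimization; all paths, table names, and columns (claims_bronze, claims_silver, claim_id, provider_id, service_date) are hypothetical assumptions.

    # Illustrative sketch only; every path, table, and column name here is hypothetical.
    from pyspark.sql import SparkSession, functions as F
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

    # Bronze: land raw claim records as-is, stamped with ingestion metadata.
    raw = (spark.read.format("json")
           .load("/mnt/raw/claims/")  # hypothetical landing path
           .withColumn("_ingested_at", F.current_timestamp()))
    raw.write.format("delta").mode("append").saveAsTable("claims_bronze")

    # Silver: upsert the latest version of each claim (CDC-style merge).
    # Assumes claims_silver already exists and claim_id is the business key.
    updates = spark.table("claims_bronze").dropDuplicates(["claim_id"])
    silver = DeltaTable.forName(spark, "claims_silver")
    (silver.alias("t")
     .merge(updates.alias("s"), "t.claim_id = s.claim_id")
     .whenMatchedUpdateAll()
     .whenNotMatchedInsertAll()
     .execute())

    # Co-locate data on common filter columns to speed downstream PI queries.
    spark.sql("OPTIMIZE claims_silver ZORDER BY (provider_id, service_date)")

In practice, pipelines of this kind are typically orchestrated through Databricks Workflows or Delta Live Tables and governed via Unity Catalog, as described in the responsibilities above.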

Benefits

  • A comprehensive benefits package
  • Incentive and recognition programs
  • Equity stock purchase and 401(k) contribution

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000
