Senior Data Platform Engineer

Kestra Holdings, Tempe, AZ

About The Position

We are seeking a seasoned Databricks Data Engineer with expertise in Azure cloud services and the Databricks Lakehouse platform. The role involves designing and optimizing large-scale data pipelines, modernizing cloud-based data ecosystems, and enabling secure, governed data solutions. Strong skills in SQL, Python, PySpark, ETL/ELT frameworks, and experience with Delta Lake, Unity Catalog, and CI/CD automation are essential.

Requirements

  • 8+ years of experience designing and developing scalable data pipelines in modern data warehousing environments, with full ownership of end-to-end delivery.
  • Deep expertise in data engineering and data warehousing, with a track record of delivering enterprise-grade solutions.
  • Proven ability to lead and coordinate data initiatives across cross-functional and matrixed organizations.
  • Advanced proficiency in SQL, Python, and ETL/ELT frameworks, including performance tuning and optimization.
  • Hands-on experience with Azure, Snowflake, and Databricks, and integration with enterprise systems.

Responsibilities

  • Design, build, and optimize large-scale data pipelines on the Databricks Lakehouse platform, ensuring reliability, scalability, and governance.
  • Modernize the Azure-based data ecosystem, contributing to cloud architecture, distributed data engineering, data modeling, security, and CI/CD automation.
  • Utilize Apache Airflow and similar tools for orchestration and workflow automation.
  • Work with financial or regulated datasets, applying strong compliance and governance practices.
  • Develop and optimize ETL/ELT pipelines using Python, PySpark, Spark SQL, and Databricks notebooks (an illustrative sketch follows this list).
  • Design and optimize Delta Lake data models for reliability, performance, and scalability.
  • Implement and manage Unity Catalog for RBAC, lineage, governance, and secure data sharing.
  • Build reusable frameworks using Databricks Workflows, Repos, and Delta Live Tables.
  • Create scalable ingestion pipelines for APIs, databases, files, streaming sources, and MDM systems.
  • Automate API ingestion and workflows using Python and REST APIs.
  • Support data governance, lineage, cataloging, and metadata initiatives.
  • Enable downstream consumption for BI, data science, and application workloads.
  • Write optimized SQL/T-SQL queries and stored procedures, and build curated datasets for reporting.
  • Automate deployments, DevOps workflows, testing pipelines, and workspace configuration.
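For illustration, a minimal sketch of the kind of PySpark and Delta Lake pipeline work described above; the table and path names are hypothetical placeholders, not part of this posting:

    # Illustrative only: read a raw landing zone, apply light transformations,
    # and upsert into a governed Delta table. All names and paths are placeholders.
    from pyspark.sql import SparkSession, functions as F
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()  # supplied automatically in Databricks notebooks

    raw = (
        spark.read.format("json")
        .load("/mnt/landing/transactions/")                # hypothetical raw source path
        .withColumn("ingested_at", F.current_timestamp())  # capture load time for auditing
        .dropDuplicates(["transaction_id"])                 # guard against duplicate events
    )

    target = DeltaTable.forName(spark, "finance.curated.transactions")  # hypothetical Delta table
    (
        target.alias("t")
        .merge(raw.alias("s"), "t.transaction_id = s.transaction_id")
        .whenMatchedUpdateAll()     # refresh existing records
        .whenNotMatchedInsertAll()  # add new records
        .execute()
    )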