Lead Data Engineer

Goosehead Insurance, Westlake, OH

About The Position

Goosehead Insurance is seeking a Lead Data Engineer to design, build, and scale the data infrastructure that powers analytics, machine learning, GenAI, and business automation across the company. The role's responsibilities, required qualifications, and preferred qualifications are detailed below.

Requirements

  • 8+ years of experience in data engineering, data infrastructure, or related fields, including demonstrated leadership in architecting enterprise data solutions.
  • Deep expertise in Python, PySpark, and SQL for large-scale data processing and transformation.
  • Strong experience with Azure, Snowflake, and Databricks, including designing and optimizing cloud-native data pipelines.
  • Advanced knowledge of dbt for data modeling and transformation, and of Fivetran for automated data ingestion.
  • Proven ability to design for reliability, performance, and cost optimization in large-scale data environments.
  • Excellent communication skills with the ability to influence cross-functional partners and leadership.
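To illustrate the pipeline reliability and quality-assurance standards this role would define, here is a minimal, hypothetical Python sketch of a row-level validation step. All names (`validate_rows`, the sample policy records) are illustrative assumptions, not taken from the posting; a real pipeline at this scale would use PySpark or dbt tests instead.

```python
# Hypothetical sketch of a row-level quality check of the kind a lead data
# engineer might standardize across pipelines. Rows missing a required,
# non-null field are routed to a reject set rather than silently dropped.

def validate_rows(rows, required_fields):
    """Split rows into (valid, rejected) based on required non-null fields."""
    valid, rejected = [], []
    for row in rows:
        if all(row.get(field) is not None for field in required_fields):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected

# Illustrative sample data (not from the posting).
policies = [
    {"policy_id": "P-1", "premium": 1200.0},
    {"policy_id": "P-2", "premium": None},  # missing premium -> rejected
]
good, bad = validate_rows(policies, ["policy_id", "premium"])
```

Routing rejects to a separate set keeps bad records observable and auditable, which supports the lineage and governance responsibilities listed elsewhere in this posting.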

Nice To Haves

  • Experience leading data platform modernization or migration initiatives in Azure or other cloud environments.
  • Proficiency with real-time and event-driven architectures (such as Kafka or Kinesis).
  • Experience with CI/CD design for advanced analytics pipelines and infrastructure-as-code (such as Terraform).
  • Experience supporting machine learning workflows, model monitoring, and feature store management.
  • Experience with data governance solutions (e.g., Azure Purview, Collibra, Unity Catalog).
  • Background in insurance, financial services, or other regulated industries.

Responsibilities

  • Lead the design, development, and scaling of data infrastructure that powers analytics, machine learning, GenAI, and business automation across Goosehead.
  • Define and own data engineering best practices, including standards for data architecture & modeling, pipeline reliability, and quality assurance.
  • Architect and oversee complex ETL/ELT workflows across Azure Data Services & AI Foundry, Databricks, Snowflake, dbt, and Fivetran.
  • Build and optimize production-grade data pipelines using Python, PySpark, and SQL.
  • Partner with data science, analytics, and software engineering teams to design data systems that enable advanced analytics and model deployment.
  • Mentor and coach data engineers, guiding career growth and technical excellence within the team.
  • Evaluate and implement innovative tools and frameworks that enhance productivity, scalability, and observability across the data ecosystem.
  • Lead and uphold the enterprise data governance framework, including lineage tracking, master data management, privacy compliance & security.
  • Collaborate with business stakeholders to align data engineering initiatives with enterprise priorities and long-term platform strategy.
  • Ensure data governance, privacy, and compliance are embedded throughout all data processes.


What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Industry

Insurance Carriers and Related Activities

Education Level

No Education Listed

Number of Employees

1,001-5,000 employees
