Data Engineer

Heaven Hill Brands
Louisville, KY
Onsite

About The Position

The Data Engineer will build and manage data pipelines from both primary external and internal sources into central data repositories, and from there to usable, business-facing, enterprise-ready data sources. The Data Engineer is expected to understand large language models and have experience using them to drive data engineering solutions, including IDE-integrated and external LLM-based coding tools. This position may also design and support language-model-driven solutions for end-user reporting and analytical needs in partnership with our AI program.

Requirements

  • Bachelor’s degree in computer science or a related field; equivalent experience may be substituted for the educational requirement.
  • Minimum 4 years of experience creating and managing robust data pipelines (ETL/ELT) that follow best practices, including medallion architecture, unit testing, and related standards.
  • Minimum 4 years of experience with databases (SQL Server, Oracle, or similar relational databases).
  • Programming languages: SQL and Python (required); R (preferred).
  • Knowledge of MS SQL Server, SQL, Reporting Services, Analysis Services, and Integration Services for historical workloads.
  • Knowledge of version control systems (Git) and collaboration services (GitHub).
  • Experience with Microsoft Fabric, Power BI, and notebook-based development, or a demonstrated ability to learn these quickly.
  • An attitude of continuous improvement and learning.
  • A team player with excellent interpersonal skills who is comfortable working with a business-integrated scrum team to plan and complete work.
  • Comfortable with “change as the only constant” and willing to flex as priorities shift; understands that “other duties as required” is part of the core role.
  • Experience using AI-assisted development tools (e.g., integrated LLM copilots).

Nice To Haves

  • R/Shiny experience.
  • Familiarity with Power BI and semantic model design.
  • Understanding of the Posit Team set of applications.
  • Familiarity with data sources used in the spirits industry (Nielsen, NABCA, SRS, distributor data, etc.).
  • Experience working with time series data (manufacturing line data, chemical processing, etc.).

Responsibilities

  • Build, maintain, and optimize data products and models to be processed and served by AI systems.
  • Collaborate with IT and business stakeholders to define requirements, prioritize work, and deliver high-quality data products.
  • Plan and document work using Asana and GitHub.
  • Curate enterprise-ready data sources for consumption by Power BI for self-service analytics in the business.
  • Focus on data products that can be used by LLMs to support forecasting, performance measurement, and decision-making.
  • Write data pipelines from external and internal commercial data sources into central data locations (data lake / Fabric lakehouse) using SQL and/or Python.
  • Transition legacy OLAP-based and SSIS-based workflows into well-documented, transparent, and fully reproducible ETL/ELT environments, likely driven by notebook (code + text) solutions within Fabric and related tooling.
  • Write data pipelines from central repositories to enterprise-ready data sources for consumption by MCP servers, web apps, data scientists, analysts, and Power BI semantic models.
  • Work with transparency, best practices, and documentation as guiding principles.
  • Leverage integrated large language models (LLMs) and AI-assisted tooling to accelerate coding, documentation, testing, and ideation of data and analytics solutions.
  • Other duties as required to support changing business needs.
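As a miniature, hedged illustration of the pipeline and medallion-architecture work described above (not part of the posting): the sketch below lands a raw feed into a "bronze" table, then conforms it into a typed "silver" table. All table and column names are invented, and SQLite stands in for the actual target (a real implementation would write to the Fabric lakehouse using SQL and/or Python notebooks).

```python
# Hypothetical medallion-style ELT step: raw (bronze) landing, then a
# cleaned/typed (silver) layer. Names and data are illustrative only.
import csv
import io
import sqlite3

RAW_CSV = """order_id,sku,qty,ship_date
1001,HH-750,  12 ,2024-05-01
1002,HH-1750,,2024-05-02
"""

def load_bronze(conn):
    # Bronze: land the feed exactly as received (all text, no cleaning).
    conn.execute(
        "CREATE TABLE bronze_orders (order_id TEXT, sku TEXT, qty TEXT, ship_date TEXT)"
    )
    rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
    conn.executemany(
        "INSERT INTO bronze_orders VALUES (:order_id, :sku, :qty, :ship_date)", rows
    )

def build_silver(conn):
    # Silver: cast identifiers to integers, trim whitespace, and default
    # missing quantities to 0 so downstream consumers get a typed table.
    conn.execute(
        """
        CREATE TABLE silver_orders AS
        SELECT CAST(order_id AS INTEGER)                 AS order_id,
               TRIM(sku)                                 AS sku,
               COALESCE(NULLIF(TRIM(qty), ''), '0') + 0  AS qty,
               ship_date
        FROM bronze_orders
        """
    )

conn = sqlite3.connect(":memory:")
load_bronze(conn)
build_silver(conn)
print(conn.execute("SELECT order_id, qty FROM silver_orders ORDER BY order_id").fetchall())
```

In a production setting each layer would also carry unit tests and documentation, per the best-practice requirements above.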

Benefits

  • Paid Vacation
  • 11 Paid Holidays
  • Health, Dental & Vision eligibility from day one
  • FSA/HSA
  • 401K match
  • EAP
  • Maternity/Paternity Leave