Data Engineer

Atlanta Hawks
Atlanta, GA

About The Position

The Data Engineer owns the end-to-end data lifecycle, from orchestrating high-frequency ingestion pipelines to ensuring reliable delivery through the activation layer. This role focuses on ingesting, transforming, and delivering high-quality data from a variety of internal and external sources into our cloud data ecosystem. Working closely with Analytics Engineers, BI Engineers, and business stakeholders, the Data Engineer ensures that data is accessible, timely, and trustworthy, enabling analytics, reporting, marketing activation, and operational decision-making across the company.

In this position, you will play a key role in developing and optimizing the data ingestion and processing layers of our platform. You will build robust ETL/ELT pipelines, manage data workflows, and ensure efficient movement and storage of data across systems. This includes integrating with APIs, streaming platforms, and third-party systems to support both batch and real-time data use cases.
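To make the day-to-day work concrete, here is a minimal sketch of the kind of batch ELT step described above, assuming a Databricks/PySpark environment; the landing path, table names, and key column are hypothetical illustrations rather than our actual schema.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ticketing_ingest").getOrCreate()

# Land a raw source extract (e.g., a nightly API export) as-is, then normalize it.
raw = spark.read.json("/mnt/landing/ticketing/2026-01-01/")  # hypothetical landing path

cleaned = (
    raw.withColumn("ingested_at", F.current_timestamp())  # stamp rows for freshness checks
       .dropDuplicates(["transaction_id"])                # hypothetical business key
)

# Append to a bronze Delta table that downstream transformations build on.
cleaned.write.format("delta").mode("append").saveAsTable("bronze.ticket_sales")
```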

Requirements

  • Strong proficiency in SQL and at least one programming language (Python preferred).
  • Deep understanding of Apache Spark architecture and hands-on experience with PySpark for large-scale data processing and performance optimization.
  • Deep understanding of dimensional data modeling concepts (Kimball) and practical experience building star schemas, including the definition of facts, dimensions, and slowly changing dimensions (SCDs); an SCD Type 2 sketch follows this list.
  • Experience building and maintaining data pipelines (ETL/ELT).
  • Hands-on experience with Databricks and Databricks-native orchestration/workflow development.
  • Experience with Fivetran for data ingestion and source system connectivity.
  • Experience with dbt, including transformation development, testing, and documentation.
  • Experience with modern cloud data platforms and data engineering best practices.
  • Familiarity with APIs, data ingestion patterns, and data integration techniques.
  • Understanding of distributed data processing and storage concepts.
  • Experience with Git-based workflows and CI/CD practices (e.g., GitHub Actions).
  • Strong problem-solving skills and ability to debug complex data issues.
  • Experience implementing data governance practices, including data quality frameworks, data lineage, metadata management, access controls, and compliance with organizational and regulatory standards.
  • Proven ability to collaborate cross-functionally with Analytics Engineers, BI teams, Data Scientists, and business stakeholders to translate data requirements into scalable technical solutions.
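As a concrete example of the dimensional modeling item above, here is a minimal SCD Type 2 sketch using a Delta Lake MERGE from PySpark, assuming Databricks with the delta package available; the table names and the tracked email attribute are hypothetical.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

updates = spark.table("staging.customer_updates")                      # hypothetical staging table
current = spark.table("gold.dim_customer").where("is_current = true")  # hypothetical dimension

# 1. Keep only customers that are new or whose tracked attribute changed.
changed = (updates.alias("u")
    .join(current.alias("d"), F.col("u.customer_id") == F.col("d.customer_id"), "left")
    .where(F.col("d.customer_id").isNull() | (F.col("u.email") != F.col("d.email")))
    .select("u.*"))

# 2. Close out the superseded current rows in the dimension.
(DeltaTable.forName(spark, "gold.dim_customer").alias("d")
    .merge(changed.alias("c"), "d.customer_id = c.customer_id AND d.is_current = true")
    .whenMatchedUpdate(set={"is_current": "false", "valid_to": "current_timestamp()"})
    .execute())

# 3. Append the new current version of each new or changed customer.
(changed
    .withColumn("valid_from", F.current_timestamp())
    .withColumn("valid_to", F.lit(None).cast("timestamp"))
    .withColumn("is_current", F.lit(True))
    .write.format("delta").mode("append").saveAsTable("gold.dim_customer"))
```

Splitting the close-out and the append keeps each step easy to reason about; a production version would also assign surrogate keys and compare more than one tracked attribute.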

Nice To Haves

  • Experience ingesting and modeling Ticketmaster’s Archtics data is a big plus.
  • Experience with streaming technologies (Kafka, Kinesis, Pub/Sub); a streaming sketch follows this list.
  • Experience architecting and managing data solutions within the Azure ecosystem, including deep familiarity with Azure Data Lake Storage (ADLS), Azure Data Factory (ADF), and Azure Functions.
  • Exposure to Unity Catalog, Delta Lake, Databricks SQL, or serverless compute.
  • Experience with infrastructure-as-code (Terraform, CloudFormation).
  • Knowledge of data governance, cataloging, or metadata management tools.
  • Experience in a fast-paced, data-driven organization.
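For the streaming item above, here is a minimal Structured Streaming sketch, assuming a Databricks runtime (which bundles the Kafka connector for Spark); the broker, topic, checkpoint path, and table name are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Read a Kafka topic as an unbounded stream of key/value records.
stream = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "ticket-scans")               # hypothetical topic
    .load())

events = stream.select(
    F.col("key").cast("string"),
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp"))

# The checkpoint makes the stream recoverable: on restart it resumes from the
# last committed Kafka offsets instead of reprocessing or dropping events.
(events.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/ticket_scans")  # hypothetical path
    .toTable("bronze.ticket_scans"))
```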

Responsibilities

  • Design, build, and maintain scalable and reliable data pipelines to ingest data from various sources, including APIs, databases, and SaaS platforms.
  • Develop and optimize ETL/ELT workflows to efficiently process large volumes of data.
  • Implement orchestration frameworks and Databricks-native workflows to manage dependencies and scheduling.
  • Ensure pipelines are fault-tolerant, observable, and recoverable with appropriate logging and alerting.
  • Continuously improve pipeline performance, scalability, and cost efficiency.
  • Design and maintain core components of the data platform, including data lakes, warehouses, and storage systems.
  • Optimize data storage formats and partitioning strategies for performance and cost.
  • Manage integrations with Databricks and other cloud data platform services.
  • Support both batch and streaming architectures where applicable.
  • Collaborate on infrastructure-as-code and environment management practices.
  • Implement data validation and monitoring at ingestion and pipeline stages.
  • Ensure data completeness, freshness, and accuracy through automated checks and alerts (a sketch follows this list).
  • Troubleshoot pipeline failures and data discrepancies, driving root cause resolution.
  • Establish SLAs/SLIs for data availability and pipeline performance.
  • Partner with Analytics Engineers to ensure clean handoffs between raw and curated data layers.
  • Enable downstream consumers by delivering well-structured, reliable datasets.
  • Support integrations with downstream tools such as BI platforms, CDPs, and operational systems.
  • Contribute to improving data accessibility and usability across the organization.
  • Document data pipelines, data sources, and system architecture.
  • Maintain clear data lineage and metadata practices.
  • Contribute to standards around naming conventions, code quality, and deployment workflows.
  • Participate in code reviews and CI/CD processes.
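As one way to implement the validation and freshness checks referenced above, here is a minimal sketch that could run as a downstream task in a Databricks workflow; the table, column, and six-hour SLA are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.table("bronze.ticket_sales")  # hypothetical table from the ingestion sketch

# Freshness: fail loudly if the newest row is older than the SLA window,
# so the orchestrator can surface the failure and trigger an alert.
lag_hours = df.agg(
    ((F.unix_timestamp(F.current_timestamp())
      - F.unix_timestamp(F.max("ingested_at"))) / 3600).alias("lag_hours")
).first()["lag_hours"]
if lag_hours is None or lag_hours > 6:
    raise RuntimeError(f"bronze.ticket_sales is stale (lag: {lag_hours} hours)")

# Completeness: required business keys must never be null.
null_keys = df.where(F.col("transaction_id").isNull()).count()
if null_keys:
    raise RuntimeError(f"{null_keys} rows are missing transaction_id")
```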

Benefits

  • Competitive benefits
  • Tickets to events
  • Hawks Shop discounts