Director, Data Engineer

The Coca-Cola Company
Atlanta, GA
$149,000 - $173,000

About The Position

The Global Equipment Platforms (GEP) team is seeking a highly skilled and experienced Lead Data Engineer to be a core technical contributor to the data backbone for The Coca-Cola Company's global fleet of 17MM+ connected devices. Reporting to the Head of Data within GEP Digital, this role is pivotal in transforming raw telemetry data from beverage vending machines, dispensers, coolers, and retail racks into a strategic asset that fuels real-time market insights, predictive analytics, and operational efficiencies across our global ecosystem.

You will be responsible for designing, building, and maintaining robust, secure, and cost-optimized cloud-based data pipelines and platforms, primarily on Microsoft Azure. This includes hands-on development of scalable data ingestion, transformation, and storage solutions capable of handling high-volume, real-time data from a diverse fleet of equipment running on the KO Operating System (KOS) and other embedded systems.

This role demands a deep technical expert with a proven track record of solving complex data challenges and of ensuring data integrity, scalability, and accessibility for both internal stakeholders and our 200+ global franchise bottlers and OEM partners in a multi-tenant environment.

Requirements

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or a related quantitative field. Master's degree preferred.
  • 10+ years of hands-on experience in data engineering, with a strong focus on building and operating large-scale data platforms.
  • Expert-level proficiency in designing, building, and operating data pipelines and data solutions in Microsoft Azure (e.g., Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Event Hubs, IoT Hub, ADLS Gen2).
  • Deep experience with real-time data streaming architectures and technologies (e.g., Kafka, Azure Event Hubs/IoT Hub).
  • Strong programming skills in Python (preferred), Scala, or Java; expert in SQL.
  • Extensive experience with big data technologies and distributed computing frameworks (e.g., Spark).
  • Solid understanding of data warehousing concepts, dimensional modeling, and data lake architectures.
  • Proven track record of implementing robust data quality and security measures.
  • Experience with CI/CD practices for data pipelines and infrastructure as code (e.g., Terraform, ARM templates).
  • Familiarity with IoT, connected devices, embedded systems, and telemetry data.

Responsibilities

  • Data Pipeline & Platform Development (40%): Design, develop, and maintain highly scalable, secure, and resilient data pipelines (batch, streaming, real-time) and data platforms on Microsoft Azure (e.g., Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Event Hubs, IoT Hub, ADLS Gen2, Cosmos DB, Azure SQL). Implement robust data ingestion processes to collect high-volume telemetry from 17MM+ connected devices, ensuring data quality and reliability at the source. Develop efficient data transformation logic and data models to rationalize, cleanse, and enrich raw equipment data, making it ready for consumption by analytics and AI applications. Optimize data storage solutions (data lakes, data warehouses) for performance, cost-efficiency, and accessibility, so that real-time telemetry directly fuels the analytical models developed by Data Scientists and the production AI systems deployed by the Lead AI Engineer.
  • Data Foundation for AI/ML & Analytics (25%): Engineer the data foundation required to support advanced analytics, machine learning models (e.g., predictive maintenance, demand forecasting, personalization), and AI Agents. Work closely with Data Scientists and Data Analysts to understand their data needs, ensure data quality, and optimize data structures for efficient model training and inference. Develop and maintain curated datasets and data marts that simplify data consumption for business intelligence tools and internal/external analytics applications. Ensure the seamless flow of rich, rationalized telemetry data from KOS-powered devices to the core analytics platform.
  • Data Governance, Quality & Compliance (20%): Contribute to the implementation and enforcement of data governance frameworks, data quality standards, and data integrity checks for the equipment data. Develop and implement solutions to monitor data pipeline health, identify data anomalies, and proactively address data quality issues. Ensure the data platform adheres to global data privacy regulations (e.g., GDPR, CCPA) and TCCC's internal security protocols, especially for multi-tenant data access by bottlers and OEMs. Implement robust logging, monitoring, and alerting for data pipelines and data platform components.
  • Cross-Functional Collaboration & OEM Integration (15%): Collaborate closely with the Global Product Owner for Unified IoT, GEP Hardware/Software Engineering, Enterprise Digital Technology Solutions (IT), and Experience Design teams to ensure seamless data integration from connected devices, including new telemetry requirements driven by product features and the onboarding of new data sources. Work with equipment OEMs (e.g., Lancer, Cornelius, True, Imbera) to integrate their telemetry systems and ensure data capture adheres to TCCC's standards. Contribute to the data requirements and solutions for Over-The-Air (OTA) updates of firmware, software, and content to the equipment fleet. Partner with Data Scientists to optimize data structures and access patterns for efficient model training, feature engineering, and inference, and with the Lead AI Engineer on the deployment of data-intensive AI solutions. Participate in code reviews, design discussions, and knowledge sharing within the data engineering team and the broader GEP digital organization.