Senior Data Engineer

dentsu
Detroit, MI (Hybrid)

About The Position

As the Senior Data Engineer, reporting to the Senior Manager of Global Data and AI, you will serve as a core architect and enabler of our evolving data ecosystem. You will help design and scale the foundational infrastructure for our global clients, transforming fragmented, reactive data processes into proactive, intelligent systems powered by AI and a robust semantic layer. Working closely with the Analytics, Media, Product, and Engineering teams, you will optimize data pipelines, automate complex transformations, embed trust and visibility into every stage of the data lifecycle, and enable AI-driven insights that answer "why" questions through enhanced reasoning and accuracy. You will ideally be based in the Greater Detroit, MI area, but we are also considering candidates elsewhere in the continental United States.

Requirements

  • 8+ years of experience as a Data Engineer or in a similar role building scalable data infrastructure, with at least 2-3 years focused on AI-integrated systems, semantic layers, or agentic AI deployments.
  • Bachelor's Degree in Computer Science, Engineering, Information Systems, or related field required.
  • Advanced expertise in SQL, Python, and DBT required, along with strong experience with PySpark, Databricks, and semantic layer tools such as DBT YAML, Unity Catalog, and knowledge graphs.
  • Hands-on experience with ETL/ELT design tools like Trifacta (Alteryx), Adverity, Azure Data Factory, Power BI DAX, or similar, including data normalization and workflow automation.
  • Proven experience building and extending semantic layers for AI applications, including ontologies, taxonomies, vector databases, and integration with LLMs for enhanced reasoning, accuracy, and "why" question resolution.
  • Deep experience with the Microsoft data stack, including Power BI, Power Apps, Fabric/OneLake, Azure Data Lake Storage (ADLS Gen2), Azure Blob Storage, Copilot Studio, and Azure AI Foundry for ModelOps and intelligent actions.
  • Experience with AI deployment and orchestration tools such as Kubernetes (via AKS), n8n, LangChain, and Microsoft Copilot Platform (MCP) for containerized agents, multi-step workflows, and governance.
  • Strong experience in developing and managing API endpoints, integrating with external systems, and supporting LLM access for conversational AI and automation.
  • Proficiency in Java or Scala for large-scale data processing, ingestion workflows, and custom AI integrations.
  • Experience supporting data observability, quality frameworks (e.g., unit tests, reconciliation logic, job monitoring), and AI governance (e.g., metadata embedding, compliance rules); a brief illustrative sketch follows this list.
  • Strong familiarity with Git-based development, GitHub Copilot for AI-assisted coding, and structured code collaboration in environments like DBT Cloud and GitHub Actions.
  • A self-starter mindset: the ability to act quickly and independently, learn new tools and technologies on the fly, and deliver scalable solutions using any combination of tools in our tech stack to drive continuous improvement and impact.
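
To give a concrete flavor of the quality-framework experience described above, here is a minimal sketch of a row-count reconciliation check in PySpark. The table names, tolerance, and print-based alerting are illustrative assumptions only, not a description of our actual pipelines.

```python
# Minimal, hypothetical reconciliation check between a source extract and
# its loaded target table; names and tolerance are illustrative only.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("reconciliation_check").getOrCreate()

def reconcile_row_counts(source_table: str, target_table: str,
                         tolerance: float = 0.0) -> bool:
    """Return True if the target row count is within `tolerance`
    (as a fraction) of the source row count."""
    source_count = spark.table(source_table).count()
    target_count = spark.table(target_table).count()
    drift = abs(source_count - target_count) / max(source_count, 1)
    if drift > tolerance:
        # A production job would route this to monitoring/alerting
        # rather than printing.
        print(f"RECONCILIATION FAILURE: {source_table}={source_count}, "
              f"{target_table}={target_count}, drift={drift:.2%}")
        return False
    return True

# Hypothetical usage: compare a raw landing table with its modeled output.
reconcile_row_counts("raw.ad_spend", "analytics.ad_spend", tolerance=0.001)
```

In practice, a check like this would run alongside schema, null-rate, and freshness validations as part of an automated job-health suite.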

Nice To Haves

  • Exposure to building tools in Microsoft Power Apps or other low-code platforms, including Copilot integrations for monitoring and workflows.
  • Experience in advertising, marketing, or digital media environments, particularly with use cases like performance reporting, reconciliation automation, or brand visibility optimization.

Responsibilities

  • Build, scale, and maintain robust data pipelines and models using DBT, Python, PySpark, Databricks, and SQL across cloud platforms, with a focus on integrating AI-first foundations and semantic layers for consistent data interpretation.
  • Design, develop, and manage semantic models, ontologies, taxonomies, knowledge graphs, and business glossaries using tools like DBT YAML, GitHub, Databricks Unity Catalog, Microsoft Fabric/OneLake, and Power BI to ensure unified data understanding, contextual responses, and enhanced AI reasoning.
  • Utilize and manage low-code/no-code data transformation and visualization tools such as Trifacta (Alteryx), DBT, Power BI, Tableau, Microsoft Fabric/OneLake, and Copilot Studio to enable governed, scalable semantic layers that support natural language querying, vector search, and hybrid AI indexing.
  • Help build AI deployment pipelines, including containerized AI agents and workflow automation using Kubernetes, n8n, LangChain, Azure AI Foundry, and Microsoft Copilot Platform (MCP), to orchestrate multi-step processes like retrieval, summarization, recommendations, and proactive notifications.
  • Strengthen AI accuracy and governance by implementing metadata, sensitivity rules, access controls, and grounding mechanisms (e.g., combining vector databases, AI search indexes, and knowledge graphs) to enable reliable, compliant responses and address "why" questions through intelligent reasoning and source citation.
  • Design modular, reusable, and documented data models in support of analytics, reporting, AI enablement, and agentic applications, including integration with LLMs for intent parsing, routing, retrieval, and synthesis.
  • Develop and monitor mapping tables, validation rules, lineage tracking, automated error logging, and observability mechanisms for ETL/ELT job health, row-level integrity checks, schema/version control, and real-time data quality monitoring.
  • Collaborate with analysts, engineers, and business leads to translate raw data into governed, insight-ready datasets, leveraging tools like Adverity for multi-source integration and normalization.
  • Implement agentic AI and Copilot integrations into data processes to enhance accessibility and enable autonomous issue resolution.
  • Drive innovation across our Data Quality Suite and our roadmap, including real-time metrics monitoring, dynamic mapping interfaces, self-serve correction tools, and AI-enhanced features for scalability and ROI.
  • Contribute to the medallion data architecture (bronze/silver/gold) and define best practices for reusable data components, semantic layer extension (e.g., indexing unstructured knowledge for RAG), and AI infrastructure across the organization; a rough sketch of the medallion layering follows this list.
  • Integrate with and manage Databricks Unity Catalog, Databricks Workflows, SQL Analytics, Notebooks, and Jobs for governed analytics and ML workflows.
  • Develop and manage data pipelines and tools with Microsoft Fabric, Power BI, Power Apps, Azure Data Lake, Azure Blob Storage, and Copilot Studio, ensuring seamless ties to GitHub, n8n, and Kubernetes for orchestration.
  • Leverage GitHub and GitHub Copilot for version control, workflow automation, CI/CD deployments, code suggestions, and collaboration on SQL, Python, YAML, and agent development.
  • Utilize Java or Scala to support custom processing scripts, scalable ingestion frameworks, and advanced AI actions like code execution or vector search.
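
As a rough illustration of the medallion (bronze/silver/gold) layering referenced above, the PySpark sketch below moves hypothetical ad-event data through the three layers. All paths, schemas, and column names are assumptions for illustration; a production version would run as a governed Databricks Workflow with Unity Catalog tables rather than raw storage paths.

```python
# Hypothetical bronze -> silver -> gold flow; paths and columns are
# illustrative assumptions, not an actual dentsu pipeline.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion_sketch").getOrCreate()

# Bronze: land the raw source as-is, stamping ingestion time for lineage.
bronze = (spark.read.json("/landing/ad_events/")
          .withColumn("_ingested_at", F.current_timestamp()))
bronze.write.format("delta").mode("append").save("/medallion/bronze/ad_events")

# Silver: deduplicate, conform types, and drop malformed rows.
silver = (spark.read.format("delta").load("/medallion/bronze/ad_events")
          .dropDuplicates(["event_id"])
          .withColumn("event_date", F.to_date("event_ts"))
          .filter(F.col("event_id").isNotNull()))
silver.write.format("delta").mode("overwrite").save("/medallion/silver/ad_events")

# Gold: aggregate into an insight-ready table for reporting and AI use.
gold = (silver.groupBy("campaign_id", "event_date")
        .agg(F.count("*").alias("events"),
             F.sum("spend").alias("total_spend")))
gold.write.format("delta").mode("overwrite").save("/medallion/gold/campaign_daily")
```

Each layer adds trust: bronze preserves raw history, silver enforces quality rules, and gold exposes governed, analysis-ready datasets.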

Benefits

  • Medical, vision, and dental insurance
  • Life insurance
  • Short-term and long-term disability insurance
  • 401k
  • Flexible paid time off
  • At least 15 paid holidays per year
  • Paid sick and safe leave
  • Paid parental leave