Data Architect

CGI, Strongsville, OH

About The Position

Position Description

CGI is looking for an experienced Data Architect to design, implement, and optimize scalable data architectures across enterprise data platforms. The role focuses on Big Data ecosystems, including Hadoop, Apache Kafka, Apache Iceberg, Teradata, Oracle databases, and graph databases, and supports large-scale data ingestion, processing, governance, and analytics workloads. Your future duties and responsibilities are listed in full under Responsibilities below.

Requirements

  • 8+ years of experience in data architecture, data engineering, or big data platforms.
  • Strong experience with Hadoop ecosystem (HDFS, Hive, Spark, etc.).
  • Hands-on expertise with Apache Kafka for streaming data pipelines.
  • Experience implementing Apache Iceberg or modern Lakehouse architecture.
  • Strong experience with Teradata and Oracle databases.
  • Experience designing data models and enterprise data architecture.
  • Experience with Graph databases (Neo4j, JanusGraph, TigerGraph, etc.).
  • Strong knowledge of SQL, distributed computing, and data processing frameworks.
  • Expertise in banking, insurance, and financial services.
  • Strong Agile/Scrum development experience.
  • Strong collaboration and communication skills within distributed project teams
  • Excellent written and verbal communication skills
  • Hadoop
  • Hive
  • MySQL
  • Oracle
  • Python
  • Shell Script
  • Teradata
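
The streaming-pipeline experience called for above rests on one core idea: keyed records are hashed to partitions so that all events for a given key stay in order on a single partition. A minimal plain-Python sketch of that assignment logic follows (a simplification for illustration only; Kafka's default partitioner actually uses murmur2 hashing, and the account keys here are hypothetical):

```python
# Sketch of how keyed streaming platforms such as Kafka assign records
# to partitions: a deterministic hash of the key modulo the partition
# count keeps every event for one key on the same partition.
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition deterministically."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All events for the same account land on the same partition,
# preserving per-key ordering through the pipeline.
events = ["acct-100", "acct-200", "acct-100", "acct-300"]
assignments = [partition_for(k, 8) for k in events]
assert assignments[0] == assignments[2]  # same key, same partition
```

This per-key ordering guarantee is what makes keyed ingestion pipelines safe for stateful downstream processing such as per-account aggregation.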

Responsibilities

  • Design and implement scalable, distributed data architectures leveraging Hadoop ecosystem technologies.
  • Define data Lakehouse architecture using technologies such as Apache Iceberg.
  • Develop and maintain enterprise data models, including conceptual, logical, and physical data models.
  • Architect solutions for real-time and batch data processing pipelines.
  • Build and optimize data ingestion pipelines using Apache Kafka and other streaming frameworks.
  • Design and implement data lakes and Lakehouse architecture on Hadoop-based platforms.
  • Optimize data storage formats, partitioning strategies, and query performance using Apache Iceberg.
  • Integrate Teradata, Oracle, Hadoop, and streaming platforms to support enterprise data ecosystems.
  • Lead data modernization initiatives, including migration from legacy data warehouses to modern lakehouse platforms.
  • Design and implement graph-based data models for complex relationship analysis.
  • Architect solutions using graph databases (Neo4j, TigerGraph, JanusGraph, etc.) for use cases such as fraud detection, network analysis, and recommendation engines.
  • Define data governance frameworks, metadata management, and data lineage.
  • Implement data quality, privacy, and security standards across data platforms.
  • Ensure compliance with enterprise data policies and regulatory requirements.
  • Collaborate with data engineers, analytics teams, platform engineers, and business stakeholders.
  • Provide technical leadership in data platform design and architecture decisions.
  • Mentor engineers on data architecture best practices and modern data engineering frameworks.
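
The graph-based fraud-detection use case above can be sketched in a few lines: given known-fraudulent accounts and edges for shared attributes (device, address, card), flag every account within two hops of known fraud. In production this would run in a graph database such as Neo4j via a Cypher query; the plain-Python breadth-first search below, and the account names and edges in it, are purely illustrative:

```python
# Minimal sketch of graph-based relationship analysis for fraud detection:
# flag accounts reachable within max_hops of any known-fraud seed account.
from collections import deque

def accounts_within_hops(graph, seeds, max_hops):
    """Breadth-first search outward from seed nodes, up to max_hops away."""
    distance = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return {n for n, d in distance.items() if 0 < d <= max_hops}

# Hypothetical shared-attribute edges between accounts.
graph = {
    "acct-1": ["acct-2", "acct-3"],
    "acct-2": ["acct-1", "acct-4"],
    "acct-3": ["acct-1"],
    "acct-4": ["acct-2", "acct-5"],
    "acct-5": ["acct-4"],
}
flagged = accounts_within_hops(graph, ["acct-1"], max_hops=2)
# acct-2 and acct-3 are one hop from the seed, acct-4 two hops;
# acct-5 sits three hops away and is not flagged.
```

The same traversal expressed against a real graph store scales to the network-analysis and recommendation use cases the role describes.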

Benefits

  • Competitive compensation
  • Comprehensive insurance options
  • Matching contributions through the 401(k) plan and the share purchase plan
  • Paid time off for vacation, holidays, and sick time
  • Paid parental leave
  • Learning opportunities and tuition assistance
  • Wellness and Well-being programs

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: None listed
  • Number of Employees: 5,001-10,000
