Data Engineering Lead

Ignite IT
Suitland-Silver Hill, MD

About The Position

The Data Engineering Lead is responsible for designing and implementing modern, scalable data architectures to support migration of legacy, file-based analytical systems to AWS Cloud Native environments. This role leads the transformation of legacy SAS-based data storage models—including flat files, batch outputs, and subsystem-specific data artifacts—into structured, governed, and scalable data models optimized for cloud-native processing. The Data Engineering Lead will ensure data integrity, performance, and visibility across a system-of-systems modernization initiative, while providing technical leadership for data modeling, ingestion patterns, validation frameworks, and transparency reporting. Expert-level proficiency in Python and strong experience designing AWS-based data architectures are required.

Requirements

  • 8+ years of experience in data engineering or data architecture.
  • Expert-level proficiency in Python for data engineering.
  • Demonstrated experience transforming legacy file-based systems into cloud-native data architectures.
  • Experience developing data models for high-volume, data-intensive applications.
  • Deep experience with AWS data services (Glue, Lambda, S3, Aurora/Postgres, EventBridge, etc.).
  • Experience designing scalable ETL/ELT pipelines.
  • Experience building analytical dashboards (e.g., QuickSight or equivalent).
  • Experience implementing automated data validation and quality controls.
  • Experience working on Agile Scrum teams.
  • U.S. Citizenship required.

Nice To Haves

  • Experience modernizing SAS-based data environments.
  • Experience supporting system-of-systems integration programs.
  • Experience implementing data lineage and metadata management.
  • Experience operating in regulated or federal environments.

Responsibilities

Legacy Data Discovery & Data Model Transformation

  • Participate in structured system inventory efforts to document:
      - Legacy file-based storage structures
      - SAS dataset dependencies
      - Subsystem data flows
      - Manual gating and handoff processes
  • Analyze legacy storage models and design target-state data models aligned to AWS Cloud Native architecture.
  • Replace file-driven batch dependencies with:
      - API-based ingestion
      - Event-driven workflows
      - Database-backed storage (e.g., Aurora/Postgres)
  • Define canonical data schemas and transformation standards.
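The file-to-event migration pattern above can be sketched in Python. This is a minimal, hedged illustration only: the event shape follows the standard S3 ObjectCreated notification, but the canonical schema, its field names, and the injected `get_object` reader are hypothetical stand-ins (in production the reader would be a boto3 S3 client, and the canonical rows would be loaded into Aurora/Postgres rather than returned).

```python
import csv
import io

# Hypothetical canonical schema for a migrated SAS flat file; the field
# names and types are illustrative assumptions, not the program's schema.
CANONICAL_SCHEMA = {"record_id": int, "amount": float, "region": str}

def to_canonical(raw_row: dict) -> dict:
    """Coerce one raw CSV row into the canonical schema, failing loudly."""
    return {field: cast(raw_row[field]) for field, cast in CANONICAL_SCHEMA.items()}

def handle_object_created(event: dict, get_object) -> list[dict]:
    """Sketch of an S3 ObjectCreated handler: read the newly dropped file
    and emit canonical rows ready for a database insert.

    `get_object(bucket, key) -> str` is injected so the transformation
    logic is testable without AWS; in production it would wrap boto3.
    """
    rows = []
    for rec in event["Records"]:
        bucket = rec["s3"]["bucket"]["name"]
        key = rec["s3"]["object"]["key"]
        body = get_object(bucket, key)
        for raw in csv.DictReader(io.StringIO(body)):
            rows.append(to_canonical(raw))
    return rows
```

Because the handler takes the object reader as a parameter, the same transformation code runs unchanged in a Lambda, a Glue job, or a local test harness.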
Cloud-Native Data Architecture Design

  • Architect scalable AWS data pipelines using services such as S3, Glue, Lambda, EventBridge, SNS/SQS, Aurora/Postgres, AWS Batch, and Athena.
  • Design data ingestion, staging, transformation, and validation workflows.
  • Establish schema management, versioning, and data lineage practices.
  • Optimize data storage for performance, scalability, and cost efficiency.
  • Support serverless and containerized data processing architectures.
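One way to make the schema-versioning practice above concrete is an explicit upgrade chain. This is a sketch under assumed conventions: the version numbers, the `region` default, and the row shape are hypothetical examples, not the program's actual schema history.

```python
# Illustrative schema-version upgrade chain: each stored row carries a
# schema version, and loaders upgrade old rows step by step before use.
CURRENT_VERSION = 3

UPGRADES = {
    1: lambda row: {**row, "region": row.get("region", "unknown")},  # v1 -> v2: add region
    2: lambda row: {**row, "amount": float(row["amount"])},          # v2 -> v3: numeric amount
}

def upgrade_row(row: dict, version: int) -> dict:
    """Apply each migration in order until the row reaches CURRENT_VERSION."""
    while version < CURRENT_VERSION:
        row = UPGRADES[version](row)
        version += 1
    return row
```

Keeping every historical migration in one registry means any archived extract, whatever its vintage, can be replayed into the current model deterministically.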
Expert Python-Based Data Engineering

  • Develop advanced Python-based data transformation and validation pipelines.
  • Implement modular, reusable data processing components.
  • Optimize large-scale data manipulation for distributed execution.
  • Develop high-performance ETL/ELT frameworks.
  • Embed automated validation checks directly into data pipelines.
  • Expert-level Python proficiency is required, particularly for:
      - High-volume data processing
      - Data validation logic
      - Modular data engineering frameworks
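The modular, embedded-validation style described above can be sketched as small composable check builders; the specific checks and field names below are illustrative assumptions, not part of the program.

```python
def require_fields(*fields):
    """Build a check that flags rows missing any required field."""
    def check(row):
        missing = [f for f in fields if row.get(f) in (None, "")]
        return f"missing fields: {missing}" if missing else None
    return check

def positive(field):
    """Build a check that flags rows whose numeric field is not > 0."""
    def check(row):
        try:
            return None if float(row[field]) > 0 else f"{field} not positive"
        except (KeyError, TypeError, ValueError):
            return f"{field} not numeric"
    return check

def validate(rows, checks):
    """Run every check against every row and return (index, message)
    failures, so bad records are surfaced inside the pipeline rather
    than silently dropped."""
    return [(i, msg) for i, row in enumerate(rows)
            for msg in (c(row) for c in checks) if msg]
```

Each check is a plain callable, so the same library of checks can be reused across pipelines and extended without touching the validation driver.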
Data Accuracy, Validation & Visibility

  • Design and implement automated data validation frameworks to ensure:
      - Functional equivalence during migration
      - Record-level and aggregate-level consistency
      - Downstream compatibility across subsystems
  • Develop dashboards and reporting mechanisms providing:
      - Data accuracy metrics
      - Pipeline health indicators
      - Variance detection summaries
  • Enable transparency into data transformation impacts across modernization phases.
  • Support regression validation through golden datasets and automated comparisons.
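A golden-dataset regression check like the one above can be sketched as a record-level plus aggregate-level diff. The key and value field names here are hypothetical defaults, not the program's real identifiers.

```python
def compare_to_golden(migrated, golden, key="record_id", value_field="amount", tol=1e-9):
    """Diff a migrated extract against a golden dataset: report missing,
    extra, and changed records, plus an aggregate (sum) equivalence check
    suitable for feeding a variance-detection dashboard."""
    gold = {r[key]: r for r in golden}
    new = {r[key]: r for r in migrated}
    return {
        "missing": sorted(gold.keys() - new.keys()),
        "extra": sorted(new.keys() - gold.keys()),
        "changed": sorted(k for k in gold.keys() & new.keys() if gold[k] != new[k]),
        "aggregate_ok": abs(sum(r[value_field] for r in migrated)
                            - sum(r[value_field] for r in golden)) <= tol,
    }
```

Running this after each migration phase gives both the record-level and aggregate-level consistency evidence the posting calls for, in a shape that rolls up directly into accuracy metrics.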
System-of-Systems Data Coordination

  • Coordinate with Senior Developers and Requirements Engineers to align data models with application modernization.
  • Ensure upstream/downstream data contract stability.
  • Prevent data thrashing during phased migration.
  • Support orchestration of gated workflows through automated triggers rather than manual file exchanges.
  • Collaborate across workstreams to establish shared data standards.
DevSecOps & Governance Alignment

  • Integrate data pipelines into CI/CD frameworks.
  • Support infrastructure-as-code alignment (collaboration on Terraform/CloudFormation).
  • Ensure compliance with security controls (IAM, encryption, key management).
  • Produce documentation supporting:
      - Architecture review boards
      - Interface control documents
      - Data flow diagrams
  • Provide data validation evidence supporting ATO (Authority to Operate) processes.

Benefits

  • 401(k) with matching, 100% vested
  • Health insurance (3 plans to choose from)
  • Dental insurance
  • Vision insurance
  • Health savings account
  • Life insurance
  • Short-term disability
  • Long-term disability
  • AD&D
  • Paid time off
  • Professional development assistance
  • Training
  • Tuition reimbursement
  • Flexible schedule
  • Flexible spending account
  • Referral program
  • Paid legal plan
  • And more