Lead Data Architect

IBR•Suitland, MD

Onsite

About The Position

At Imagine Believe Realize, LLC, we are driven by innovation, transformation, and a relentless pursuit of excellence. As an industry leader delivering impactful results, we thrive on solving complex technical challenges and developing cutting-edge solutions that empower our customers and advance critical missions. IBR is a fast-growing company fueled by passion, curiosity, and innovative thinking, where every team member has the opportunity to continuously learn, unlock their full potential, and redefine what is possible in engineering and technology. If you are inspired by innovation, eager to make a difference, and ready to bring your creativity and expertise to a mission-focused team, we invite you to join us. Together, we will shape the future. Let’s Make It Real!

Requirements

14+ years of IT experience focusing on enterprise data architecture and management
Must be able to obtain and maintain a Public Trust security clearance
Bachelor’s degree required
Experience in Conceptual/Logical/Physical Data Modeling to define how data is stored, processed, and accessed.
Expertise in Relational and Dimensional Data Modeling, OLTP and OLAP workloads, NoSQL Databases, Time-Series Databases, and Graph Databases
Strong experience with data extraction, cleaning, and transformation (ETL) processes.
Expertise in statistical modeling, machine learning algorithms, and data mining techniques.
Experience with data processing frameworks, orchestration, Structured Streaming, Data Lake and Delta Lake concepts, and Delta Live Tables.
Expertise in Spark/Python/Databricks, Data Lake and SQL
Experience leading and architecting enterprise-wide initiatives, specifically system integration, data migration, transformation, data warehouse build, data mart build, and data lake implementation/support.
Advanced level understanding of streaming data pipelines and how they differ from batch systems
Ability to formalize concepts for handling late data, defining windows, and managing data freshness
Advanced understanding of ETL and ELT, and of ETL/ELT tools such as SSIS, Pentaho, Data Migration Service, etc.
Understanding of concepts and implementation strategies for different incremental data loads such as tumbling window, sliding window, high watermark, etc.
Familiarity and/or expertise with Great Expectations or other data quality/data validation frameworks
Understanding of streaming data pipelines and batch systems
Familiarity with concepts such as late data, defining windows, and how window definitions impact data freshness
Experience with data lineage (technical, business, operational) and observability
Experience developing open archive solutions based on Parquet.
Advanced level SQL experience (Joins, Aggregation, Windowing functions, Common Table Expressions, RDBMS schema design, Postgres performance optimization, and caching)
Indexing and partitioning strategy experience.
Experience with large-scale, high-performance enterprise big data application deployment and solutioning.
Understanding of how to create DAGs to define workflows
Experience working with JSON and defining JSON Schemas
Experience setting up and managing Confluent/Kafka topics and ensuring Kafka performance
Familiarity with Schema Registry, message formats, and data storage options such as Avro, ORC, Parquet, etc.
Experience with table formats and lakehouse technologies, including Apache Iceberg (schema evolution, large-scale analytics, and ACID), Delta Lake, and Apache Hudi
Understanding of how to manage ksqlDB SQL files and migrations, and of Kafka Streams
Experience with incremental ingestion of data using Batch Ingestion Patterns, Streaming & Event-Driven Architecture, Change Data Capture (CDC), and API-Based Integration.
Experience with data governance, security, compliance, metadata management, and data cataloging.
Ability to thrive in a team-based environment
Experience briefing the benefits and constraints of technology solutions to technology partners, stakeholders, team members, and senior management

Nice To Haves

Familiarity with CI/CD pipelines, containerization, and pipeline orchestration tools such as Airflow, Prefect, etc.
Architecture experience in an AWS environment is a bonus
Familiarity with Kinesis and/or Lambda is a bonus, specifically how to push and pull data, how to use AWS tools to view data in Kinesis streams, and how to process massive data at scale
Experience with Docker, Jenkins/GitLab, and CloudWatch
Ability to write and maintain Jenkins files for supporting CI/CD pipelines
Experience configuring and optimizing AWS Lambda functions
Experience working with DynamoDB to query and write data
Experience with S3
Knowledge of Python (Python 3 desired) for CI/CD pipelines is a bonus
Familiarity with pytest and unittest is a bonus

Responsibilities

Develop conceptual, logical, and physical data models to define how data is stored, processed, and accessed.
Identify the strategy, tooling, and governance for implementing a common Enterprise Metadata Repository.
Create and maintain database architectures in alignment with business requirements, ensuring integrity, scalability, and performance.
Design solutions to integrate data from multiple internal and external sources, enabling a unified view for business use.
Establish policies, procedures, and standards for data quality, security, and regulatory compliance.
Implement measures to protect sensitive data from unauthorized access and ensure regulatory compliance.
Work closely with data engineers, analysts, scientists, and business stakeholders to ensure data solutions meet organizational needs.
Oversee migration from legacy systems, optimize existing data systems, and monitor performance, leveraging automation where possible.
Prepare architecture reports and maintain documentation for management and technical teams.
Collaborate with software developers, system architects, and business analysts to develop, verify, and optimize data science solutions, including AI/ML-based ones.
Lead and mentor a team of data scientists and data engineers, providing technical guidance and support.

Benefits

Nationwide medical, dental, and vision insurance
3 weeks of Paid Time Off and 11 Paid Federal Holidays
401k matching
Life Insurance, Short-Term Disability, and Long-Term Disability at no cost to our employees
Supplemental insurance options
Flexible spending accounts and Dependent Care spending accounts
Wellness incentives
Reimbursement for professional development and certifications
Access to training assistance opportunities to support career growth and progression
Hybrid and Remote work opportunities to support work-life balance