Senior Data / AI Engineer

IntelliTech LLC • Alexandria, VA

54d • Remote

About The Position

IntelliTech is seeking a Senior Data / AI Engineer to support a Department of War program focused on operationalizing a Government-owned digital twin application for ammunition industrial base readiness. The platform is a supply chain simulation solution built on Python, FastAPI, and React that enables analysts to model production timelines, identify bottlenecks, assess supply chain risk, and evaluate surge and modernization scenarios. This role will own the data lifecycle end to end, from raw file ingestion through validation, normalization, versioning, and delivery of run-ready artifacts to the simulation engine. The engineer will also help design and implement the AI-enabled decision-support layer, which supports natural-language analysis of scenario outputs, automated comparison and briefing generation, and guided scenario creation. This is a hands-on role on a lean, senior team. The ideal candidate is comfortable writing production code daily, designing scalable data pipelines, and working directly with Government analysts and data stakeholders to deliver mission-focused solutions.

Requirements

Bachelor’s degree in Computer Science, Data Science, Engineering, Information Systems, or a related technical discipline and 8+ years of relevant experience; or Master’s degree in a related field and 6+ years of relevant experience.
7+ years of professional experience in data engineering or data / AI engineering roles.
Strong hands-on Python development experience, including Pandas, NumPy, ETL/ELT design, data pipeline development, and asynchronous programming patterns.
Experience building data validation and quality frameworks, including schema enforcement, referential integrity, data contracts, and validation feedback mechanisms.
Experience integrating LLM APIs such as OpenAI, Anthropic, or equivalent platforms, including function calling, tool use, scoped retrieval, and prompt engineering for structured outputs.
Experience with MongoDB or other document-oriented databases, including data modeling and aggregation pipelines for analytics workloads.
Experience with Amazon S3 or other cloud object storage services, including raw, normalized, and curated data layering approaches.
Experience supporting DoD or federal Government programs.
Strong communication skills and the ability to work directly with technical and non-technical stakeholders in mission environments.

Nice To Haves

Experience with defense supply chain, logistics, manufacturing, or industrial base data.
Familiarity with Databricks, data mesh, or medallion architecture patterns such as bronze/silver/gold.
Familiarity with SimPy or discrete-event simulation data inputs and outputs.
Experience with Advana, WDP (War Data Platform), or other DoD enterprise data platforms.
Experience establishing data-sharing agreements and supporting Technical Exchange Meetings with Government source-system owners.
Knowledge of munitions-related data structures such as NIIN, CAGE, Bill of Materials (BOM) hierarchies, and production line capacity models.
Experience with Redis or other caching layers supporting analytics applications.
Experience with FastAPI or Flask backend development.
Prior experience supporting Army Cloud Environments.

Responsibilities

Data Ingestion and Automation
Design and implement governed ingestion pipelines for complex defense supply chain datasets, including Bills of Materials (BOM), demand and order backlogs, facility and production line capacity, supplier risk, and acquisition planning data.
Build validation services that enforce schema conformance, referential integrity across linked datasets, circular reference detection, and business-rule validation with actionable row- and column-level feedback.
Implement raw data preservation in object storage such as Amazon S3, including metadata capture for source type, upload timestamp, uploader identity, file checksum, and dataset version.
Develop canonical data transformation workflows that convert validated source inputs into normalized, run-ready artifacts aligned to the simulation engine’s entity model.
Implement dataset versioning and lineage tracking so each scenario run is tied to explicit input versions and assumptions.

Automated Data Refresh
Work with Government stakeholders and source-system owners to identify, prioritize, and implement automated or semi-automated data refresh paths.
Participate in Technical Exchange Meetings (TEMs) to help define data contracts, including source format, semantics, refresh cadence, and validation requirements.
Implement approved connection patterns such as scheduled file landing, secure file exchange (SFTP), API-based retrieval, and cloud-to-cloud transfer mechanisms.
Maintain hardened, controlled upload workflows in parallel so mission operations are not dependent solely on external integrations or approvals.

AI-Enabled Decision Support
Build the AI integration layer within the FastAPI backend to broker access to Government-approved hosted LLM endpoints.
Implement scoped retrieval logic that constrains AI context to approved run artifacts, simulation outputs, and post-processed analytics.
Develop natural-language Q&A capabilities that allow analysts to query scenario results such as bottlenecks, supplier risks, and differences between runs.
Build guided scenario generation workflows that translate analyst intent into structured JSON scenario configurations for user review and approval before execution.
Implement AI-assisted comparison summaries and brief-ready output generation.
Enable function calling and tool-use patterns so the model can dynamically query backend APIs for scenario comparison, bottleneck analysis, production planning, and supply chain risk.
Ensure all AI interactions are audit-logged, role-scoped, and grounded in explicit scenario artifacts.

Deterministic Analytics and Reporting
Extend existing comparison capabilities to generate structured side-by-side scenario outputs with standardized metrics and deltas.
Build reusable templates for brief-ready outputs that reduce analyst time-to-brief.
Generate reproducible comparison artifacts and store them as part of the scenario run record.

Data Quality and Performance
Implement data quality monitoring and dashboards for ingestion success rates, validation outcomes, and overall pipeline health.
Optimize data preparation and post-processing workflows to reduce end-to-end scenario runtime.
Design and implement version-bounded caching strategies for validated inputs, normalized data products, and reusable post-processing summaries.

Benefits

IntelliTech provides a comprehensive benefits package designed to support employees’ well-being and professional growth, including health, dental, and vision insurance; a 401(k); paid time off; professional development opportunities; and flexible work arrangements to support work-life balance.
