AI / ML Engineer

Guidehouse • Huntsville, AL

About The Position

Guidehouse is seeking a Lead AI/ML Engineer to join our Technology / AI and Data team, supporting mission-critical initiatives for Defense and Security clients. In this role, you will help design and operationalize advanced AI solutions that leverage large language models (LLMs), retrieval systems, and secure, scalable inference pipelines to enable data-driven decision-making. You will lead efforts to ensure models are accurate, reliable, and compliant with stringent security and regulatory standards, while collaborating with architects, engineers, and subject matter experts to deliver innovative solutions that enhance operational effectiveness. This is an opportunity to shape cutting-edge AI capabilities that directly support national security objectives.

What You Will Do:

Serves as the lead AI/ML engineer responsible for developing, optimizing, and operationalizing advanced LLM-driven workflows for the FBI adjudication platform.
Drives design and implementation of inference pipelines, RAG workflows, retrieval systems, prompt architectures, and model lifecycle processes.
Leads development of dual-path model operations supporting self-hosted open‑weight LLMs in AWS GovCloud and FedRAMP‑High managed endpoints.
Engineers GPU-based inference infrastructure, model containerization, distributed inference strategies, and performance‑optimized reasoning workflows.
Designs and maintains continuous learning systems including SFT, LoRA/QLoRA adapters, dataset curation, automated evaluation suites, hallucination detection, bias evaluation, and model drift monitoring.
Ensures models are safe, accurate, reliable, and aligned to SEAD‑4 adjudication criteria.
Ensures all model operations adhere to FedRAMP High, RMF, CJIS, and FBI ATO requirements, including controls for logging, access, explainability, evidence provenance, and data protection.

LLM Development, Inference Optimization & Prompt Engineering:

Develop and maintain LLM inference pipelines supporting long‑document reasoning, multi‑document fusion, entity extraction, anomaly detection, SEAD‑4 scoring, and structured memo generation.
Build and manage advanced prompt architectures including system prompts, instruction sets, retrieval‑augmented prompts, multi-step reasoning flows, and output‑schema enforcement to ensure accuracy and stability.
Implement distributed GPU inference frameworks (vLLM, TGI, DeepSpeed, SageMaker) and optimize workloads with KV caching, tensor parallelism, dynamic batching, and memory efficiency strategies.
Develop output‑validation routines enforcing schema correctness, key‑evidence referencing, structured scoring, and quality controls for all model‑generated adjudicative content.

Retrieval‑Augmented Generation (RAG) & Embedding Pipelines:

Implement RAG architectures including embedding generation, vector indexing, long‑context retrieval, and retrieval scoring to support evidence‑grounded outputs for 300–400‑page investigative files.
Optimize chunking strategies, ranking models, hybrid search pipelines, and retrieval heuristics to ensure accurate and contextually aligned LLM output.
Develop retrieval pipelines that reduce hallucination risk, enforce evidence provenance, and provide structured citation‑linked responses consistent with adjudication standards.

Fine‑Tuning, Model Customization & Learning Systems:

Lead development of supervised fine‑tuning (SFT) pipelines using adjudicator examples, SEAD‑4 scoring decisions, historical memos, and SME‑curated datasets.
Build LoRA/QLoRA fine‑tuning workflows for secure GovCloud environments, enabling high‑fidelity model specialization without full retraining cycles.
Design evaluation suites measuring guideline adherence, evidence alignment, factual consistency, hallucination probability, and reasoning stability across adjudicative categories.
Implement model drift detection, scoring distribution monitoring, and automated retraining triggers tied to analyst feedback and dataset evolution.

Secure ML Operations & Compliance Alignment:

Ensure ML operations align with FedRAMP High and RMF requirements, including encryption, boundary isolation, identity controls, inference logging, and auditable model‑output trails.
Establish secure input‑validation flows, restricted‑context enforcement, prompt sanitization, and runtime protections to mitigate security and data‑integrity risks.
Develop telemetry pipelines capturing query metadata, retrieval context, response confidence, scoring variances, and override patterns for audit and monitoring.

Backend Integration, Workflow Support & Platform Engineering:

Integrate LLM inference services with backend APIs, scoring engines, memo‑generation modules, entity‑resolution tools, and analyst‑facing UI workflows.
Develop supporting microservices for prompt routing, retrieval assembly, evaluation probes, model profiling, and inference orchestration.
Collaborate with backend engineers to optimize throughput, latency, concurrency, and reliability for high‑volume adjudication workflows.

Collaboration, Leadership & Mission Support:

Work with the AI Solutions Architect to maintain coherence between ML pipelines and system‑wide architecture.
Collaborate with adjudicators, SEAD‑4 SMEs, and mission stakeholders to translate adjudicative logic into prompts, features, and structured model outputs.
Mentor junior engineers, lead experimentation cycles, participate in design reviews, and contribute to Guidehouse AI/ML engineering best practices.

Requirements

An ACTIVE and MAINTAINED "TOP SECRET" Federal or DoD security clearance, and the ability to obtain and maintain a TS/SCI clearance.
Minimum of eight (8) years of experience in AI/ML engineering, with 4+ years focused on NLP, LLMs, or MLOps.
Bachelor's degree, or four (4) additional years of experience in lieu of a degree.
Expertise in PyTorch, HuggingFace Transformers, vLLM, DeepSpeed, or equivalent frameworks.
Strong background in retrieval systems, embeddings, RAG pipelines, vector databases, and long‑context optimization.
Experience implementing MLOps workflows, evaluation frameworks, drift detection, and responsible‑AI safeguards.
Experience delivering ML systems in secure federal environments subject to FedRAMP High or RMF controls.

Nice To Haves

Experience supporting adjudication, continuous vetting, background investigations, or SEAD‑4 scoring workflows.
Experience deploying open‑weight LLMs in GovCloud or secure enclaves.
Experience with citation‑grounding pipelines, evidence‑verification workflows, or structured model‑output evaluation.
AWS Machine Learning Specialty, Solutions Architect Professional, or GPU Compute certifications.
Experience with explainability tooling, guardrails, reasoning verification, or adversarial evaluation.

Benefits

Medical, Rx, Dental & Vision Insurance
Personal and Family Sick Time & Company Paid Holidays
Parental Leave
401(k) Retirement Plan
Group Term Life and Travel Assistance
Voluntary Life and AD&D Insurance
Health Savings Account, Health Care & Dependent Care Flexible Spending Accounts
Transit and Parking Commuter Benefits
Short-Term & Long-Term Disability
Tuition Reimbursement, Personal Development, Certifications & Learning Opportunities
Employee Referral Program
Corporate Sponsored Events & Community Outreach
Care.com annual membership
Employee Assistance Program
Supplemental Benefits via Corestream (Critical Care, Hospital Indemnity, Accident Insurance, Legal Assistance and ID theft protection, etc.)
Position may be eligible for a discretionary variable incentive bonus