AI Platform Engineer

General Dynamics Information Technology
$127,500 - $172,500 | Onsite

About The Position

Meaningful Work and Personal Impact

As an AI Platform Engineer (LLM & MLOps), the work you’ll do at GDIT will directly support the mission of USCENTCOM. You will play a crucial role in designing, deploying, and operating secure, scalable AI inference and orchestration platforms supporting USCENTCOM’s Data Analytical Environment (DAE) and AI environment. This role focuses on platform reliability, workflow stability, and operationalizing commercial LLMs in on-premises and hybrid environments. The engineer will work with GPU-enabled Kubernetes clusters, model serving frameworks, vector databases, and secure APIs to enable Retrieval-Augmented Generation (RAG) and agent-based AI workflows. This position does not focus on model training or AI research; instead, it emphasizes execution, integration, and platform resilience, supporting the evolution of enterprise AI capabilities from foundational platforms to reusable, governed agent-based services.
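As an illustration of the GPU-enabled Kubernetes work this role centers on, a pod serving an LLM typically requests GPU capacity through the `nvidia.com/gpu` extended resource exposed by the NVIDIA device plugin. The sketch below is illustrative only; pod and namespace names are hypothetical, and a real deployment would add security contexts, quotas, and model configuration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: llm-inference        # hypothetical pod name
  namespace: ai-platform     # hypothetical tenant namespace
spec:
  containers:
    - name: tgi
      image: ghcr.io/huggingface/text-generation-inference:latest
      resources:
        limits:
          nvidia.com/gpu: 1  # schedules the pod onto a node with a free GPU
```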

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related technical field (or equivalent experience)
  • DoD Directive 8140 compliant
  • 8+ years of related experience
  • Strong experience with Kubernetes, containerization (Docker/Podman), and GPU scheduling.
  • Hands-on experience deploying LLM inference services (commercial or open-source).
  • Proficiency with Python and API development for platform services.
  • Experience integrating vector databases (e.g., FAISS, Milvus, Weaviate, OpenSearch).
  • Familiarity with MLOps toolchains (MLflow, CI/CD pipelines, artifact registries).
  • Experience operating and maintaining systems in secure DoD environments.
  • Knowledge of monitoring/logging stacks (Prometheus, Grafana, ELK/Loki).
  • Active Secret clearance required; TS/SCI (or TS/SCI eligibility) preferred
  • US citizenship required

Nice To Haves

  • Experience with RAG or agent-based AI architectures.
  • Familiarity with Kubernetes-native workflow engines (Argo, Kubeflow).
  • Exposure to cost tracking or usage metering for shared compute platforms.
  • Understanding of DoD AI governance, ethical AI, and responsible deployment.
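The cost-tracking item above can be sketched in a few lines: on a shared compute platform, usage metering often reduces to aggregating GPU-hours per tenant from usage records. All tenant names and numbers below are hypothetical:

```python
from collections import defaultdict

# Hypothetical usage records: (tenant, gpus_used, hours)
records = [
    ("team-alpha", 2, 3.0),
    ("team-bravo", 1, 8.0),
    ("team-alpha", 4, 0.5),
]

def gpu_hours_by_tenant(records):
    """Aggregate GPU-hours per tenant for chargeback/showback reports."""
    totals = defaultdict(float)
    for tenant, gpus, hours in records:
        totals[tenant] += gpus * hours
    return dict(totals)

print(gpu_hours_by_tenant(records))
# → {'team-alpha': 8.0, 'team-bravo': 8.0}
```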

Responsibilities

  • Design, deploy, and maintain GPU-enabled Kubernetes environments for AI inference and orchestration.
  • Operationalize commercial LLM inference services using frameworks such as Text Generation Inference (TGI), KServe, FastChat, Triton, or similar.
  • Integrate vector databases and knowledge repositories to support RAG and graph-augmented LLM workflows.
  • Build and maintain secure REST APIs for AI job submission, inference requests, and workflow orchestration.
  • Implement MLOps and platform lifecycle practices, including model versioning, containerization, CI/CD, and reproducibility.
  • Enforce multi-tenant isolation, RBAC, namespace quotas, and resource controls across teams.
  • Implement monitoring, logging, and alerting for AI services, GPU utilization, and workflow health.
  • Support secure deployment in air-gapped, on-prem, and hybrid environments, adhering to DoD security requirements.
  • Collaborate with platform, automation, and data teams to align AI capabilities with mission workflows.
  • Support prompt, rule, and heuristic-based agents by ensuring reliable inference, retrieval, and context delivery.
  • Maintain conversation-aware context pipelines used for tagging and classification agents.
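At their core, the RAG-support responsibilities above reduce to a retrieve-then-prompt loop: embed a query, find the most similar documents in a vector store, and feed them to the LLM as context. A minimal, stdlib-only sketch of the retrieval step, with toy 3-dimensional vectors standing in for real embeddings (a production system would use FAISS, Milvus, or a similar vector database):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "vector database": document title -> hypothetical embedding.
index = {
    "CENTCOM AOR overview": [0.9, 0.1, 0.0],
    "Kubernetes GPU scheduling": [0.1, 0.9, 0.2],
    "Mess hall menu": [0.0, 0.1, 0.9],
}

def retrieve(query_vec, k=2):
    """Return the top-k documents most similar to the query embedding."""
    ranked = sorted(index, key=lambda doc: cosine(query_vec, index[doc]),
                    reverse=True)
    return ranked[:k]

# A query embedding close to the Kubernetes document ranks it first:
print(retrieve([0.2, 0.8, 0.1]))
```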

Benefits

  • Growth: AI-powered career tool that identifies career steps and learning opportunities
  • Support: An internal mobility team focused on helping you achieve your career goals
  • Rewards: Comprehensive benefits and wellness packages, 401K with company match, competitive pay and paid time off
  • Community: Award-winning culture of innovation and a military-friendly workplace