Senior LLM / RAG Engineer

Peraton • Reston, VA

About The Position

We are looking for a Senior‑Level Engineer to lead the development and sustainment of Retrieval‑Augmented Generation (RAG) AI prototypes for a national‑security mission. You will work directly with customer stakeholders to expand an existing prototype and deliver new LLM‑powered capabilities that help analysts understand and act on large volumes of proprietary data. In this role, you will combine data engineering, model serving, GPU‑based inference, and rapid application development to deliver high‑impact AI tools in a secure environment.

Requirements

Bachelor’s degree in an area relevant to the position with 12+ years of applicable experience OR a Master’s degree in an area relevant to the position with 10 years of applicable experience; an additional 4 years of applicable experience may be considered in lieu of a degree.
Active TS/SCI clearance or SCI eligibility, plus an active polygraph or the ability to obtain one
Strong hands‑on experience with AWS services (EC2, S3, IAM, container services)
Expertise with Python and Linux in operational environments
Ability to design, deploy, and manage Docker‑based workloads
Experience with GPU inference, model serving, and LLM fundamentals
Experience working with structured, semi‑structured, and unstructured data
Familiarity with CI/CD basics (Git, Jenkins, etc.)

Nice To Haves

Streamlit or similar tools for rapid UI development
Experience with vector stores (Milvus or comparable)
Familiarity with embedding generation and RAG pipeline tooling
Experience with sglang, Ray Serve, LlamaIndex, Hugging Face, or similar frameworks
AWS certifications
Knowledge of prompt engineering and evaluation best practices

Responsibilities

Maintain and extend current RAG prototypes to integrate new datasets and features
Build and optimize data ingest pipelines using Python and AWS services
Develop LLM/embedding pipelines and operate GPU inference workloads
Deploy and manage containerized services in Kubernetes‑like or Docker‑based environments
Implement vector search solutions using modern vector databases
Develop mission‑focused UIs using Streamlit or similar tools for rapid prototyping
Use tools such as sglang, Ray Serve, and LlamaIndex to operationalize LLM capabilities
Collaborate closely with analysts and mission leaders to understand use cases and rapidly iterate on prototypes
Ensure solutions follow customer security, compliance, and operational guidelines
Write technical documentation for deployments, APIs, and system behaviors