Senior AI Integration Developer

Peraton — Red Bank, NJ

About The Position

Peraton Labs is seeking a Senior AI Integration Developer to lead the design and implementation of an AI assistant capability within an existing web application supporting RF spectrum monitoring for the Department of Defense. This is a technically demanding role at the intersection of applied AI, software engineering, and operational tooling. The core focus of this position is the development of a context-aware AI assistant and the Model Context Protocol (MCP) server and tooling infrastructure that connects it to the application’s data, workflows, and services.

Given the sensitive nature of the operational environment, the primary deployment target is locally-hosted models (e.g., Ollama) running in air-gapped or connectivity-constrained environments, with cloud-based LLM APIs as a secondary consideration. The right candidate understands not just how to wire up a model, but how to design tool interfaces and select or tune models that perform reliably under these constraints. It is particularly important that the candidate take the time to properly understand the application domain and CONOPS in order to develop appropriate MCP tool chains.

This individual will work closely with the broader engineering team and domain stakeholders to identify high-value AI use cases, implement and iterate on MCP tools, and evaluate and improve the quality of AI-generated outputs over time. Familiarity with the full stack is also expected, as effective AI integration requires understanding the existing system the assistant will interact with. The core web application for this effort uses a FastAPI backend, a React frontend, and a PostgreSQL database.
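To make the MCP tooling concept concrete, here is a minimal sketch of the kind of tool such a server might expose to the assistant. The tool name, parameter schema, and sample data below are hypothetical illustrations, not details from this position; a real server would also implement the MCP wire protocol (e.g., via an MCP SDK or FastMCP) and query the application's PostgreSQL database rather than an in-memory list.

```python
import json

# Hypothetical MCP-style tool definition: a name, a description the model
# reads when deciding whether to call the tool, and a JSON Schema for its
# parameters. All names here are illustrative.
TOOL_SCHEMA = {
    "name": "query_detections",
    "description": "Return RF signal detections within a frequency range.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "freq_min_mhz": {"type": "number"},
            "freq_max_mhz": {"type": "number"},
            "limit": {"type": "integer", "default": 10},
        },
        "required": ["freq_min_mhz", "freq_max_mhz"],
    },
}

# Toy in-memory data standing in for the PostgreSQL-backed application data.
DETECTIONS = [
    {"freq_mhz": 98.1, "power_dbm": -42.0},
    {"freq_mhz": 446.0, "power_dbm": -67.5},
    {"freq_mhz": 2412.0, "power_dbm": -55.2},
]

def query_detections(freq_min_mhz: float, freq_max_mhz: float, limit: int = 10):
    """Handler the server would invoke when the model calls the tool."""
    hits = [d for d in DETECTIONS if freq_min_mhz <= d["freq_mhz"] <= freq_max_mhz]
    return hits[:limit]

def handle_tool_call(name: str, arguments: dict):
    """Minimal dispatch, analogous to an MCP server routing a tools/call request."""
    if name != TOOL_SCHEMA["name"]:
        raise ValueError(f"unknown tool: {name}")
    return query_detections(**arguments)

if __name__ == "__main__":
    result = handle_tool_call(
        "query_detections", {"freq_min_mhz": 400.0, "freq_max_mhz": 2500.0}
    )
    print(json.dumps(result))
```

The role described above centers on designing these tool interfaces (schemas, descriptions, error behavior) so that smaller, locally-hosted models can select tools and extract parameters reliably.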

Requirements

  • Minimum of 8 years of experience with a Bachelor's degree; 6 years with a Master's degree; or 3+ years with a PhD in Computer Science, Computer Engineering, Information Systems, or similar/related programs.
  • Experience deploying and working with locally-hosted models (e.g., Ollama, llama.cpp, or similar) in offline or restricted network environments
  • Strong understanding of the Model Context Protocol (MCP) — server design, tool schemas, and client-server communication
  • Experience with prompt engineering and system prompt design, particularly tuning prompts for the capabilities of smaller or quantized local models
  • Experience with agentic AI patterns — multi-step reasoning, tool chaining, and error recovery
  • Familiarity with model selection tradeoffs — capability, context length, quantization, and hardware requirements
  • Ability to design structured evaluation approaches for AI output quality and tool performance
  • Strong judgment about AI assistant UX — what makes a tool call well-designed, when an AI response is actually useful, etc.
  • Proficiency in Python; familiarity with FastAPI or comparable frameworks
  • Experience with Docker and containerized service development
  • Familiarity with TypeScript/Node.js for server-side development
  • Experience with React for implementing AI assistant or chat UI components
  • Experience with Git, CI/CD pipelines, and automated testing infrastructure
  • Clear communicator across technical and non-technical audiences
  • Must be a U.S. Citizen with ability to obtain/maintain a Secret clearance
  • Candidate should be local and able to work from one of our Red Bank, NJ; Basking Ridge, NJ; or Silver Spring, MD locations

Nice To Haves

  • Experience with LangChain, LangGraph, and FastMCP
  • Experience with GPU hardware performance benchmarking on constrained edge-deployed infrastructure
  • Familiarity with performance evaluation including: tool selection accuracy, parameter extraction correctness, multi-step reasoning success rates, response quality scoring, latency benchmarking, and regression testing across model versions
  • Experience fine-tuning or adapting open-weight models for domain-specific tasks
  • Familiarity with RAG (retrieval-augmented generation) architectures and vector databases in offline or on-premise deployments
  • Background in RF, spectrum management, spectrum sensing, software defined radios, propagation modeling, signal processing, or related DoD domains
  • Cybersecurity awareness in the context of AI systems and DoD environments
  • Experience with cloud-hosted LLM APIs as a secondary deployment target
  • Active Secret (or Higher) Clearance

Responsibilities

  • Design and implement MCP server and tool interfaces that expose application data and functionality to the AI assistant
  • Deploy and configure locally-hosted models (e.g., Ollama) for use in air-gapped or connectivity-constrained environments
  • Evaluate and select local models appropriate for specific assistant tasks; assess capability and performance tradeoffs across model sizes and families
  • Integrate LLM inference endpoints into the application backend and frontend, supporting both local and cloud-hosted models where applicable
  • Develop and refine system prompts, tool definitions, and context management strategies optimized for the capabilities and limitations of local models
  • Define and execute evaluation frameworks to assess AI output quality, tool call accuracy, and assistant reliability
  • Identify high-value use cases in collaboration with domain experts and stakeholders; translate them into concrete AI tool designs
  • Maintain and extend backend Python and TypeScript/Node.js services supporting AI functionality or work closely with other engineers to do so
  • Document AI architecture, tool schemas, prompt strategies, model configurations, and evaluation results
  • Stay current with the evolving local model and MCP ecosystem landscape