Applied AI Research Engineer

Code Metal•Boston, MA

Hybrid

About The Position

We're building next-generation AI systems that help military planners explore, compare, and evaluate operational courses of action. Our work combines frontier language models, simulation, planning, and verification into human-in-the-loop decision-support systems for defense applications.

As an Applied AI Research Engineer, you'll focus on human-machine teaming and agentic AI, building systems that let warfighters, planners, analysts, and decision-makers explore operational choices with speed, confidence, and control. This role is about designing and building agentic AI systems, not chatbots. You'll develop multi-agent workflows, fine-tune and evaluate models, build retrieval pipelines, experiment with post-training techniques, and integrate AI with simulation and planning software.

You'll work closely with AI researchers, software engineers, and defense experts to turn research ideas into production-ready capabilities. The goal is to make complex planning, wargaming, adjudication, and analysis workflows faster, more explainable, and more trustworthy.

Requirements

Bachelor's or Master's degree in Computer Science, Machine Learning, Engineering, Mathematics, Physics, or a related technical field, or equivalent practical experience.
3+ years building AI, machine learning, or applied research systems.
Strong Python engineering skills.
Experience with PyTorch and modern LLM tooling (Transformers, vLLM, Hugging Face, etc.).
Experience building or deploying agentic AI systems, tool-calling workflows, or multi-step reasoning pipelines.
Experience fine-tuning, evaluating, or serving language models.
Experience with retrieval-augmented generation, embeddings, vector search, or knowledge retrieval systems.
Strong understanding of experiment design, benchmarking, and model evaluation.
Ability to move quickly from research prototype to production-quality implementation.
Eligible to obtain a U.S. security clearance.

Responsibilities

Design and build agentic AI systems for planning, decision support, and human-machine teaming
Develop AI pipelines that integrate foundation models, retrieval, simulation, external tools, and deterministic software
Design, run, and analyze experiments to evaluate model and agent performance, reliability, traceability, latency, cost, and user trust
Fine-tune, distill, and evaluate foundation models for domain-specific planning, reasoning, and decision-support tasks
Build datasets, retrieval pipelines, automated benchmarks, and experiment infrastructure to support continuous model improvement and reproducible research
Partner with software engineers to transition research prototypes into scalable AI services
Collaborate with domain experts to translate operational workflows into AI-enabled capabilities while ensuring AI outputs remain explainable, reviewable, and under human control

Benefits

Health care plan with 100% premium coverage, including medical, dental, and vision
401(k) with 5% matching
Paid time off (uncapped vacation, plus sick leave and public holidays)
Flexible hybrid or remote work arrangement
Relocation assistance for qualifying employees
