The student will curate a structured domain knowledge corpus drawn from technical references, historical sensor logs, incident reports, and operational procedures, and will generate cross-modal embeddings to populate a vector index that serves as the retrieval substrate for the analyst. They will then build a retrieval-augmented generation (RAG) pipeline that conditions large multimodal models on evidence retrieved from this corpus, designing and evaluating dense, sparse, hybrid, and cross-modal retrieval strategies, the last of which lets a query in one modality surface supporting evidence in another. A major part of their work will be benchmarking the pipeline on anomaly detection, contextual interpretation, and cross-modal reasoning tasks. This includes analyzing how chunking, retriever quality, and retrieved-context configuration affect end-to-end accuracy, faithfulness, hallucination rates, and the marginal value of retrieval over a non-augmented baseline, with the goal of identifying which retrieval-augmented approaches are most effective for mission-relevant monitoring scenarios.
Job Type: Full-time
Career Level: Entry Level
Education Level: No Education Listed