About The Position

We’re designing the future of enterprise AI infrastructure, grounded in agents, retrieval-augmented generation (RAG), knowledge graphs, and multi-tenant governance. We’re looking for an ML/AI Research Engineer to join our AI Lab and lead the design, training, evaluation, and optimization of agent-native AI models. You'll work at the intersection of LLMs, vector search, graph reasoning, and reinforcement learning, building the intelligence layer that sits on top of our enterprise data fabric.

This isn’t a prompt-engineering role. It’s full-cycle ML: from data curation and fine-tuning to evaluation, interpretability, and deployment, with cost awareness, alignment, and agent coordination all in scope.

Requirements

  • Deep experience fine-tuning open-source LLMs with Hugging Face Transformers, DeepSpeed, vLLM, FSDP, and LoRA/QLoRA
  • Experience with both base and instruction-tuned models; familiarity with SFT, RLHF, and DPO pipelines
  • Comfort building and maintaining custom training datasets, filters, and eval splits
  • Understanding of the tradeoffs among batch size, context window, optimizer choice, precision (FP16, bfloat16), and quantization
  • Experience building enterprise-grade RAG pipelines integrated with real-time or contextual data
  • Familiarity with LangChain, LangGraph, LlamaIndex, and open-source vector DBs (Weaviate, Qdrant, FAISS)
  • Experience grounding models in structured data (SQL, graph, metadata) as well as unstructured sources
  • Experience training or customizing agent frameworks with multi-step reasoning and memory
  • Understanding of common agent loop patterns (e.g. Plan→Act→Reflect), memory recall, and tool use
  • Familiarity with self-correction, multi-agent communication, and agent-ops logging
  • Strong background in token-cost optimization, chunking strategies, reranking (e.g. Cohere, Jina), compression, and retrieval-latency tuning
  • Experience running models in quantized (int4/int8) or multi-GPU settings with inference tuning (vLLM, TGI)

Nice To Haves

  • Experience with Neo4j, PuppyGraph, RDF, OWL, or other semantic modeling systems

Responsibilities

  • Fine-tune and evaluate open-source LLMs (e.g. Llama 3, Mistral, Falcon, Mixtral) for enterprise use cases with both structured and unstructured data
  • Build and optimize RAG pipelines using LangChain, LangGraph, LlamaIndex, or Dust — integrated with our vector DBs and internal knowledge graph
  • Train agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using enterprise task data
  • Develop embedding-based memory and retrieval chains with token-efficient chunking strategies
  • Create reinforcement learning pipelines to optimize agent behaviors (e.g. RLHF, DPO, PPO)
  • Establish scalable evaluation harnesses for LLM and agent performance, including synthetic evals, trace capture, and explainability tools
  • Contribute to model observability, drift detection, error classification, and alignment
  • Optimize inference latency and GPU resource utilization across cloud and on-prem environments

Benefits

  • Competitive salary
  • Meaningful equity (founding tier)
© 2024 Teal Labs, Inc.