About The Position

We’re designing the future of enterprise AI infrastructure, grounded in agents, retrieval-augmented generation (RAG), knowledge graphs, and multi-tenant governance. We’re looking for an ML/AI Research Engineer to join our AI Lab and lead the design, training, evaluation, and optimization of agent-native AI models. You'll work at the intersection of LLMs, vector search, graph reasoning, and reinforcement learning, building the intelligence layer that sits on top of our enterprise data fabric.

This isn’t a prompt-engineering role. It’s full-cycle ML: from data curation and fine-tuning to evaluation, interpretability, and deployment, with cost awareness, alignment, and agent coordination all in scope.

Requirements

  • Deep experience fine-tuning open-source LLMs with Hugging Face Transformers, DeepSpeed, vLLM, FSDP, and LoRA/QLoRA
  • Experience with both base and instruction-tuned models; familiarity with SFT, RLHF, and DPO pipelines
  • Comfort building and maintaining custom training datasets, filters, and eval splits
  • Understanding of the tradeoffs among batch size, context window, optimizer choice, precision (FP16, bfloat16), and quantization
  • Experience building enterprise-grade RAG pipelines integrated with real-time or contextual data
  • Familiarity with LangChain, LangGraph, LlamaIndex, and open-source vector DBs (Weaviate, Qdrant, FAISS)
  • Experience grounding models in structured data (SQL, graph, metadata) as well as unstructured sources
  • Experience training or customizing agent frameworks with multi-step reasoning and memory
  • Understanding of common agent loop patterns (e.g. Plan→Act→Reflect), memory recall, and tool use
  • Familiarity with self-correction, multi-agent communication, and agent-ops logging
  • Strong background in token-cost optimization, chunking strategies, reranking (e.g. Cohere, Jina), compression, and retrieval-latency tuning
  • Experience running models in quantized (int4/int8) or multi-GPU settings with inference tuning (vLLM, TGI)

Nice To Haves

  • Experience with Neo4j, PuppyGraph, RDF, OWL, or other semantic modeling systems

Responsibilities

  • Fine-tune and evaluate open-source LLMs (e.g. Llama 3, Mistral, Falcon, Mixtral) for enterprise use cases with both structured and unstructured data
  • Build and optimize RAG pipelines using LangChain, LangGraph, LlamaIndex, or Dust — integrated with our vector DBs and internal knowledge graph
  • Train agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using enterprise task data
  • Develop embedding-based memory and retrieval chains with token-efficient chunking strategies
  • Create reinforcement learning pipelines to optimize agent behaviors (e.g. RLHF, DPO, PPO)
  • Establish scalable evaluation harnesses for LLM and agent performance, including synthetic evals, trace capture, and explainability tools
  • Contribute to model observability, drift detection, error classification, and alignment
  • Optimize inference latency and GPU resource utilization across cloud and on-prem environments

Benefits

  • Competitive salary
  • Meaningful equity (founding tier)
© 2024 Teal Labs, Inc.