AI Research Engineer

Dropzone AI
Remote

About The Position

We are seeking a Senior to Principal-level AI Research Engineer to lead the design and development of next-generation agentic AI systems. This role sits at the intersection of research and production, with a strong emphasis on agent architecture design, harness and memory engineering, and robust evaluation and benchmarking of model and agent performance. You will work closely with product and engineering teams to translate cutting-edge research into scalable, real-world systems. In this role, you will directly shape the core intelligence layer of Dropzone AI. Your work will define how our agents reason, remember, and improve over time, influencing both our product capabilities and the broader direction of applied AI systems.

Requirements

  • 5+ years in software engineering, including at least 1 year applying GenAI in production
  • Proven experience building or researching agent frameworks or tool-using LLMs
  • Proven experience building or researching memory and retrieval systems (RAG, vector DBs, hybrid retrieval)
  • Expert Python developer
  • Familiarity with OpenClaw and Claude Code harness architecture
  • Early-stage startup mindset: you thrive on ambiguity and execute at lightspeed

Nice To Haves

  • Experience with agent orchestration frameworks (LangGraph, AutoGen, custom systems)
  • Familiarity with AI safety guardrails, hallucination mitigation, and structured output enforcement
  • Experience designing LLM evals (offline + online, human-in-the-loop, synthetic data)
  • Publications or open-source contributions in relevant areas
  • Experience applying the latest context and harness engineering techniques to customer-facing products
  • Experience as a founder or early-stage engineer (first 10 engineers), or in standing up a new technology bet within a more established company

Responsibilities

  • Design and implement advanced multi-step reasoning agents (tool use, planning, reflection, self-improvement loops)
  • Develop frameworks for multi-agent coordination and task decomposition
  • Improve reliability, latency, and cost efficiency of agent execution
  • Architect short-term and long-term memory subsystems (episodic, semantic, retrieval-based, hybrid)
  • Build mechanisms for context compression, retrieval, and grounding
  • Explore novel approaches to continual learning and state persistence
  • Define and implement evaluation frameworks for agent performance (task success, reasoning quality, robustness)
  • Build automated eval pipelines (synthetic data, adversarial testing, regression testing)
  • Establish metrics and benchmarks for agent reliability in production
  • Translate the latest research ideas from the community into production-grade systems
  • Run experiments, analyze results, and iterate quickly
  • Contribute to internal knowledge sharing and technical direction

Benefits

  • Company-paid health insurance
  • 401(k) plan with employer match
  • Self-managed PTO
  • Parental leave