About The Position

e:cue is a fast-paced, high-growth startup building custom AI analysts for leaders in marketing, finance, and revenue. Our platform combines production-grade application services, cloud infrastructure, and agent systems that power high-stakes business decisions.

We're looking for a senior engineer who can take work from ticket to outcome: scope it, build it, ship it, and own it in production. This role owns core parts of the agent stack, deciding how agents plan and execute, how they interact with data, and how we evaluate and improve them over time.

You'll work across:

  • Agent systems: planning, tool use, multi-agent orchestration, long-context workflows
  • Backend and infrastructure: agent services, data pipelines, and observability
  • Evaluation and post-training: designing evaluation harnesses, feedback loops, and datasets, and improving agent behavior

Requirements

  • Experience building or working with LLM-powered systems
  • Familiarity with agents, tool use, or structured reasoning systems
  • Experience with ML evaluation systems for ambiguous objectives
  • Ability to own problems end-to-end
  • Strong product intuition

Nice To Haves

  • Experience with ML systems or training workflows: fine-tuning (SFT, DPO, RLHF, etc.), dataset construction, and evaluation pipelines
  • Experience building agent frameworks for tool-using LLMs in long-context or retrieval-heavy workflows
  • Familiarity with modern inference, frontier APIs, and serving stacks (vLLM, SGLang, or similar)
  • Experience owning large systems independently at a startup

Responsibilities

  • Design and build production agent systems: tool execution frameworks (MCP servers, sandbox environments, tool architectures); planning and reasoning pipelines; context- and dependency-aware agent execution
  • Own services that power production agents: reliability, latency, and scaling improvements; observability integrity (logging, tracing, and evaluation hooks for offline and online evaluation)
  • Develop evaluation and feedback systems: define metrics for agent performance (offline and online); own evaluation harnesses and test suites; instrument systems to generate high-quality evaluation and training data
  • Contribute to post-training and model improvement: dataset generation (trajectory collection, preference data); fine-tuning (SFT, DPO, etc.) for modules where context engineering isn't enough; prompting and system design for better reasoning and context management

Benefits

  • Remote team: work from where you need to
  • Flexible paid time off: because you're an adult
  • Generous health insurance reimbursement through QSEHRA
  • Competitive salary
  • Equity packages
  • Company-performance bonus