Founding AI Engineer
Ditto AI • Posted 3 months ago
Full-time • Entry Level
Berkeley, CA
1–10 employees

We’re looking for a self-starter who loves building new products in an iterative, fast-moving environment. As a Founding AI Engineer, you’ll report directly to the cofounders and work closely with product and engineering. You’ll bring our smartest matchmaking AI to life, design chat agents that feel human, and create internal tools that other AIs use to reason, retrieve, and act. This is an early, high-ownership role (<10 people on the team) where your decisions will define our agentic system’s foundations.

What you’ll do:

  • Ship agentic matchmaking from research to production: own the end-to-end loop (retrieval, reasoning, tool use, safety) and drive measurable accuracy improvements.
  • Build a prompt & model evaluation harness (offline + online) to compare prompts/models/policies, support A/B testing, and enable fast iteration (a minimal harness sketch follows this list).
  • Optimize AI chat systems for lower latency, higher perceived “human-likeness,” and more consistent outcomes across providers.
  • Design and maintain context engineering pipelines (RAG, memory, summarization, compression, grounding) for conversations and matchmaking.
  • Stand up observability for agents (traces, costs, failures, hallucinations, guardrails) and create dashboards that guide product decisions.
  • Collaborate daily with the cofounders and product to translate user problems into agent behaviors, experiments, and shipped features.
  • Write clear, maintainable code; create small internal tools and SDKs other engineers (and AIs) will use.
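
To give a concrete flavor of the evaluation-harness bullet above, here is a minimal offline sketch, not our actual stack: it assumes the `openai` Python SDK with an OPENAI_API_KEY in the environment, and the prompt variants, models, test cases, and keyword scorer are all illustrative placeholders.

```python
# Minimal offline eval harness sketch (illustrative only).
# Assumes the `openai` Python SDK and OPENAI_API_KEY in the environment;
# prompts, models, cases, and the scorer are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()

PROMPTS = {
    "v1_terse": "You are a matchmaking assistant. Answer briefly.",
    "v2_warm":  "You are a warm, human-sounding matchmaking assistant.",
}
MODELS = ["gpt-4o-mini", "gpt-4o"]
CASES = [  # (user message, keywords a good answer should contain)
    ("What makes two profiles a strong match?", ["shared", "interest"]),
    ("Suggest an icebreaker for two hikers.", ["hik"]),
]

def score(answer: str, keywords: list[str]) -> float:
    """Crude keyword-recall scorer; swap in an LLM judge or human labels."""
    hits = sum(1 for k in keywords if k.lower() in answer.lower())
    return hits / len(keywords)

# Sweep every (prompt variant, model) pair over the cases and report averages.
for prompt_name, system in PROMPTS.items():
    for model in MODELS:
        total = 0.0
        for user_msg, keywords in CASES:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "system", "content": system},
                          {"role": "user", "content": user_msg}],
            )
            total += score(resp.choices[0].message.content or "", keywords)
        print(f"{prompt_name:>8} | {model:<12} | avg={total / len(CASES):.2f}")
```

The same loop could run online behind an A/B flag, with traces and per-variant cost logged, which is how the offline and online halves of the harness stay comparable.
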
What we’re looking for:

  • 2–4+ years of relevant experience or a standout personal portfolio of agents/LLM apps: show us what you’ve built (GitHub, demos, write-ups).
  • Strong programming foundations (data structures, algorithms, testing, profiling).
  • TypeScript (product code, tools, services) and Python (model ops, evals, data) proficiency.
  • Experience building with multiple LLM providers and tool-calling/function-calling; comfortable swapping models and orchestrating fallbacks (see the fallback sketch after this list).
  • Hands-on with RAG (indexing, chunking, embeddings, reranking) and context engineering for reliability and cost/latency trade-offs (see the retrieval sketch after this list).
  • Practical prompt engineering and prompt libraries; can reason about failure modes and systematically improve prompts/policies.
  • Ability to define metrics/KPIs (accuracy, latency, cost, safety), run A/B tests, and loop in human feedback for quality.
  • Comfortable with MongoDB in production; familiarity with vector databases (e.g., pgvector/Redis/Pinecone/Weaviate) is a plus.
  • Nice to have: MCP (Model Context Protocol), agent frameworks (LangGraph/CrewAI/Assistants), LLM observability and evals tooling (e.g., Langfuse/Promptfoo/Ragas/TruLens), retrieval and embeddings know-how, and safety/guardrails/red-teaming experience.
  • Builder’s mindset: thrives with ambiguity, ships quickly, debugs systematically, and sweats the user experience.
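
On the multi-provider point above, a fallback loop can be sketched in a few lines. This sketch assumes providers exposing OpenAI-compatible endpoints; the provider labels, base URL, env var name, and model IDs are hypothetical placeholders, not our actual configuration.

```python
# Minimal provider-fallback sketch (illustrative only). Many providers expose
# OpenAI-compatible endpoints, so one client class can cover several of them.
# Base URL, env var name, and model IDs here are hypothetical placeholders.
import os
from openai import OpenAI

PROVIDERS = [
    # (label, client, model) tried in order; first success wins.
    ("primary", OpenAI(), "gpt-4o-mini"),
    ("fallback",
     OpenAI(base_url="https://api.example-llm.dev/v1",
            api_key=os.environ.get("FALLBACK_API_KEY", "unset")),
     "example-small-1"),
]

def chat_with_fallback(messages: list[dict]) -> str:
    last_err = None
    for label, client, model in PROVIDERS:
        try:
            resp = client.chat.completions.create(
                model=model, messages=messages, timeout=15,
            )
            return resp.choices[0].message.content or ""
        except Exception as err:  # rate limits, timeouts, outages...
            last_err = err
            print(f"provider {label!r} failed: {err!r}; trying next")
    raise RuntimeError("all providers failed") from last_err

print(chat_with_fallback([{"role": "user", "content": "Say hi in one word."}]))
```

Keeping the provider list as plain data is one way to make model swaps a one-line change rather than a refactor.
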
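And on the RAG point, here is a bare-bones retrieval sketch assuming the OpenAI embeddings API and numpy; the chunking parameters, embedding model chosen for this sketch, and the profiles.txt corpus are illustrative placeholders.

```python
# Minimal RAG retrieval sketch (illustrative only): fixed-size chunking,
# embedding via the OpenAI embeddings API, cosine-similarity ranking.
# Chunk sizes, the model, and the corpus file are hypothetical placeholders.
import numpy as np
from openai import OpenAI

client = OpenAI()

def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Naive character-window chunking; real pipelines split on structure."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

corpus = chunk(open("profiles.txt").read())   # placeholder source document
index = embed(corpus)                         # (n_chunks, dim) matrix
index /= np.linalg.norm(index, axis=1, keepdims=True)

def retrieve(query: str, k: int = 3) -> list[str]:
    q = embed([query])[0]
    q /= np.linalg.norm(q)
    top = np.argsort(index @ q)[::-1][:k]     # cosine similarity ranking
    return [corpus[i] for i in top]

for snippet in retrieve("outdoorsy, loves live music"):
    print("---", snippet[:80])
```

In production the in-memory matrix would give way to a vector store (pgvector, Redis, Pinecone, Weaviate) plus reranking, but the retrieval loop stays the same shape.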