Member of Technical Staff - Code Generation

Embedding VC
San Mateo, CA
21h
Onsite

About The Position

Introducing Moonlake, AI for creating real-time interactive content.

Mission: Applied AI Research Engineer for code agents (post-training + systems).

We are committed to being an on-site, in-person team, currently based in San Mateo.

Requirements

  • Has shipped agents that pass real repos' test suites end-to-end.
  • Has published papers in agentic systems/code gen.
  • Contributions to agent frameworks or OSS evals (LangGraph/AutoGen/Guidance/LEAP, SWE-bench variants, EvalPlus).
  • Built datasets from execution traces; can show improvements from data > params.

Responsibilities

  • Agentic systems design: Tool catalogs, function calling, program synthesis/repair loops, ReAct/Reflexion/ToT/LangGraph-style control, self-verification, sandboxed execution.
  • Evaluation mindset: Build task suites for multi-step coding.
  • Full-stack LLM engineer: Prompt libraries, routing, retrieval, KV-cache management, streaming, telemetry.
  • Security & isolation: Docker/firejail, network egress controls, secrets hygiene, dependency pinning, supply-chain sanity.
  • Strong post-training: Supervised fine-tuning + preference/trace RL (DPO/RLAIF/RLHF), dataset curation, reward shaping, safety filters.