Member of Technical Staff - Code Generation

Embedding VC•San Mateo, CA

38d•Onsite

About The Position

Introducing Moonlake, AI for creating real-time interactive content.

Mission: As an applied AI research engineer, you will work on code agents (post-training + systems).

We are committed to being an on-site, in-person team, currently based in San Mateo.

Requirements

Has shipped agents that pass real repos' test suites end-to-end.
Has published papers in agentic systems/code gen.
Contributions to agent frameworks or OSS evals (LangGraph/AutoGen/Guidance/LEAP, SWE-bench variants, EvalPlus).
Built datasets from execution traces; can show improvements from data > params.

Responsibilities

Agentic systems design: Tool catalogs, function calling, program synthesis/repair loops, ReAct/Reflexion/ToT/LangGraph-style control, self-verification, sandboxed execution.
Evaluation mindset: Build task suites for multi-step coding.
Full-stack LLM engineer: Prompt libraries, routing, retrieval, KV-cache management, streaming, telemetry.
Security & isolation: Docker/firejail, network egress controls, secrets hygiene, dependency pinning, supply-chain sanity.
Strong post-training: Supervised fine-tuning + preference/trace RL (DPO/RLAIF/RLHF), dataset curation, reward shaping, safety filters.
