Research Engineer

Gamma • San Francisco, CA

Requirements

3+ years working with AI systems, with demonstrated experience shipping production-grade AI products
Deep hands-on experience with prompt engineering, LLM experimentation, and systematic evaluation of AI outputs
Strong experimental mindset with ability to design tests, analyze model performance, and iterate toward quality improvements
Experience post-training LLMs (RL, SFT, etc.)
Research-oriented approach to problem-solving; comfortable working in ambiguity and exploring novel solutions to AI quality challenges
Exceptional attention to detail and an obsession with quality; cares deeply about output quality across all dimensions, including less visible aspects
Bachelor's degree in Computer Science, ML, or related field (or equivalent hands-on experience with AI research/experimentation)

Responsibilities

Design and maintain evaluation frameworks that measure AI output quality across all experiences, developing metrics and benchmarks to assess model performance
Systematically improve production prompts through iterative experimentation—diagnosing failure patterns, crafting targeted improvements, and validating against quality benchmarks
Fine-tune models on targeted datasets to improve baseline performance (e.g., preventing poor layout choices, improving outline quality)
Conduct rigorous experiments to understand model behavior, analyze results, and derive insights that inform prompt and model improvements
Build tools and workflows to support rapid experimentation and quality analysis, enabling faster iteration on AI improvements
