Research Engineer

Gamma
San Francisco, CA
$180,000 - $340,000 | Onsite

About The Position

You'll own the quality of AI across everything Gamma creates. As our Research Engineer, you'll design evaluation frameworks that measure AI output quality, systematically improve production prompts, and fine-tune models to ensure millions of users get exceptional results every time they generate content. This role sits at the intersection of research rigor and product impact.

You'll diagnose failure patterns in AI-generated presentations, docs, and websites, then craft targeted improvements through iterative experimentation. You'll build the tools and workflows that enable rapid testing, validate changes against quality benchmarks, and ensure our AI gets smarter with every iteration. You'll succeed here if you combine deep technical expertise with a research-oriented mindset, comfort working in ambiguity, and an attention to detail that catches dimensions of quality others might overlook.

Our team has a strong in-office culture and works in person 4–5 days per week in San Francisco. We love working together to stay creative and connected, with flexibility to work from home when focus matters most.

Requirements

  • 2+ years working with AI systems, with demonstrated experience shipping production-grade AI products
  • Deep hands-on experience with prompt engineering, LLM experimentation, and systematic evaluation of AI outputs
  • Strong experimental mindset with the ability to design tests, analyze model performance, and iterate toward measurable quality improvements in ambiguous problem spaces
  • Experience with post-training techniques for LLMs including reinforcement learning and supervised fine-tuning
  • Exceptional attention to detail and genuine quality obsession, with care for output quality across all dimensions including less visible aspects
  • Bachelor's degree in Computer Science, Machine Learning, or a related field, or equivalent hands-on experience with AI research and experimentation

Responsibilities

  • Design and maintain evaluation frameworks that measure AI output quality across all Gamma experiences, developing metrics and benchmarks to assess model performance
  • Systematically improve production prompts through iterative experimentation, diagnosing failure patterns, crafting targeted improvements, and validating against quality benchmarks
  • Conduct rigorous experiments to understand model behavior, analyze results, and derive insights that inform prompt and model improvements
  • Build tools and workflows to support rapid experimentation and quality analysis, enabling faster iteration on AI improvements
  • Fine-tune models on targeted datasets to improve baseline performance, preventing issues like poor layout choices or low-quality outlines
  • Partner with product and engineering teams to ensure AI quality improvements ship quickly and work reliably at scale