Staff Software Engineer, AI/ML

DigitalOcean•Seattle, WA

$216,800 - $271,000•Hybrid

About The Position

Building AI agents that take real actions is the easy part. Building agents that get better over time is one of the hardest open problems in production AI today: agents that learn from feedback, correct mistakes, and optimize toward outcomes users actually care about. That's what this team works on.

As a Staff AI/ML Engineer on our Applied Research team, you'll own the technical direction for feedback-driven learning in DigitalOcean's agentic systems: reward modeling, preference optimization, reinforcement learning, and the evaluation infrastructure needed to measure whether any of it is actually working.

This is a senior IC role with broad technical scope. You'll set direction, run experiments at scale, and close the loop between user signals and model behavior, shipping research into production rather than just writing it up.

Requirements

8+ years of experience building production AI/ML systems — LLMs, GenAI, agentic systems, recommendation, search, personalization, or applied research at scale.
Hands-on experience improving AI systems through reinforcement learning, reward modeling, fine-tuning, human feedback, or preference optimization — with results you can point to.
Strong understanding of agentic AI: reasoning, planning, tool use, action execution, instruction following, and self-correction.
Strong software engineering skills in Python and at least one production systems language.
The judgment to balance model quality, product impact, latency, reliability, cost, and maintainability — and communicate those tradeoffs clearly.

Nice To Haves

Experience with agent evaluation, offline/online experiments, and human feedback loops in production.
Direct experience with RLHF, RLAIF, DPO, PPO, GRPO, or related optimization techniques.
Prior Staff, Senior Staff, Tech Lead, or equivalent senior IC experience.
Master's or PhD in CS, ML, AI, or a related field — or equivalent depth demonstrated through industry work.
Experience with production ML infrastructure: model serving, observability, data pipelines, feature stores, or experimentation platforms.
Research contributions via publications, patents, open-source work, or demonstrated applied research impact in RL, reward modeling, evaluation, or recommendation systems.

Responsibilities

Own the feedback learning roadmap
Define and execute the applied research agenda for feedback-driven agentic AI — from reward modeling and preference optimization to online learning and human feedback loops.
Translate user feedback, human evaluation data, and product signals into concrete training and optimization strategies.
Stay close to the research frontier on RLHF, RLAIF, DPO, PPO, GRPO, and related methods, and know when to apply them versus when simpler approaches win.

Build production learning systems
Design and implement learning loops that improve agent reasoning, planning, tool use, and action execution over time.
Build evaluation frameworks that measure what matters: reasoning quality, instruction following, task success, safety, and real user outcomes — at both offline and online scale.
Run large-scale experiments that connect model changes to measurable improvements in user experience and business impact.

Provide technical leadership
Set technical direction across modeling, experimentation strategy, evaluation design, and production readiness — without requiring direct management authority.
Partner closely with product, engineering, design, and research teams to move work from prototype to shipped capability.
Communicate complex AI systems clearly to both technical and non-technical stakeholders.