About The Position

Adobe is redefining creativity through generative AI. Our Firefly family of models powers commercially safe content generation across Creative Cloud, Experience Cloud, and Adobe Express, reaching millions of customers worldwide. As we expand into novel AI capabilities, including agentic workflows, multi-model orchestration, and autonomous creative agents, we need a leader who can scale the evaluation and safety guardrails that protect our customers, our brand, and creative communities.

We are looking for a Senior AI Evaluation Specialist to lead the IP Safety & Agentic Quality Evaluation practice within our Firefly Scientific Eval team. The role begins with ownership of safety evaluations for proprietary content, ensuring that generative imagery does not infringe copyrights, trademarks, publicity rights, or other protected assets. From there, it expands to designing evaluation frameworks for novel AI features such as agentic systems, multi-step content pipelines, and AI-assisted creative workflows.

Requirements

  • 5+ years of experience in AI/ML evaluation, trust & safety, content‑moderation systems, responsible AI, or a closely related technical domain.
  • Deep expertise in evaluation methodology design: benchmark creation, annotation frameworks, inter‑rater reliability, precision/recall analysis, and failure‑mode taxonomies.
  • Strong working knowledge of intellectual property concepts as they apply to generative AI — including copyright, trademark, likeness rights, and fair‑use considerations.
  • Experience with multimodal ML systems (vision, language, audio, or video) and an understanding of how generative model architectures produce outputs.
  • Demonstrated ability to partner with legal, policy, and product teams and translate complex regulatory or legal requirements into actionable technical evaluation plans.
  • Excellent communication skills with the ability to present technical risk assessments to executive leadership and cross‑functional partners.
  • MS or PhD in Computer Science, Machine Learning, Information Science, or a related field, or equivalent practical experience.

Nice To Haves

  • Experience building evaluation or safety systems specifically for generative AI (image, video, or multimodal models).
  • Hands‑on experience with AI agent evaluation, agentic workflow safety testing, or autonomous‑system verification.
  • Background in large‑scale data annotation operations, human‑in‑the‑loop evaluation pipelines, or dataset curation for safety.
  • Familiarity with content‑authenticity technologies such as C2PA, Content Credentials, or digital‑provenance systems.
  • Familiarity with regulatory frameworks relevant to AI safety (EU AI Act, NIST AI RMF, ISO/IEC 42001).
  • Research contributions or publications in responsible AI, AI safety, fairness, or IP protection in generative systems.
  • Prior experience at a creative‑tools company, media organization, or platform with IP‑sensitive content at scale.

Responsibilities

IP Safety Evaluation Leadership

  • Own the end-to-end evaluation strategy for intellectual property safety across all Firefly models and surfaces, including image, video, audio, vector, and 3D generation.
  • Design, implement, and iterate on scalable evaluation frameworks that detect potential IP infringement — including copyrighted works, trademarks, recognizable characters, brand logos, and likeness/publicity rights.
  • Partner with Adobe’s Legal, AI Ethics, and Content Authenticity teams to translate legal and policy requirements into measurable evaluation criteria and test suites.
  • Establish benchmarks and quality gates that generative models must pass before shipping, maintaining Adobe’s industry‑leading IP indemnification commitment.
Extending to Novel AI Feature Evaluation

  • Architect evaluation methodologies for emerging AI capabilities beyond image generation — including agentic workflows, autonomous multi‑step creative pipelines, AI‑powered content orchestration, and agent‑to‑agent interactions.
  • Define safety and quality evaluation criteria for agentic systems: scope adherence, action‑boundary enforcement, hallucination detection, unintended side‑effect monitoring, and graceful failure modes.
  • Build proactive red‑teaming and adversarial testing programs that stress‑test new AI features for misuse, prompt injection, and safety edge cases before they reach customers.
  • Collaborate with Product, Engineering, and Research to embed evaluation checkpoints into the AI feature development lifecycle — from prototype through GA release.
  • Develop and maintain dashboards, scorecards, and reporting systems that provide executive insight into AI safety posture across the Firefly portfolio.

Benefits

  • At Adobe, you will be immersed in an exceptional work environment that is recognized around the world.
  • You will also be surrounded by colleagues who are committed to helping each other grow through our unique Check-In approach where ongoing feedback flows freely.
  • If you’re looking to make an impact, Adobe's the place for you.
  • Discover what our employees are saying about their career experiences on the Adobe Life blog and explore the meaningful benefits we offer.


What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Number of Employees: 5,001-10,000 employees
