Red Team Reviewer

mpathic•Seattle, WA

Remote

About The Position

mpathic is looking for a full-time Red Team Reviewer, ideally a candidate with a strong background in LLM Red Teaming, to join our team. The role centers on a confidential initiative focused on AI safety protocols and mental-health policy implementation for large language models (LLMs). You will help design, perform, and review realistic conversational scenarios, red-team model behavior, identify behavioral edge cases, and ensure appropriate recognition of distress or risk in AI-driven interactions. You may also help develop novel psychometrics, rubrics, behavioral taxonomies, evaluation criteria, and qualitative analyses.

A strong commitment to safety, clinical ethics, and confidentiality is essential. This position is open to candidates without technical degrees or licensure who demonstrate commensurate experience working with LLMs and Red Teaming. This role will report to the Red Team Manager.

This role involves roleplaying and reviewing clinical scenarios with AI agents. As such, we are ideally seeking candidates who bring creative or performance-driven strengths, as these competencies enhance the realism, nuance, and emotional depth needed for AI safety testing. Examples of these can include, but are not limited to:

Theatre degrees or studies
Acting, theatre, improv, or voice-over experience
Strong writing skills, especially dialogue or scenario writing
Experience creating or inhabiting characters (e.g., performers, TTRPG roleplay, narrative designers)
Conversational design, interaction writing, or scripted roleplay experience
Participation in gaming, interactive storytelling, or digital communities where roleplay is common

Successful candidates are proactive, reliable, collaborative, and skilled at balancing independent problem-solving with appropriate escalation. Candidates are comfortable navigating ambiguity and building durable systems for onboarding, training, and shared learning as the team continues to grow. Consistency and communication are key at mpathic.

Requirements

Knowledge of LLM Red Teaming and risk/safety assessment
Demonstrated experience in creative writing, theatre, improv, acting, voice acting, or character-driven roleplay (optional, but preferred)
Interest in NLP, AI, ML, safety evaluation, or speech-signal processing
Strong understanding of mental-health ethics, boundaries, and responsible handling of sensitive data
Ability to telecommute and use Slack, LLM tools (training available), Google Workspace apps, and other remote-first productivity tools
Comfort with ambiguity, iteration, and emerging technology
Ability to give, take, and integrate constructive feedback
Must be willing to sign a comprehensive NDA, confidentiality agreements, and any other agreements that may be required by the end customer
Comfortable working with sensitive mental-health content in a high-impact area affecting billions of end users

Nice To Haves

Theatre degrees or studies
Acting, theatre, improv, or voice-over experience
Strong writing skills, especially dialogue or scenario writing
Experience creating or inhabiting characters (e.g., performers, TTRPG roleplay, narrative designers)
Conversational design, interaction writing, or scripted roleplay experience
Participation in gaming, interactive storytelling, or digital communities where roleplay is common
Deep experience with high-velocity online communities (e.g., Discord, Reddit, gaming spaces) and narrative roleplay environments that mirror real user interaction patterns
Background in trust & safety, content moderation, or policy development
Experience with AI/ML in clinical or healthcare settings
Experience with data classification, annotation, or qualitative analysis projects

Responsibilities

Review, design, and roleplay chat experiences with AI agents across diverse clinical and emotional scenarios
Provide feedback on roleplays with respect to characterization, realism, and AI model boundary testing
Assess AI model responses for potential risk/safety violations
Help clinicians implement feedback to improve quality of roleplay scenarios
Perform or simulate characters across ages, backgrounds, severity levels, and emotional states (spoken or written)
Collaborate with clinicians to provide a holistic review of AI chat experiences
Conduct qualitative analyses of conversations to derive taxonomies, personas, and behavioral patterns
Translate red team expertise into structured prompt patterns and evaluation rubrics
Maintain proactive, timely communication with the team, over-communicating when appropriate, and stay flexible in availability and hours based on project needs
Collaborate with engineering and research teams to define evaluation metrics for tone, realism, AI model behavior, and appropriateness
Identify and document failure cases, risk signals, and edge behaviors
Contribute to scenario modeling, red teaming, and rapid experimentation cycles
Ensure all work adheres to strict confidentiality agreements and NDAs
Implement quality-assurance protocols for conversation and behavioral analysis
Participate in review sessions with engineers, researchers, and clinical consultants, and hold office hours for onboarding and/or continued training of red teamers