AI Red Teamer, Cybersecurity

Handshake • Seattle, WA

2d • $65 - $125

About The Position

As a Cybersecurity Red Teamer, you will evaluate whether AI models can be manipulated into generating functional malware, viable exploit code, attack tooling, or step-by-step operational guidance that would give a threat actor meaningful uplift in carrying out cyberattacks. Your job is to find the gaps between what a model's safety guardrails are supposed to block and what a skilled adversary can actually extract.

This role requires you to think like an attacker who happens to have access to a very capable AI assistant. You will craft adversarial prompts and multi-turn interaction chains that simulate how real threat actors, from script kiddies to APT operators, might use LLMs to accelerate reconnaissance, weaponization, exploitation, lateral movement, persistence, and exfiltration. You will then evaluate whether the model's output is genuinely dangerous or just surface-level noise.

Deep cybersecurity expertise is essential. The value you bring is the ability to look at a model-generated payload, exploit chain, or attack plan and assess whether it would actually work, how much refinement it would need, and what class of attacker it would meaningfully help.

Requirements

Professional experience in offensive security, penetration testing, red teaming, vulnerability research, malware analysis, threat intelligence, or incident response
Ability to read, write, and evaluate code across common languages used in offensive tooling (Python, PowerShell, Bash, C/C++, JavaScript, or similar)
Understanding of common attack frameworks and of attacker tactics, techniques, and procedures (MITRE ATT&CK, OWASP, etc.)
Ability to assess the functional correctness and real-world exploitability of model-generated technical output
Strong hands-on experience using multiple LLMs (ChatGPT, Claude, Gemini, open-source models, etc.)
Creative, adversarial problem-solving skills
Clear and precise written communication, including the ability to explain technical risk to non-specialist audiences
Strong ethical judgment and the ability to separate adversarial thinking from personal values
Self-directed, collaborative, and comfortable in feedback-heavy environments

Nice To Haves

Relevant certifications (OSCP, OSCE, GPEN, GXPN, CRTO, CRTL, CEH, or similar)
Active or prior security clearance
Experience with exploit development, reverse engineering, or binary analysis
Background in cloud security, container security, or infrastructure-as-code attack surfaces
Familiarity with AI/ML-specific attack surfaces (prompt injection, model extraction, training data poisoning, adversarial examples)
Experience building or operating C2 frameworks, custom implants, or offensive tooling
Bug bounty track record or published CVEs
Prior work in trust and safety, content moderation, or AI evaluation
Familiarity with LLM APIs or evaluation tooling

Responsibilities

Design technically grounded adversarial prompts that test whether models provide meaningful uplift across the cyber kill chain (reconnaissance through exfiltration and impact)
Evaluate model-generated code and technical output for functional correctness, assessing whether outputs represent real exploits, plausible attack tooling, or non-functional noise
Test model behavior across offensive categories including malware generation, vulnerability exploitation, social engineering content, credential harvesting, privilege escalation, C2 infrastructure setup, and data exfiltration techniques
Probe dual-use boundaries, testing how models handle queries that blend legitimate security research, penetration testing, and defensive operations with offensive applications
Simulate attacker personas at varying skill levels (opportunistic, intermediate, advanced/APT) to assess how model risk scales with user sophistication
Test multi-step and multi-turn attack chains, including scenarios where early turns establish benign context before pivoting to malicious requests
Score model responses against structured harm taxonomies and severity rubrics calibrated to real-world exploitability
Document findings with clear technical reasoning, including what a response gets right, what it gets wrong, and what level of attacker it would realistically assist
Contribute to the development and refinement of cybersecurity-specific evaluation frameworks and threat models
Collaborate with other red teamers, AI researchers, and policy teams to translate findings into actionable model improvements
Stay current on evolving TTPs, CVEs, jailbreak techniques, and the intersection of AI and offensive security