About The Position

You'll help define an emerging area: how to find and neutralize the security risks that arise when agents act, plan, and use tools autonomously. This role is both research-heavy and engineering-heavy: you'll design experiments, build prototypes, fine-tune models, and pressure-test systems against adversarial behavior. You'll iterate quickly, learn from failures, and scale what works, while building the monitoring and evaluation infrastructure that makes progress measurable.

Requirements

  • MS or PhD in CS/ML (or equivalent research experience)
  • Have fine-tuned and evaluated models in practice, and can reason about data quality, overfitting, evals, and deployment constraints
  • Write strong production code and are comfortable owning the infrastructure that makes agentic evals run end-to-end
  • Care about reproducibility and instrumentation
  • Motivated by security problems and enjoy thinking like both builder and attacker
  • Reason about how capabilities combine into risk: not just individual vulnerabilities, but system-level attack surfaces across tool ecosystems
  • Communicate clearly, iterate fast, and hold a technical narrative from "hypothesis" to "shipped"

Nice To Haves

  • Enjoy working under uncertainty
  • No AI slop

Responsibilities

  • Define and validate threat models for agentic systems, identifying which tool characteristics must co-exist to enable data exfiltration and malicious state change, and how to break those combinations
  • Design and run experiments: create synthetic environments (such as file systems and tools), build task distributions that contain attack paths, and apply different attack strategies
  • Break agentic systems, both manually and with optimization algorithms such as RL
  • Design and improve static and dynamic analysis methods that automatically map tool capabilities to risk across diverse tool ecosystems, and make those methods scale
  • Turn research insights into product-facing capabilities: risk classification, automated guardrail generation, and quantitative threat measurement
  • Build measurement tools: eval harnesses, monitoring, dashboards, and feedback loops that quantify security outcomes
  • Build capability and regression evals
  • Optimize systems for real-world constraints (latency, cost, reliability) without losing scientific rigor

Benefits

  • Competitive salary + equity
  • Work at the forefront of AI security, helping define a new category
  • Fully funded team retreats every 8 weeks
  • Health insurance allowance for you and your dependents
  • Wellbeing, learning, and home office allowances
© 2026 Teal Labs, Inc