Research Intern, Quality & Safety - AI Safety

Zoom · Seattle, WA
$67 - $107 · Hybrid

About The Position

At Zoom, we're redefining how AI bridges Communications to Completions. We are the Responsible AI Team, dedicated to ensuring that advanced AI systems are safe, reliable, and aligned with human values. Our mission is to pioneer research and develop practical solutions that make AI systems trustworthy, controllable, and beneficial for society.

What You Can Expect

We are seeking exceptional PhD students to join our team as Research Interns. You will work on cutting-edge research to ensure the safety, reliability, and alignment of the world's most advanced AI systems. This internship offers the opportunity to actively contribute to both foundational research and real-world product impact.

Research Focus Areas

As an intern, you will contribute to one or more of the following interconnected research areas:

1. AI Agent System Quality Evaluation & Benchmarking
  • Design and develop comprehensive evaluation frameworks and benchmarks for state-of-the-art AI systems, including agentic AI and foundation models
  • Create novel metrics and methodologies to assess AI system capabilities, safety, and reliability
  • Build datasets and benchmarks for quality evaluation across diverse scenarios and modalities

2. Controllable & Aligned AI
  • Research methods to ensure AI systems remain controllable and aligned with human intentions and values
  • Develop techniques for robustness, interpretability, controllability, and ethicality (RICE principles)
  • Explore alignment training approaches including reinforcement learning from human feedback (RLHF) and related methodologies
  • Investigate methods to prevent agentic misalignment in autonomous AI systems

3. Red Teaming & Adversarial Robustness
  • Design and execute adversarial testing strategies to identify vulnerabilities in AI systems
  • Develop automated red teaming tools and frameworks to stress-test model safety controls
  • Research novel attack vectors including prompt injection, jailbreaking, and multimodal adversarial inputs
  • Contribute to identifying failure modes in both adversarial and benign user scenarios

4. AI Guardrail Solutions & Defense Systems
  • Build robust guardrail systems to defend against adversarial attacks and harmful outputs
  • Develop runtime monitoring and content moderation solutions
  • Create self-improving security systems that anticipate evolving threats
  • Research defense-in-depth strategies for comprehensive AI safety

Requirements

  • Currently enrolled in a PhD program in Machine Learning, Artificial Intelligence, Computer Science, or a related technical field
  • Have programming skills in Python and experience with ML frameworks (PyTorch, Hugging Face, vLLM, SGLang)
  • Possess a strong foundation in machine learning, deep learning, and/or NLP
  • Demonstrate ability to conduct independent research
  • Have strong written and verbal communication skills

Nice To Haves

  • Have publications in top-tier venues (NeurIPS, ICML, ICLR, ACL, EMNLP, FAccT, AIES, or similar)
  • Have prior research experience in AI safety, alignment, red teaming, adversarial robustness, or evaluation
  • Have experience with large language models (LLMs) or generative AI systems
  • Have familiarity with RLHF, constitutional AI, or other alignment methodologies
  • Have previous internship experience in ML/AI research
