Research Engineer, Preparedness - Meta Superintelligence Labs

Meta, Menlo Park, CA
$74 - $217,000

About The Position

Meta is seeking Research Engineers to join the Preparedness team within Meta Superintelligence Labs (MSL). The Preparedness team evaluates the increasing capabilities of our AI systems, with a focus on frontier AI capabilities and risks. We ensure that evaluations are in place to mitigate these risks and to responsibly handle the development of frontier AI.

As a Research Engineer on Preparedness, you will work alongside world-class AI researchers to develop new evaluations grounded in real-world threat models, maintain existing evaluations so they remain current and reliable, and produce written artifacts that Meta can trust during high-stakes launches. This is a highly technical role requiring the ability to solve machine learning and engineering problems with high reliability. The evaluations you build will directly inform risk assessments and launch decisions within MSL, making engineering reliability, rigor, and scalability paramount. You will excel by maintaining high velocity while adapting to rapidly shifting priorities as we advance the technical research frontier.

Preparedness is a highly interdisciplinary team, tightly connecting evaluations, internal red teaming, and mitigations for our frontier models. The evaluations you produce will be read and acted upon by Meta leadership during model launches and policy reviews.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 3+ years of experience in machine learning engineering, machine learning research, or a related technical role
  • Proficiency in Python and experience with ML frameworks
  • Experience identifying, designing, and completing medium-to-large technical features independently
  • Proven experience with software engineering practices, including version control, testing, and code review

Nice To Haves

  • Experience implementing or developing benchmarks for agentic large language models and multimodal models (e.g., vision-language, audio, video, browser agents)
  • Publications at peer-reviewed venues (NeurIPS, ICML, ICLR, ACL, EMNLP, or similar) related to language model evaluation, AI safety, or deep learning
  • Experience working with large-scale distributed systems and data pipelines
  • Experience in red-teaming AI systems, adversarial machine learning, or abuse prevention systems
  • Background in biology or chemistry, particularly chemical, biological, radiological, and nuclear (CBRN) risk domains and experience designing evaluations or threat assessments related to dual-use scientific knowledge
  • Background in cybersecurity, penetration testing, or security research, particularly as it relates to assessing AI-enabled cyber capabilities or designing mitigations for AI-assisted exploitation
  • Track record of open-source contributions to ML evaluation tools or benchmarks

Responsibilities

  • Build and continuously refine evaluations for multimodal and agentic frontier AI models, including in cybersecurity, chemical security, and biosecurity
  • Build robust, reusable evaluation pipelines that scale across multiple model lines and product areas
  • Produce auditable technical artifacts, including evaluation reports and model cards, at high reliability and speed
  • Scope and deliver end-to-end evaluations under ambiguous and rapidly shifting requirements, re-prioritizing as the threat landscape and Meta’s frontier models evolve
  • Work across research, engineering, policy, and legal teams to align evaluation priorities with launch timelines

Benefits

  • Bonus
  • Equity
  • Benefits


What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Number of Employees: 5,001-10,000
