Research Engineer, Frontier Safety Risk Assessment

DeepMind, San Francisco, CA
$136,000 - $245,000

About The Position

Snapshot

Artificial Intelligence could be one of humanity's most useful inventions. At Google DeepMind, we're a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority. Our team identifies, assesses, and mitigates potential catastrophic risks from current and future AI systems. As a member of technical staff, you will design, implement, and empirically validate approaches to assessing and managing catastrophic risk from current and future frontier AI systems.

About Us

The Risk Assessment team measures and assesses the possible risks posed by frontier systems, making sure that GDM knows the capabilities and propensities of frontier models so that adequate mitigations are in place. We also make sure that the mitigations do enough to manage the risks. But the risks posed by frontier systems are, themselves, unclear. Forecasting the possible risk pathways is challenging, as is designing and implementing sensors that could reliably detect emerging risks before we have real-world examples. We focus on building decision-relevant and trustworthy evaluation systems that prioritise compute and effort on the risk measurements with the highest value of information. We then need to be able to assess the extent to which proposed and implemented mitigations actually cover the identified risks, and to measure how successfully they generalise to novel settings.

The Risk Assessment team is part of Frontier Safety, which is responsible for measuring and managing severe potential risks from current and next-generation frontier models. Our approach is one of adaptively scaling risk assessment and mitigation processes to handle the near future. We are part of GDM's AGI Safety and Alignment Team, whose other members focus on research aimed at enabling systems further in the future to be aligned and safe, including interpretability, scalable oversight, control, and incentives.

The Role

We are seeking 2 Research Engineers for the Frontier Safety Risk Assessment team within the AGI Safety and Alignment Team. In this role, you will contribute novel research towards our ability to measure and assess risk from frontier models. This might include:

  • Identifying new risk pathways within current areas (loss of control, ML R&D, cyber, CBRN, harmful manipulation) or in new ones
  • Conceiving of, designing, and developing new ways to measure pre-mitigation and post-mitigation risk
  • Forecasting and scenario planning for future risks which are not yet material

Your work will involve complex conceptual thinking as well as engineering. You should be comfortable with research that is uncertain, under-constrained, and which does not have an achievable "right answer". You should also be skilled at engineering, especially using Python, and able to rapidly familiarise yourself with internal and external codebases. Lastly, you should be able to adapt to pragmatic constraints around compute and researcher time that require us to prioritise effort based on the value of information.

Although this job description is written for a Research Engineer, all members of this team are better thought of as members of technical staff. We expect everyone to contribute to the research as well as the engineering, and to be strong in both areas.
Success in the role will depend mostly on your general ability to assess and manage future risks rather than on specialist knowledge within the risk domains; insofar as specialist knowledge is helpful, expertise in ML R&D and loss of control is likely to be the most valuable.

Requirements

  • You have extensive research experience with deep learning and/or foundation models (for example, a PhD in machine learning).
  • You are adept at generating ideas, designing experiments, and implementing them in Python with real AI systems.
  • You are keen to address risks from foundation models, and have thought about how to do so.
  • You plan for your research to impact production systems on a timescale between “immediately” and “a few years”.
  • You are excited to work with strong contributors to make progress towards a shared ambitious goal.
  • You have strong, clear communication skills and are confident engaging technical stakeholders, sharing research insights tailored to their background.

Nice To Haves

  • Experience in areas such as frontier risk assessment and/or mitigations, safety, and alignment.
  • Engineering experience with LLM training and inference.
  • PhD in Computer Science or Machine Learning related field.
  • A track record of publications at venues such as NeurIPS, ICLR, ICML, RL/DL, EMNLP, AAAI and UAI.
  • Experience collaborating on or leading an applied research project.

Responsibilities

  • Identifying new risk pathways within current areas (loss of control, ML R&D, cyber, CBRN, harmful manipulation) or in new ones
  • Conceiving of, designing, and developing new ways to measure pre-mitigation and post-mitigation risk.
  • Forecasting and scenario planning for future risks which are not yet material.

Benefits

  • enhanced maternity, paternity, adoption, and shared parental leave
  • private medical and dental insurance for yourself and any dependents
  • flexible working options
  • healthy food
  • an on-site gym
  • faith rooms
  • terraces
  • a bespoke relocation service and immigration support for relocating candidates (depending on eligibility)

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Education Level: Ph.D. or professional degree
