METR is looking for an infrastructure engineer to manage our cloud services, notably the deployment of the open source LLM eval tooling Inspect and our cloud-native wrapper Hawk.

About METR

METR is a non-profit that conducts empirical research to determine whether frontier AI models pose a significant threat to humanity. It is robustly good for civilization to have a clear understanding of what types of danger AI systems pose, and to know how high the risk is. You can learn more about our goals from our published talks (overall goals, recent update).

Some highlights of our work so far:

Establishing autonomous replication evals: Thanks to our work, it’s now taken for granted that autonomous replication (the ability for a model to independently copy itself to different servers, obtain more GPUs, etc.) should be tested for.

Pre-release evaluations: We’ve worked with OpenAI and Anthropic to evaluate their models pre-release, and our research has been widely cited by policymakers, AI labs, and within government.

Inspiring lab evaluation efforts: Multiple leading AI companies are building their own internal evaluation teams, inspired by our work.

Early commitments from labs: The safety frameworks of Google DeepMind, OpenAI, and Anthropic all credit or endorse our work in developing responsible scaling policies.

We have been mentioned by the UK government, Time Magazine, and others. We’re sufficiently connected to relevant parties (labs, governments, and academia) that any good work we do or insights we uncover can quickly be leveraged.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed