Research Fellowship - Open Endedness

Vmax•San Francisco, CA

Hybrid

About The Position

Vmax is an applied research lab focused on developing AI capable of open-ended learning. The lab aims to build systems that can exceed human capabilities by optimizing beyond the local maxima of learning from human expertise. A key area of focus is agents that can discover their own objectives in the world. This fellowship is designed for PhD students or equivalent early-career researchers interested in working on Large Language Models (LLMs) that can learn in open-ended settings. Fellows will lead a focused research project, collaborate with Vmax technical staff, and contribute to research publications. The fellowship typically lasts 3 to 6 months.

Requirements

Currently enrolled in a PhD program in machine learning, computer science, artificial intelligence, computational neuroscience, mathematics, or a related technical field. Exceptional candidates with equivalent research experience may also be considered.
Track record of research excellence or strong research promise, demonstrated through publications, preprints, open-source work, technical projects, competitions, or publicly available artifacts.
Working understanding of reinforcement learning.
Familiarity with unsupervised/automated environment design, asymmetric self-play, and/or intrinsic motivation.
Strong programming ability in Python.
Experience with at least one major ML framework such as PyTorch or JAX.
Clear written and verbal communication of technical ideas.

Nice To Haves

Experience with LLM post-training methods.
Experience with scalable ML experimentation, distributed training, experiment tracking, or reproducible research infrastructure.
Demonstrated taste for identifying non-obvious research directions and converting them into tractable experiments.

Responsibilities

Develop Reinforcement Learning (RL) methods for agents that can discover useful objectives, tasks, and curricula without relying solely on human-specified rewards.
Design systems for open-ended learning, including unsupervised/automated environment design, asymmetric self-play, and intrinsic motivation.
Build training loops where agents learn from interaction, exploration, novelty, competence progress, self-generated challenges, or other non-standard reward signals.
Investigate methods for agents to avoid collapsing into trivial, degenerate, or easily exploitable objectives.
Own and develop a research agenda within Vmax, from identifying promising research directions to executing experiments and communicating results.
