Senior Research Engineer, Olmo + Molmo

The Allen Institute for Artificial Intelligence • Seattle, WA

25d • Onsite

About The Position

As a Research Engineer on the team, you'll be a core member responsible for training Ai2's flagship open models (e.g., Olmo, Molmo, and beyond). From system design to experiment release, you'll own end-to-end delivery while collaborating closely with research and engineering colleagues to push the boundaries of open model research.

We are a non-profit AI institute focused on developing foundational AI research and innovation to deliver real-world positive impact through large-scale open models, data, and artifacts (e.g., Olmo, Tulu, Molmo, FlexOlmo). Balancing academic freedom with corporate-level scale, Ai2 is uniquely resourced and positioned to deliver high-impact, truly open research. Our team unites the best and brightest scientific and engineering minds to explore the potential of truly open AI. Through our efforts, including the pioneering Olmo and Molmo releases, we endeavor to empower academics, researchers, and AI developers more broadly to advance the science of language models, multimodal models, and generative AI.

If you are passionate about advancing the science of AI through open, rigorous research and believe in accessible AI for the common good, we want to hear from you!

Requirements

4+ years of ML infrastructure experience — data preprocessing, model training, evaluation, inference, and deployment
Experience with end-to-end model development — dataset construction, training, fine-tuning, evaluation, profiling, and monitoring
Familiarity with modern model architectures, including LLMs (MoEs, long-context models) and vision-language models (e.g., Molmo, LLaVA), with experience training and evaluating both
Agentic systems knowledge — tools, memory, and long-running workflows
Strong software engineering fundamentals — performant, scalable systems and confident debugging
Proficiency in Python and a major ML framework (PyTorch, JAX, or TensorFlow), with the flexibility to pick up new tools as needed
Familiarity with cloud and containerization (e.g., GCP, AWS, Docker)
Strong communication and collaboration skills — we're a small, close-knit team and work best when everyone's pulling in the same direction
BS or MSc in Computer Science, Statistics, Engineering, Applied Mathematics, or a related quantitative field (or equivalent experience)
A minimum of 2 years of software development experience (or equivalent experience)

Responsibilities

Building and optimizing infrastructure for LLM, multimodal, and agentic research — including training/inference pipelines, dataset curation, and large-scale preprocessing
Designing, training, and evaluating multimodal models (vision + language) and agentic workflows, including tool use, planning, and long-horizon tasks
Scoping and leading research projects, prioritizing experiments for highest impact
Bringing strong software engineering practices to a research environment and bridging cutting-edge work to production-quality products
Contributing to and supporting the open-source community through model releases, datasets, public APIs, and technical reports