Research Scientist / Research Engineer

Collinear AI • Sunnyvale, CA

About The Position

About Collinear

At Collinear, we help teams fearlessly ship AI. Frontier labs and AI-native companies use our SimLab to find capability gaps in their agents and generate high-quality data to close them. We believe that the next generation of AI progress won't come from just bigger models, but from more rigorous, long-horizon simulation and programmatic verification. SimLab allows researchers to spin up realistic environments, run agents through complex tasks, and surface failure modes under real-world conditions. We then close the loop by generating targeted synthetic data to retrain models, delivering measurable quality lift on the metrics that actually matter.

About the Role

We are looking for Research Scientists and Research Engineers to help us build the data engine for frontier AI. In this role, you will bridge the gap between frontier research and production engineering. You will develop the high-fidelity environments and evaluation stacks that the world’s leading AI labs rely on to stress-test their most advanced agents. Your work will involve iterating on novel RL approaches and translating them into robust, scalable infrastructure that moves the needle on real-world model metrics.

Requirements

A Bachelor’s, Master’s, or PhD in a technical field (CS, Math, Physics, etc.), or a demonstrated "proof of work" through significant open-source contributions or industry experience.
A strong foundation in software engineering with the ability to build robust, scalable infrastructure. You should be comfortable in a Python-friendly, CLI-first development environment.
A principled understanding of foundation models, including how they are constructed, evaluated, and optimized.
Experience conducting research or technical experiments with a focus on reproducibility and data-driven results.

Nice To Haves

Research Taste: You have a strong intuition for identifying what matters in complex problem spaces. You can balance deep research exploration with the pragmatism needed to ship a product.
Impact-Driven Agency: You care about outcomes, not just activity. You don't wait for a ticket; you identify gaps in the system, build the solution, and ensure it moves real-world metrics for frontier AI labs.
Prior experience with Reinforcement Learning (RLHF/RLAIF), simulation systems, or building long-horizon agentic environments.
A history of contributing to influential ML research (e.g., publications at NeurIPS, ICLR, ICML) or maintaining high-impact open-source projects.
Experience fine-tuning or evaluating large-scale models to deliver "frontier performance" on open-source benchmarks.

Responsibilities

Build Agentic Environments: Design and implement the next generation of "SimLabs": ultra-realistic, long-horizon simulation environments where agents learn to navigate ambiguity and maintain context.
Programmatic Verification: Develop rigorous, policy-aware judges and evaluations that measure genuine capability and safety beyond simple benchmarks.
Close the Loop: Design and execute high-quality post-training runs (CPT, SFT, RL) to deliver frontier performance on open-source models using curated, high-signal data.
Rapid Iteration: Debug and iterate across the full ML stack, from infrastructure to model behavior, ensuring our tools remain "command-line first" and developer-friendly.
Collaborate: Work daily with the founders and research staff to shape the roadmap and push the state-of-the-art in AI reliability.