Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. This is a project-based opportunity, not permanent employment.

You will design computational materials science problems that challenge a frontier AI model. Each problem must have an answer that can be verified by code and must require a specialized tool such as ObsPy, instaseis, pyrocko, MITgcm, or flopy/MODFLOW. Generic data wrangling around synthesized toy data will not suffice. Each problem runs inside a sealed Linux container with the tool pre-installed and a programmatic judge that grades the model's answer.

As an expert author, you will pick an anchor tool and design a problem that hinges on its specific functionality. You will write a Python reference solution, supply the input files, and set the numerical answer and its acceptable tolerance.

You will then calibrate the problem against batches of parallel runs of the model, tuning the difficulty until the agent succeeds in only a small share of attempts, with a target pass rate of 10-30%. Calibration may require rewriting scenarios, tightening parameters, and observing agent behavior. After review and approval, the task is passed to a senior reviewer in your subfield for final quality assurance.

This experience will deepen your command of the anchor tool and give you practical insight into how AI models navigate complex problems.
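To make the grading and calibration idea concrete, here is a minimal Python sketch of a tolerance-based judge and a pass-rate check over one batch of runs. It is an illustration only, not Mindrift's actual judge: the function names, the relative-tolerance rule, and the sample numbers are all assumptions. Only the 10-30% target band comes from the description above.

```python
import math

def judge(answer: float, reference: float, rel_tol: float = 1e-2) -> bool:
    """Hypothetical judge: pass if the answer is within rel_tol of the reference."""
    return math.isclose(answer, reference, rel_tol=rel_tol)

def pass_rate(answers, reference, rel_tol=1e-2):
    """Fraction of runs in one batch whose answers the judge accepts."""
    results = [judge(a, reference, rel_tol) for a in answers]
    return sum(results) / len(results)

def in_target_band(rate, low=0.10, high=0.30):
    """True if the pass rate falls inside the 10-30% calibration target."""
    return low <= rate <= high

# Made-up batch of ten model answers against a made-up reference value.
reference_value = 100.0
batch = [100.5, 99.2, 102.0, 97.0, 110.0, 50.0, 0.0, 130.0, 95.0, 104.0]
rate = pass_rate(batch, reference_value)
print(f"pass rate: {rate:.0%}, within target band: {in_target_band(rate)}")
```

In this sketch, a pass rate above the band would suggest the problem is too easy and a rate below it would suggest it is too hard or underspecified. The tolerance itself is a design choice that depends on the numerical stability of the anchor tool.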
Job Type: Part-time
Career Level: Entry Level
Education Level: Associate degree