The REDWOOD project, funded by DOE ASCR, studies the resilience of scientific payloads in heterogeneous distributed computing environments. It spans scientific domains such as HEP, NP, astro-particle physics, and fusion science, as well as other data-intensive physics and astrophysics projects.

The work program has three main thrusts:

1) Maintenance, development, support, and exploitation of CGSim, a simulation tool for distributed computing systems. This includes expanding CGSim to incorporate more grid sites and using it to test AI-based job scheduling algorithms.

2) Demonstrating AI tools, specifically large language model systems such as AskPanDA, for monitoring distributed computing systems by providing natural language feedback to users. This involves generating database queries and using Model Context Protocol (MCP) technology to access large language models (LLMs). The monitoring system will be demonstrated on the CGSim simulation of distributed workflow management systems.

3) Supporting synergistic activities at Brookhaven for AI-based simulation of workflow management systems and studies of AI-based tools for resilient workflow management, under the ModSim track of the REDWOOD project. This thrust also studies the impact on novel workflow scheduling algorithms.

The successful candidate will focus on the first and third thrusts: Modeling and Complex Systems simulation. Disseminating software and research findings through journals and professional meetings is an integral part of this work program.
Job Type: Full-time
Career Level: Entry Level
Education Level: Ph.D. or professional degree
Number of Employees: 501-1,000