Research Aide - MCS - Zhu, Yue - 3.17.26.

Argonne National Laboratory, Lemont, IL
2d · $31 - $47

About The Position

SR-APPFL is a scalable and resilient Argonne Federated Learning platform that features efficient and accurate modeling and simulation toolkits for federated learning systems. As part of this effort, we have developed FedDES, a discrete-event performance simulation framework for large-scale federated learning systems, and PACER, a userspace network rate controller in MPI with adaptive compression for parallel applications. With FedDES and PACER, we can perform large-scale simulations of federated learning workflows, providing an efficient platform for studying system performance and resilience. In this project, the student will characterize the SR-APPFL platform by running real-world scientific applications and AI workloads on it. The AI tasks to be evaluated may include AI-for-science applications such as PowerGrid and SmartMeter. The overall workflow will be systematically analyzed to identify performance bottlenecks across computation, communication, and data movement. The insights gained from this characterization will be used to optimize system performance and improve the efficiency of AI applications running on the platform. The expected deliverables include optimized software implementations and publications in top-tier HPC conferences such as IPDPS, ICS, and HPDC.

Requirements

  • The entirety of the appointment must be conducted within the United States.
  • Applicants must be: Currently enrolled in undergraduate or graduate studies at an accredited institution; Graduated from an accredited institution within the past 3 months; or Actively enrolled in a graduate program at an accredited institution.
  • Must be 18 years or older at the time the appointment begins.
  • Must possess a cumulative GPA of 3.0 on a 4.0 scale.
  • If accepting an offer, candidates may be required to complete pre-employment drug testing based on appointment length. All students remain subject to applicable drug testing policies.
  • Must complete a satisfactory background check.

Responsibilities

  • Characterize the SR-APPFL platform by running real-world scientific applications and AI workloads on it.
  • Evaluate AI tasks, including AI-for-science applications such as PowerGrid and SmartMeter.
  • Systematically analyze the overall workflow to identify performance bottlenecks across computation, communication, and data movement.
  • Optimize system performance and improve the efficiency of AI applications running on the platform.
  • Develop optimized software implementations.
  • Publish findings in top-tier HPC conferences such as IPDPS, ICS, and HPDC.

Benefits

  • Comprehensive benefits are part of the total rewards package.

What This Job Offers

  • Job Type: Part-time
  • Career Level: Intern
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000
