Research Intern - AI Evaluation and Alignment

Microsoft•Redmond, WA

Onsite

About The Position

Research Internships at Microsoft provide a dynamic environment for research careers, with a network of world-class research labs led by globally recognized scientists and engineers who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment. The Microsoft Research and Copilot Studio teams are seeking Research Interns to help advance the quality, reliability, and evaluation of Large Language Model (LLM)-based systems. Research Interns will collaborate with applied scientists and engineers to explore new machine learning methods that improve how Artificial Intelligence (AI) systems assess and align with human expectations.

Requirements

Currently enrolled in a PhD program in Statistics, Computer Science, Physics, Operations Research, or a related technical field.
At least 1 year of hands-on experience working on LLM-related projects (e.g., prompt engineering, building and evaluating LLM-based systems, reward modeling).
At least 1 year of experience coding in Python.
Research Interns are expected to work on site at their manager’s Microsoft worksite for the duration of their internship.
In addition to meeting the qualifications listed here, you’ll need to submit a minimum of two reference letters for this position, as well as a cover letter and any relevant work or research samples. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance so they will be ready to submit your letter.
Please submit a list of the projects you worked on in the last 2 years. For each project, include the start and end dates, a brief overview of the project, what you did on it, and the technologies you used.
You can upload documents by going to your profile on the career site and clicking on the Resume Manager tab in the top right of the page, and from there selecting Other Documents.
If you are having trouble submitting your application, go to the bottom of the page, click "Support", and fill out the requested information.

Nice To Haves

Prior experience in reward models for large language models or LLM-as-a-Judge.
Strong experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and familiarity with software engineering best practices (e.g., version control with Git).
Experience with LLM post-training and evaluation or LLM-based judge systems.
Research experience demonstrated through publications or projects.
Ability to work independently in ambiguous or rapidly evolving situations and collaborate effectively across disciplines.

Responsibilities

Co-developing a research project in collaboration with the supervisor and research mentors.
Designing and implementing machine learning approaches, including training and fine-tuning using real-world datasets.
Developing evaluation frameworks and benchmarking methods to assess model quality, robustness, and generalization.
Presenting and communicating research findings.
