About The Position

At Red Hat, we believe the future of AI is open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. We are seeking a highly motivated summer intern to join our Machine Learning Research Team. As an intern, you will work on cutting-edge AI inference and model optimization techniques and contribute to research and engineering efforts that make LLMs faster and more efficient. This is an exciting opportunity to gain hands-on experience in applied machine learning research while working with leading experts in the field.

Requirements

  • Currently pursuing a Ph.D. degree in Computer Science, Electrical Engineering, Machine Learning, or a related field.
  • Strong programming skills in C++, CUDA, and Python.
  • Experience with tensor math libraries such as PyTorch.
  • Familiarity with AI model optimization techniques such as quantization (e.g., INT4, FP8), pruning, and knowledge distillation.
  • Deep understanding and experience in GPU performance optimizations.
  • Excellent knowledge of large language model architectures.
  • Strong analytical and problem-solving skills.
  • Excellent communication skills and ability to work in a team-oriented research environment.
  • Background in efficient inference techniques for large-scale language models or computer vision models.
  • Prior experience contributing to open-source ML frameworks or research publications.

Nice To Haves

  • One or more co-authored papers at a top-tier conference such as NeurIPS, ICLR, ACL, CVPR, or MLSys is a big plus.

Responsibilities

  • Research and implement techniques for LLM inference and optimization.
  • Conduct experiments to evaluate the impact of optimization methods on model accuracy, latency, and throughput.
  • Collaborate with researchers and engineers to integrate optimizations into real-world machine learning workflows.
  • Document findings and contribute to technical reports, blog posts, or research publications.

Benefits

  • Hands-on experience with state-of-the-art AI inference optimization research.
  • Mentorship from leading experts in machine learning and model efficiency.
  • Opportunity to contribute to research papers, patents, or open-source projects.
  • Competitive stipend and potential for full-time opportunities.