About The Position

At Red Hat, we believe the future of AI is open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. We are seeking a highly motivated summer intern to join our Machine Learning Research Team. As an intern, you will work on cutting-edge AI inference and model optimization techniques and contribute to research and engineering efforts that make LLMs faster and more efficient. This is an exciting opportunity to gain hands-on experience in applied machine learning research while working with leading experts in the field.

Requirements

  • Currently pursuing a Ph.D. degree in Computer Science, Electrical Engineering, Machine Learning, or a related field.
  • Strong programming skills in C++, CUDA, and Python.
  • Experience with tensor math libraries such as PyTorch.
  • Familiarity with AI model optimization techniques such as quantization (e.g., INT4, FP8), pruning, and knowledge distillation.
  • Deep understanding and experience in GPU performance optimizations.
  • Excellent knowledge of large language model architectures.
  • Strong analytical and problem-solving skills.
  • Excellent communication skills and ability to work in a team-oriented research environment.
  • Background in efficient inference techniques for large-scale language models or computer vision models.
  • Prior experience contributing to open-source ML frameworks or research publications.

Nice To Haves

  • One or more co-authored papers at a top-tier conference such as NeurIPS, ICLR, ACL, CVPR, or MLSys is a big plus.

Responsibilities

  • Research and implement techniques for LLM inference and optimization.
  • Conduct experiments to evaluate the impact of optimization methods on model accuracy, latency, and throughput.
  • Collaborate with researchers and engineers to integrate optimizations into real-world machine learning workflows.
  • Document findings and contribute to technical reports, blog posts, or research publications.

Benefits

  • Hands-on experience with state-of-the-art AI inference optimization research.
  • Mentorship from leading experts in machine learning and model efficiency.
  • Opportunity to contribute to research papers, patents, or open-source projects.
  • Competitive stipend and potential for full-time opportunities.