Senior Software Engineer - vLLM Inference

Red River • Boston, MA

About The Position

At Red Hat, we believe the future of AI is open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. The Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI deployments. As leading developers and maintainers of the vLLM project, and inventors of state-of-the-art techniques for model compression, our team provides a stable platform for enterprises to build, optimize, and scale LLM deployments.

We are seeking an experienced Senior Software Engineer to work closely with our technical and research teams on vLLM, llm-compressor, speculators, and llm-d, to create DevOps and CI/CD infrastructure, and to scale our current technology stack. If you want to help solve challenging technical problems at the forefront of AI inference, this is the role for you! You would be joining the core team behind 2025's most popular open source project on GitHub.

In this role, your primary responsibility will be to build and release the Red Hat AI Inference Server, continuously improve the processes and tooling used by the DevOps team, and find opportunities to automate procedures and tasks. Join us in shaping the future of AI!

Requirements

2+ years of experience in MLOps, DevOps, automation, and/or modern software deployment practices
Experience with Release Engineering
Experience evaluating LLMs for performance and accuracy (think HellaSwag, MMLU, Chatbot Arena, TruthfulQA, etc.)
Strong proficiency in Python and pytest is a must
Strong experience with Git, GitHub Actions (including self-hosted runners), Buildkite, Terraform, Jenkins, Ansible, and/or other common technologies for automation and monitoring
Experience administering Kubernetes/OpenShift and/or Docker/Podman
Experience with Cloud Computing using at least one of the following Cloud infrastructures: AWS, GCP, Azure, or IBM Cloud
Familiarity with Agile development methodology
Solid troubleshooting skills
Ability to interact comfortably with the other members of a large, geographically dispersed team
Experience maintaining an infrastructure and ensuring stability
While a Bachelor’s degree or higher in computer science, mathematics, or a related discipline is valued, we prioritize technical prowess, initiative, problem solving, and practical experience

Nice To Haves

Experience contributing to the vLLM community, particularly its CI, is a big plus

Responsibilities

Collaborate with research and product development teams to scale machine learning products for internal and external applications
Actively contribute to managing and releasing upstream and midstream product builds
Test to ensure correctness, responsiveness, and efficiency
Troubleshoot, debug, and upgrade Dev & Test pipelines
Identify and deploy cybersecurity measures by continuously performing vulnerability assessments and risk management
Collaborate with a cross-functional team on market requirements and best practices
Keep abreast of the latest technologies and standards in the field

Benefits

Comprehensive medical, dental, and vision coverage
Flexible Spending Account - healthcare and dependent care
Health Savings Account - high deductible medical plan
Retirement 401(k) with employer match
Paid time off and holidays
Paid parental leave plans for all new parents
Leave benefits including disability, paid family medical leave, and paid military leave
Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!
