The Red Hat Performance and Scale Engineering team is seeking a Senior Performance Engineer to join the Performance and Scale for AI Platforms (PSAP) team. In this role, you will help drive the performance and scalability of distributed inference for Large Language Models (LLMs) as part of the llm-d open source project.

Serving modern LLMs in production requires distributing models, computation, and requests across specialized hardware accelerators and multi-node environments. You will characterize, model, and optimize these systems to deliver industry-leading throughput, latency, and cost efficiency across Red Hat’s AI platforms.

We are looking for an engineer who is curious, adaptable, and excited to work at the intersection of distributed systems, performance engineering, and AI. You will join a highly collaborative, open source-driven team focused on advancing performance across Red Hat’s product and cloud services portfolio.

At Red Hat, open source principles guide how we build and innovate. We encourage teams to use AI thoughtfully to improve workflows, reduce complexity, and free up time for higher-impact work.
Job Type
Full-time
Career Level
Mid Level