We are seeking a Principal GenAI Inference Optimization Engineer to join our Models and Applications team. This role focuses on improving the performance, efficiency, and scalability of generative AI inference workloads on AMD GPU platforms. Working across the software-hardware stack, you will help optimize latency, throughput, and cost efficiency for real-world deployments of large-scale models.
Job Type: Full-time
Career Level: Principal
Education Level: None listed
Number of Employees: 5,001-10,000