R&D Principal Software Engineer

Broadcom•Austin, TX

1d•$127,100 - $226,000

About The Position

Broadcom is a global leader in semiconductor and infrastructure software solutions. As part of our commitment to innovation and excellence, our VMware subsidiary is dedicated to shaping the future of virtualization technology. We are seeking talented individuals to join the GPU Virtualization Team, which is responsible for integrating GPUs into the ESXi operating system and accelerating AI/ML and graphics applications running inside virtual machines. The GPU Virtualization Team is part of the VMware Cloud Foundation (VCF) Division, which delivers readily deployable, easily managed GPU solutions that unleash the power of heterogeneous computing for modern applications.

Requirements

Bachelor's degree in Computer Science or a related field and 12+ years of related experience, or a Master's degree and 10+ years of related experience.
5+ years of experience in ML framework/runtime development, GPU/XPU backend engineering.
Strong understanding of and direct experience with ML frameworks (PyTorch, JAX) and graph/ML compiler technologies (e.g., OpenXLA).
Experience with C++ and Python programming languages.
Strong problem-solving skills and ability to troubleshoot complex issues.
Excellent communication and collaboration skills.
Experience with version control systems such as Git.
Ability to thrive in a fast-paced and dynamic work environment.
Familiarity with enterprise coding standards and best practices.
Must have legal authorization to work in the US.

Nice To Haves

Experience with inference servers such as vLLM or Triton.
Experience with low-level GPU kernel development and writing custom kernels (e.g., CUDA, ROCm, or similar).

Responsibilities

Research, design, and develop the AI Virtualization Stack for our ESXi server product.
Implement and optimize PyTorch and JAX backends using the OpenXLA framework to ensure high-performance AI/ML workload execution across GPUs and XPUs.
Analyze and re-architect performance-critical sections of the ML acceleration code, focusing on optimization techniques for LLM inference such as KV-caching and FlashAttention.
Troubleshoot and address bugs related to AI/ML acceleration functionality.
Deliver software that meets the coding guidelines and quality standards set by the VCF Division.
Develop and maintain technical documentation for delivered features.
Work closely with the larger team, including the virtual driver and device teams, as well as external GPU/XPU vendors, to provide end-to-end support for ML frameworks.
Stay up-to-date with the latest GPU/XPU hardware architecture and AI/ML compiler technologies.