As a Technical Program Manager (TPM) for AI Infrastructure Operations, you will be the operational backbone of our high-scale AI and High-Performance Computing (HPC) environment. You will drive complex, cross-functional programs that ensure the stability, availability, and growth of our GPU fleet and InfiniBand network fabrics. This role requires a blend of deep technical understanding, rigorous program management, and a relentless focus on delivering against key operational metrics (SLAs, uptime, availability). You will bridge the gap between engineering execution and strategic business goals, directly impacting our ability to serve customer workloads at scale.
Job Type: Full-time
Career Level: Principal