Senior Technical Program Manager - Capacity

NVIDIA
Santa Clara, CA
Hybrid

About The Position

Hardware Infrastructure is seeking a Technical Program Manager (TPM) to lead Infrastructure Capacity Management programs and workstreams. Because this infrastructure directly supports our near-term and long-term chip roadmap, it must be highly reliable, performant, and efficient for our internal users. This is a fast-paced, evolving landscape that requires a TPM who can guide engineering roadmaps to high-quality outcomes built on a strong foundation of operational excellence. The TPM will partner internally within Hardware Infrastructure and externally with senior management and our HW Engineering partners to manage capacity operations and scale processes that support our next stage of growth. They will also develop and standardize the planning, reporting, and execution methodologies and metrics needed to meet challenging objectives.

Requirements

  • B.S. (or equivalent experience) in Electrical Engineering, Computer Science or a related technical field
  • 12+ years of proven experience across Capacity Engineering, Capacity Management and/or Technical Program Management roles within the Capacity space
  • Experience working with large-scale infrastructure spanning various CPU/GPU architectures, both on-premises and in the cloud
  • Exceptional communication and presentation skills for diverse technical and non-technical audiences
  • Proactive in identifying and implementing positive changes in both system and process design in a fast-paced environment

Nice To Haves

  • Deep experience leading Capacity Operations & Management workstreams
  • Prior experience procuring Data Center Hardware and Public Cloud Services

Responsibilities

  • Own end-to-end capacity management strategy and execution for EDA Farm, including server procurement, vendor negotiations, server capacity allocation, and data center space & power, while delivering measurable efficiency and cost optimization across the organization
  • Identify and help drive implementation of improvements to EDA Farm infrastructure tooling, automation, and workflows that accelerate server provisioning, reduce manual overhead, and scale capacity management operations
  • Drive capacity and procurement initiatives using agile program methodology, aligning planning, prioritization, and delivery across engineering, procurement, and vendor partner teams
  • Build and maintain a data-driven capacity model, using metrics and business objectives to improve farm utilization, procurement performance, and vendor SLA consistency, and turn the resulting insights into actionable cost and capacity optimizations
  • Create clear, consistent communication channels that give customers at every level real-time insight into farm capacity health, procurement timelines, supply chain risks, and mitigation plans
  • Act as a primary technical and strategic partner between engineering, procurement, finance, and hardware vendors to ensure EDA Farm capacity optimally meets the demands of design and verification teams