About The Position

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA operates one of the largest GPU computing fleets in the world, powering AI workloads across NVIDIA Cloud Partners and enterprise customers. Our Kubernetes AI Platform team is responsible for the platform layer that makes these GPU clusters operational at scale: cluster provisioning, runtime publication, self-service operations, and workload orchestration.

We are looking for a product manager who wants to pioneer new platform technologies for AI infrastructure. In this role, you will identify gaps in how customers and partners operate GPU clusters at scale, develop new platform projects to fill them, and bring those innovations to market. You will also help NVIDIA develop its internal Kubernetes platform that powers innovation across the company.

Have you built platform technologies that changed how teams operate infrastructure? Are you excited about developing new open-source projects at the intersection of Kubernetes and AI? We’d love to hear from you!

What We Need to See: In this role, we will count on you to drive innovation for our partners, customers, and internal teams. Here is what that looks like:

Requirements

  • 12+ years of product management experience in Kubernetes platform engineering, cloud infrastructure, or GPU-accelerated compute environments.
  • Experience shipping Kubernetes platform products for hardware-aware compute environments.
  • Deep understanding of Kubernetes architecture: API server, scheduler, controller patterns, CRDs, device plugins, and operator frameworks.
  • Experience developing or leading open-source projects in the cloud-native or infrastructure space.
  • Track record defining multi-quarter strategy and leading execution with multiple engineering teams.
  • Experience working with cloud service providers or platform partners in a delivery or enablement capacity.
  • Bachelor's degree or equivalent experience in Business, Engineering, Computer Science, or a related field.

Nice To Haves

  • Crafting or leading open-source projects that gained meaningful community adoption.
  • AI/ML runtime lifecycle management, container image pipelines, or OCI distribution.
  • GPU scheduling, topology-aware placement, or multi-tenant GPU cluster management.
  • HPC workload orchestration in production environments.
  • Shipping platform products that partners or third-party operators depend on.
  • Contributions to Kubernetes SIGs, CNCF projects, or GPU-related open-source work.

Responsibilities

  • Identify gaps in how customers and partners operate GPU clusters at scale and develop new projects to address their needs.
  • Bring internal platform innovations to market as open-source software projects, including community strategy, contribution models, and ecosystem engagement.
  • Own the product roadmap for AI runtime generation, testing, packaging, and publication across cloud partners and deployment targets.
  • Drive platform-level cluster provisioning and lifecycle management across NVIDIA Cloud Partners and enterprise environments.
  • Own our self-service cluster operations surface: the APIs, control planes, and automation that let customers provision, upgrade, and run clusters independently.
  • Work directly with cloud partners and operators to translate their operational requirements into platform capabilities.
  • Partner with engineering on the architecture and delivery of Kubernetes operators, controllers, and platform services that support GPU-aware cluster behavior.

Benefits

  • NVIDIA offers highly competitive salaries and a comprehensive benefits package.
  • As you plan your future, see what we can offer you and your family at www.nvidiabenefits.com/.
  • You will also be eligible for equity and benefits.