About The Position

Vultr is seeking a highly skilled and experienced Senior Technical Product Manager to own the GPU Orchestration product line. This platform powers managed Kubernetes, managed Slurm, SUNK, and Run:ai integration for GPU-based AI and HPC workloads. The ideal candidate combines deep technical fluency in container orchestration, HPC scheduling, and distributed systems with strong product instincts for developer and operator platforms. This highly visible role at a high-growth technology company requires close partnership with the Infrastructure, Compute, Networking, and Platform teams to build a reliable, scalable, and cost-efficient orchestration platform. It is an opportunity to join a fast-growing team and make a significant impact on Vultr and the future of AI infrastructure.

Requirements

  • 7+ years of product management experience in cloud infrastructure, container orchestration, HPC, or developer platforms
  • Deep understanding of Kubernetes, Slurm, or similar orchestration and scheduling systems, including GPU scheduling, resource management, and multi-tenant isolation
  • Experience defining product strategy and roadmaps for platform or infrastructure products at scale
  • Strong technical background — ability to engage with engineering on cluster lifecycle, control plane reliability, API design, and distributed systems
  • Experience with AI/ML infrastructure, including training workloads, inference serving, and GPU resource optimization
  • Track record of shipping developer- and operator-facing products with measurable impact on reliability, adoption, or operational efficiency
  • Experience working across cross-functional teams (engineering, design, marketing, sales) in a fast-paced environment
  • Excellent written and verbal communication skills, with the ability to translate complex technical concepts for diverse audiences
  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)

Responsibilities

  • Define and execute the roadmap for managed Kubernetes, managed Slurm services, SUNK, and Run:ai integration
  • Own the end-to-end cluster lifecycle, including provisioning, configuration, upgrades, scaling, high availability, and decommissioning
  • Establish scheduling and resource management capabilities for GPU workloads, including quotas, fair-share policies, multi-tenant isolation, and priority handling
  • Drive integration between orchestration services and core infrastructure components, including networking, storage, identity, observability, and billing systems
  • Define service-level objectives for control plane reliability, job scheduling latency, cluster availability, and upgrade stability
  • Design APIs, CLI tooling, and UI workflows that enable self-service cluster management and workload operations
  • Partner with customer-facing teams to understand training, inference, and HPC use cases, translating real workload requirements into product capabilities
  • Monitor industry trends in container orchestration, HPC scheduling, distributed systems, and AI infrastructure to inform product direction

Benefits

  • 100% company-paid insurance premiums for employee medical, dental and vision plans
  • 401(k) plan that matches 100% up to 4%, with immediate vesting
  • Professional Development Reimbursement of $2,500 each year
  • 11 Holidays + Paid Time Off Accrual + Rollover Plan
  • Increased PTO at 3-year and 10-year anniversaries
  • 1 month paid sabbatical every 5 years
  • Anniversary Bonus each year
  • $500 stipend for remote office setup in first year + $400 each following year
  • Internet reimbursement up to $75 per month
  • Gym membership reimbursement up to $50 per month
  • Company paid Wellable subscription