VP, AI Infrastructure - Highrise.ai

Hut 8
San Francisco, CA
Remote

About The Position

At Highrise.ai, we recognize that the shift from traditional software to AI represents one of the most significant technological transformations of our era. Our mission is to accelerate this shift across industries by providing the cutting-edge infrastructure needed to build, scale, and deploy AI at an unprecedented level. Our products empower the development and operation of the world’s most advanced large language models (LLMs), generative models, and computer vision models. By offering highly optimized GPU infrastructure, we enable organizations to unlock the full potential of AI, pushing the boundaries of what’s possible in fields such as natural language processing, image recognition, and beyond. We are committed to driving innovation and efficiency in AI infrastructure, making it easier for companies to transition from concept to production at scale, with the reliability and performance necessary to stay ahead in a rapidly evolving technological landscape.

Hut 8 is scaling our GPU platform that integrates power, colocation, and compute into a single, operationally owned stack. Customers don’t ask whether we have power or GPUs; they ask who actually runs the operation. This role exists to be that answer.

Hut 8 leadership owns strategy, capital, and customer relationships. You own execution. You are accountable for taking raw, power-backed infrastructure and turning it into production-ready, SLA-backed GPU capacity at scale.

This is not a lab role or a traditional enterprise ops role. It is enterprise-grade operational discipline applied to hyperscale infrastructure, built on direct control of power, facilities, and compute.

Requirements

  • You are a senior infrastructure operator who has scaled real systems, not just designed them.
  • 10+ years in large-scale infrastructure or hyperscale data center operations
  • 5+ years operating GPU-accelerated and/or HPC environments
  • Direct experience deploying and operating 10,000+ GPUs in managed, production settings
  • Deep expertise in RDMA networking (InfiniBand and/or RoCE)
  • Proven ownership of 99.9%+ uptime and customer-facing SLAs
  • Hands-on operator mindset — you drive issues from detection → root cause → resolution
  • Track record of building repeatable deployment and commissioning playbooks
  • Experience leading teams of 20–50+ across field ops, deployment, and infrastructure
  • Senior leadership background (Director / Senior Director / VP) at a recognized operator

Nice To Haves

  • Direct working relationships with NVIDIA and/or AMD
  • Experience with H100, H200, GB200, and/or MI300X platforms
  • Multi-site, parallel deployment experience
  • Background spanning greenfield buildouts and steady-state hyperscale operations

Responsibilities

  • Run the full operational layer between Hut 8’s facilities and the customer:
      • Deployment, commissioning, and production sign-off
      • Performance, uptime, and SLA ownership
      • Incident response, escalation, and root cause resolution
      • OEM, vendor, and hardware lifecycle management
      • Scaling operations from ~1,100 GPUs to 20,000+ GPUs
  • Accountable for end-to-end managed GPU operations, including:
      • RDMA networking (InfiniBand and/or RoCE)
      • Multi-tenant and single-tenant production environments
      • 99.9%+ availability targets
      • Repeatable, auditable commissioning processes
      • Enterprise readiness layered onto hyperscale infrastructure
  • Coordinate tightly with Hut 8 teams on:
      • Power, cooling, rack density, and facility readiness
      • Deployment sequencing and capacity expansion
      • Operating constraints driven by energy, thermal, and site realities
  • Build and lead the operational organization:
      • Field operations, deployment, and infra ops teams
      • Hiring, structure, and escalation paths as scale ramps
      • Clear ownership across sites and customers
  • Run a disciplined, hands-on operating rhythm:
      • Daily standups on deployment progress, risks, and blockers
      • Direct oversight of GPU failures, RDMA performance, and thermal issues
      • Production sign-off for each deployment tranche
      • Weekly capacity and readiness planning with facilities
      • Monthly OEM and vendor performance reviews
      • Quarterly planning for expansion, refresh cycles, and new platforms

Benefits

  • Hut 8 offers a benefits and wellness program that includes medical, dental, vision, life, and short-term and long-term disability insurance, as well as paid time off.