Head of Infrastructure, Stealth Edge AI Co

Montauk Capital
New York, NY
Hybrid

About The Position

We’re building the automation, orchestration, and monitoring layer that unifies disparate metro edge GPU nodes into a single software-managed compute platform. You’ll own the definition, design, and execution of our hardware and infrastructure buildout as we scale: edge data center requirements, GPU selection, supply chain, technical implementation, operational maintenance, and deployment. Starting from the foundational groundwork, you’ll turn our roadmap into production-scale compute for AI inference. You’ll ensure our GPU clusters are highly available and deliver on customer requirements, and you’ll be the hands-on expert for the hardware side of the business. Most importantly, you’ll turn high-level plans into real technical execution, and you’ll play a key role in supply chain decisions about the infrastructure we deploy, scale, and support.

Requirements

  • Strong infrastructure engineering experience and systems-level technical judgment
  • Experience deploying or managing compute infrastructure in real-world environments
  • Experience with data center, hardware, or GPU-based systems implementation
  • Experience owning GPU provisioning, hardware selection, and systems configuration
  • GPU scheduling and orchestration specifics: GPU type awareness, memory management, topology considerations, placement strategies for multi-GPU jobs, and fragmentation minimization
  • Bare-metal provisioning lifecycle: IPMI/Redfish, BMC-based remote management, PXE boot, and automated OS deployment workflows
  • On-board storage
  • Observability stack: distributed configuration and troubleshooting, plus monitoring, alerting, and tracing
  • Deployment planning, hardware configuration, and operational troubleshooting
  • Linux systems depth: RHEL/Ubuntu, low-level troubleshooting, shell scripting
  • Security and operational best practices for bare metal
  • Deployment tooling at production scale
  • Networking fundamentals for inference workloads and out-of-band (OOB) management
  • Startup / 0→1 DNA: You ship fast and communicate clearly.
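To give a flavor of the GPU scheduling and fragmentation-minimization work listed above, here is a minimal, illustrative sketch of fragmentation-aware placement for a multi-GPU job; the node names, data shapes, and best-fit heuristic are our own assumptions for illustration, not the company's actual stack:

```python
# Toy fragmentation-aware GPU placement (illustrative sketch only).
# Best-fit heuristic: place the job on the node with the FEWEST free
# GPUs that still fits it, preserving large free blocks for big jobs.

def place_job(nodes, gpus_needed):
    """nodes: dict mapping node name -> free GPU count.
    Returns the chosen node name, or None if no node fits the job."""
    candidates = [(free, name) for name, free in nodes.items()
                  if free >= gpus_needed]
    if not candidates:
        return None  # job cannot be placed on any single node
    free, name = min(candidates)  # tightest fit wins
    nodes[name] -= gpus_needed    # reserve the GPUs
    return name

nodes = {"edge-a": 8, "edge-b": 4, "edge-c": 2}
print(place_job(nodes, 4))  # best fit picks edge-b, keeping edge-a whole
print(place_job(nodes, 8))  # edge-a's 8-GPU block is still intact
```

A naive first-fit scheduler would put the 4-GPU job on edge-a and then fail to place the 8-GPU job anywhere, which is exactly the fragmentation this heuristic avoids; a production scheduler would additionally weigh GPU type, memory, and NVLink/PCIe topology.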

Responsibilities

  • Own GPU infrastructure design and implementation details from planning through deployment
  • Own hardware selection, configuration, and deployment across early compute infrastructure
  • Help turn early technical groundwork into a functioning deployed system
  • Own the GPU roadmap we use to attract customers and build partnerships
  • Deploy, operate, and tune GPU clusters for both bare metal and our internal software stack
  • Own resilient networking implementation from each site to the cluster, including a robust OOB network for constant monitoring and management
  • Manage deployments at production scale
  • Interface with site ops on power, cooling, and connectivity
  • Build the automation and monitoring stack for distributed edge nodes
  • Own the supply chain for all infrastructure gear
  • Manage third-party hardware vendors on provisioning, maintenance, and break-fix support

Benefits

  • Competitive compensation + equity: True ownership over what you build
© 2026 Teal Labs, Inc