Technical Support Engineer

Voltage Park, Seattle, WA

About The Position

Voltage Park is your enterprise AI factory. We offer scalable compute power and on-demand and reserved bare metal AI infrastructure using NVIDIA GPUs, with world-class service, performance, and value. Founded with the mission of making AI computing accessible to all, we offer flexible, affordable GPU solutions that power everyone from builders to enterprises.

We’re looking for a Technical Support Engineer to join our customer experience team. In this role, you’ll take full command of high-impact incidents, lead real-time response efforts, and ensure clear communication both internally and directly with customers. You’ll be the person everyone counts on when something breaks.

As a Technical Support Engineer, you’ll operate at the intersection of engineering, data center operations, and customers. You’ll know how to work across engineering functions to reach resolution fast. You’re someone who can speak fluently to both C-suite executives and staff engineers, make customers feel heard, and push for the technical rigor needed to prevent repeat problems. You’re decisive, unflinching in the face of ambiguity, and driven to own outcomes end-to-end.

This is an on-site role, and you must be available to work from our Redmond, WA or San Francisco offices.

Requirements

  • Track record of managing customer escalations and technical comms across all levels, from execs to engineers
  • Proven ability to deliver complex systems or projects from 0 to 1
  • Willingness and ability to participate in weekend on-call rotation
  • Experience running or supporting infrastructure at scale (cloud, bare metal, or both)
  • 5+ years as a Senior Linux Systems Administrator, Infrastructure Support Engineer, or Data Center Operations Lead
  • Senior-level Linux system administration experience; able to operate confidently from the command line
  • Scripting experience in Bash, Python, or JavaScript
  • Experience diagnosing issues with distributed training workloads and GPUs
  • Familiarity with job schedulers like Slurm or Kubernetes

Nice To Haves

  • AI/ML infrastructure support experience — especially involving model training and orchestration
  • Experience with cloud support, data center operations, or startup environments
  • Strong documentation and process improvement skills
  • Project management experience across technical and non-technical teams

Responsibilities

  • Serve as Incident Commander during outages and service degradation, leading response efforts across engineering and customer experience
  • Own technical incidents from detection to resolution, driving urgency and accountability
  • Communicate clearly with internal stakeholders and customers, keeping everyone aligned and informed
  • Help implement long-term solutions to issues uncovered by root cause analysis
  • Develop tools, documentation, and processes to improve incident response and support quality
  • Partner closely with customers to understand their business, leveraging this knowledge to provide a personalized, consistent experience
  • Continuously look for ways to improve the support experience, both human and technical
  • Maintain on-call availability for urgent incidents — you’re ready to jump in when others need you most