Cowboy Space Corp. is building the infrastructure to power and connect the orbital economy. Our satellites operate in Low Earth Orbit to collect sunlight and enable a new class of capabilities: powering on-orbit compute, transmitting energy via infrared lasers (space-to-earth and space-to-space), and delivering secure, high-bandwidth optical data. By rethinking how energy and data are generated and distributed in space, we’re unlocking entirely new ways to operate both in orbit and on Earth.
Founded in 2024 by Baiju Bhatt (co-founder of Robinhood), Cowboy Space Corp. is backed by leading investors and built by a team from top aerospace and defense organizations. We’re moving quickly to solve complex technical challenges and build a new category of space infrastructure.
The Role
Deploying high-performance GPU compute in Low Earth Orbit introduces a fundamentally different fault landscape from ground-based datacenter operation. This role sits at the frontier of that problem. When a fault occurs 500 km above Earth, the system must detect it, classify it, contain it, and recover from it autonomously.
You will own the end-to-end RAS (reliability, availability, and serviceability) validation strategy for GPU server systems, working directly with GPU and HBM silicon partners to analyze failures, characterize fault propagation paths, and ensure detection and recovery mechanisms function correctly. The right candidate combines deep knowledge of processor and memory architecture with hands-on system-level validation experience and the ability to drive partner engagements to resolution.
This role is located in San Carlos or Seattle.
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed