You are a hands-on engineer who builds the software and processes that keep a large fleet of GPU servers healthy and productive. You write systems and tooling for managing thousands of servers, including provisioning, health monitoring, error detection, and recovery. When something breaks that automation can’t fix, you drive resolution with partners.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed