Luma AI runs on thousands of H100 GPUs across multiple providers and clusters for Training, Data Processing and Inference. Working with the Infrastructure and Research teams, the Staff Software Engineer - Reliability maintains the health of our GPU clusters, developing the monitoring and management tools necessary to maximize their performance.
Stand Out From the Crowd
Upload your resume and get instant feedback on how well it matches this job.
Job Type
Full-time
Career Level
Mid Level
Industry
Publishing Industries
Number of Employees
51-100 employees