Luma's mission is to build multimodal AI to expand human imagination and capabilities. This requires a massive, reliable, and performant GPU infrastructure that pushes the boundaries of scale. Our SRE team is the foundation of our research and product velocity, responsible for the thousands of NVIDIA and AMD GPUs across multiple providers that power our work. This is not a typical cloud SRE role. We are looking for a hands-on, first-principles engineer who is fluent in Linux and comfortable operating close to the metal. You will build, maintain, and scale Luma's GPU infrastructure, working directly on on-prem and multi-vendor cloud clusters. You'll solve complex systems problems, ensure reliability through clear SLOs/SLIs, and build automation that allows us to operate at unprecedented scale with a lean team.
Job Type: Full-time
Career Level: Mid Level
Industry: Publishing Industries
Education Level: No Education Listed
Number of Employees: 51-100 employees