We are now looking for a Senior Software Engineer for AI Resiliency. At NVIDIA, we are pushing the boundaries of what's possible in AI. We are currently seeking a Senior Software Engineer to lead the development of AI software resiliency for the most powerful AI supercomputers in the world. As a member of our AI Software Resiliency team, you will play a pivotal role in defining and implementing critical resiliency features for AI supercomputers at a scale of 100,000+ GPUs. Your expertise will be crucial in driving down cluster downtime towards zero, ensuring that our AI systems remain robust and reliable at all times.
Stand Out From the Crowd
Upload your resume and get instant feedback on how well it matches this job.
Career Level
Senior
Industry
Computer and Electronic Product Manufacturing
Education Level
Bachelor's degree
Number of Employees
5,001-10,000 employees