About The Position

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology - and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are seeking an expert System Software Engineer with a strong background in Linux systems programming, systems administration, and technical support. This hybrid role combines software development, system-level troubleshooting, and direct support for internal or customer-facing environments. You will design and maintain Slurm code written in C, diagnose sophisticated system issues, and collaborate with other engineers to ensure Slurm runs reliably and efficiently. The ideal candidate is equally comfortable writing efficient, reliable, and maintainable code, analyzing system performance, and supporting production environments.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
  • 5+ years of professional experience in C development
  • Strong understanding of memory management, pointers, data structures, and algorithms
  • Experience with debugging tools such as GDB and performance profiling
  • Solid understanding of Linux kernel interfaces, system calls, and file systems, plus experience working with Automake
  • Understanding of software development lifecycles and agile methodologies
  • Strong problem-solving and analytical skills
  • Experience working in an environment with a focus on quality and reliability
  • Experience with containers and GPU technologies
  • Curious, self-motivated, and eager to learn new technologies

Nice To Haves

  • Experience with C and other low-level languages
  • Background in system administration or High Performance Computing
  • Experience with Slurm Workload Manager or other HPC scheduling systems
  • Knowledge of operating system internals or hardware-software interaction
  • Contributions to open-source C projects

Responsibilities

  • Maintain, improve, and optimize software components in C
  • Develop and maintain system-level and application-level code for Slurm, including networking, system, and device-level components
  • Debug and troubleshoot complex Slurm issues related to reliability and performance
  • Write clean, maintainable, and well-documented code that adheres to industry standards
  • Collaborate with cross-functional teams including Operations, Infrastructure, and Deployment
  • Provide direct technical support to internal teams or external customers
  • Develop automated tests to ensure software reliability and regression prevention
  • Stay current with best practices in C programming, compilers, build systems, and related technologies

Benefits

  • NVIDIA offers highly competitive salaries and a comprehensive benefits package.
  • As you plan your future, see what we can offer you and your family at www.nvidiabenefits.com
  • You will also be eligible for equity and benefits.