Principal Engineer, CUDA UMD - GPU Kernel Scheduling

NVIDIA, Santa Clara, CA
$272,000 - $431,250

About The Position

NVIDIA is seeking a motivated system software engineer with deep understanding of device drivers and phenomenal C/C++ skills to work on the CUDA Driver. This role is integral to a team that delivers features and improvements to better realize the potential of NVIDIA hardware for a growing range of computational workloads, including deep learning, scientific computation, data science, self-driving cars, video games, and virtual reality. As a member of the team, you will use your design abilities, coding expertise, and creativity to deliver the best compute platform in the world, crafting elegant solutions to exciting problems and shaping the future direction of CUDA through collaboration with peers across NVIDIA.

Requirements

  • BS or MS degree in Computer Science, Electrical Engineering or related field (or equivalent experience)
  • Strong C and C++ programming skills
  • 15+ years of related development experience
  • Experience driving projects across multiple teams
  • Experience working with large codebases
  • Background with operating system interfaces for threads, process control, and virtual memory
  • Experience writing and debugging multithreaded programs
  • Good written communication as well as presentation skills

Nice To Haves

  • Prior experience with parallel computing, preferably writing CUDA programs or libraries that use CUDA
  • Understanding of system level architecture, such as interconnects, memory hierarchy, interrupts, and memory-mapped IO
  • Knowledge of memory coherence and consistency models
  • Background with kernel mode development
  • Experience with Linux Systems Software development as well as experience maintaining and extending programming models or higher-level language support for similar environments

Responsibilities

  • Evangelize, architect, and implement new features
  • Coordinate and drive development efforts across multiple teams
  • Help define forward-looking improvements to the CUDA APIs and programming model
  • Extend important CUDA programming models and functionality such as CUDA Graphs
  • Explore ways to use Graphs to make the scheduling of AI/ML workloads on our GPUs faster and more efficient
  • Write effective, maintainable, and well-tested code
  • Develop code for multiple operating systems

Benefits

  • Equity
  • Benefits