About The Position

NVIDIA has been a leader in computer graphics, PC gaming, and accelerated computing for over 25 years, driven by innovation and talented individuals. The company is now leveraging AI to define the next era of computing, with GPUs acting as the brains for computers, robots, and self-driving cars that can understand the world. This role is part of a team that builds solutions to optimize all layers of the CUDA ecosystem, aiming for class-leading speedups in modern high-performance workloads and models. The Senior Software Engineer will architect and implement highly scalable solutions for various use-cases, develop new innovative workflows, and work on compiler- or runtime-driven solutions to accelerate critical workloads, generate optimal code patterns, and address high-impact AI challenges. The position involves close collaboration with internal NVIDIA software and hardware teams to integrate the latest developments into NVIDIA products.

Requirements

  • Bachelor's degree in Computer Science, Electrical Engineering, or related field (or equivalent experience)
  • 6+ years of industry or academic experience in software engineering, compilers, and developer tools, including exposure to building comprehensive optimization frameworks and hands-on experience with product environments
  • Strong knowledge of compilers, code generation, and GPU architecture
  • Experience with GPU programming and performance optimization (CUDA or equivalent)
  • Extensive Python programming skills, along with software engineering fundamentals
  • Basic programming skills in other languages such as C/C++, Racket, or Rust
  • Strong mathematical and scientific foundation relevant to optimization heuristics/algorithms, ML and data science
  • Track record developing and productizing software, optimization frameworks and/or developer tooling

Nice To Haves

  • Familiarity with genetic/evolutionary algorithms, predictive modeling, and complex systems
  • Deep expertise in GPU performance optimizations, evidenced by benchmark wins or published results
  • Hands-on experience building compilers or compiler components using the LLVM framework, including optimization passes and code generation
  • Familiarity with NVIDIA and open-source compiler technologies such as LLVM, MLIR, PTX, and OpenAI Triton
  • Experience with data science projects, specifically with MLOps workflows and tools such as W&B and MLflow

Responsibilities

  • Design and build high-performance optimization frameworks for the entire CUDA ecosystem
  • Co-design novel solutions with software, hardware and algorithm teams; influence and adopt new capabilities as they become available
  • Develop reproducible, high-fidelity evaluation frameworks covering performance, quality and developer productivity
  • Collaborate across the AI stack — from hardware through compilers/toolchains, kernels/libraries, frameworks, distributed training, and inference/serving

Benefits

  • Competitive salaries
  • Generous benefits package
  • Equity