Senior Systems Performance Engineer

NVIDIA
Santa Clara, CA
Onsite

About The Position

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing. NVIDIA is a “learning machine” that constantly evolves by adapting to new opportunities that are hard to solve, that only we can tackle, and that matter to the world. This is our life’s work, to amplify human imagination and intelligence. Make the choice to join us today.

We are now looking for a Senior Validation Engineer in the DGX Server Product Engineering Team. In this role you will work with a team of hardware and software engineers to develop and implement complex automated test plans for our industry-leading GPU-accelerated computing products.

Requirements

  • Ability to work on site in a hardware lab environment 5 days a week
  • BSEE or BSCE or equivalent experience
  • Experience with high-speed I/O validation
  • 5+ years of experience validating and debugging complex systems.
  • Experience developing and running real-world workloads and factory diagnostic tools
  • Dynamo, TensorRT, and Slurm skills are required.
  • Knowledge of vLLM and SGLang preferred.
  • Proficiency in CUDA, cuBLAS, and CUTLASS
  • Deep understanding of computing architectures.
  • Coding experience in Python and experience running simulators.

Nice To Haves

  • Experience with datacenter products including system management, security, networking, and storage.
  • Background with x86/Arm server architectures and accelerated GPU computing.
  • Track record of continuous process improvement with a passion for tools and automation.
  • Proven knowledge of circuit and waveform analysis

Responsibilities

  • Contribute to system architecture, design, performance modeling, and estimation across new models and new packages.
  • Collaborate with ODMs, component suppliers, and QA teams to ensure no gaps in coverage.
  • Drive GPU SKU bring-up, validation, and model enablement.
  • Develop system-level stress and performance testing strategies using industry-leading Deep Learning/AI applications.

Benefits

  • We have some of the most forward-thinking and talented people in the world working for us and, due to unprecedented growth, our world-class engineering teams are growing fast.
  • If you're a creative and autonomous engineer with real passion for technology, we want to hear from you!
  • Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package.
  • As you plan your future, see what we can offer to you and your family at www.nvidiabenefits.com/
  • Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
  • You will also be eligible for equity and benefits.