Senior GPU Performance Engineer
Wayve · Posted: April 21, 2023 · Onsite
About the position
The GPU Performance Engineer will be responsible for optimizing for the hardware on autonomous vehicles so that larger and more powerful models can be deployed on the road. This involves profiling, debugging, and optimizing model architecture for fast inference, as well as building a runtime environment that maximizes model capacity and speed. The ideal candidate has deep knowledge of GPU architectures, experience with PyTorch and CUDA, and knowledge of model optimization techniques such as pruning and quantization. The role offers competitive compensation and stock options, along with the opportunity to shape the future of autonomous driving.
Responsibilities
- Profiling, debugging, and optimizing model architecture for fast inference
- Rewriting core parts of the model architecture for maximum performance
- Building a runtime environment that maximizes the model capacity and speed
- Applying model optimization techniques to improve training time
- Staying current with developments in GPU computing and machine learning techniques
- Identifying, debugging, and fixing performance bottlenecks in models
Requirements
- Deep knowledge of GPU architectures and how to squeeze the most out of them
- Ability to identify, debug and fix performance bottlenecks in our models
- Experience with PyTorch and CUDA
- Strong C++ and Python ability
- Knowledge of model optimization techniques such as pruning and quantization
- Experience with GPU profiling and debugging tools such as NVIDIA Nsight and Intel VTune
- Knowledge of more recent technologies such as TorchInductor or Triton (desirable)
- A strong foundation in deep learning concepts (desirable)