GPU-Accelerated Matrix Multiplication

This is an implementation of GPU-accelerated matrix multiplication in CUDA C/C++. To make things interesting, let us try to match the performance of NVIDIA cuBLAS.

What is SGeMM

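SGeMM stands for Single-Precision General Matrix Multiplication: the BLAS operation C = alpha * A * B + beta * C on 32-bit floating-point matrices, where alpha and beta are scalars. Let's analyze how matrix multiplication runs on a CPU and on a GPU.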
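As a baseline for that comparison, here is a minimal sketch of SGeMM on the CPU, computing C = alpha * A * B + beta * C with three nested loops. The function name `sgemm_cpu`, the use of `std::vector`, and the row-major layout are assumptions made for illustration; they are not taken from this project or from cuBLAS (which uses column-major storage).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference SGeMM on the CPU: C = alpha * (A * B) + beta * C.
// A is M x K, B is K x N, C is M x N. Row-major storage is an assumption.
void sgemm_cpu(std::size_t M, std::size_t N, std::size_t K,
               float alpha, const std::vector<float>& A,
               const std::vector<float>& B,
               float beta, std::vector<float>& C) {
    // Check that the buffers match the stated dimensions.
    assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);

    for (std::size_t i = 0; i < M; ++i) {          // row of C
        for (std::size_t j = 0; j < N; ++j) {      // column of C
            float acc = 0.0f;
            // Dot product of row i of A with column j of B.
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[k * N + j];
            }
            C[i * N + j] = alpha * acc + beta * C[i * N + j];
        }
    }
}
```

Each of the M * N output elements needs K multiply-adds, so the loop nest performs roughly 2 * M * N * K floating-point operations. Every output element can be computed independently of the others, which is what makes the problem a natural fit for the GPU.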