SGeMM: NVIDIA's Most Important Function
28 Nov 2024, 4 min read
Matrix Multiplication is probably the algorithm of the 21st Century.

What is SGeMM
28 Nov 2024, 8 min read
SGeMM stands for Single-Precision General Matrix Multiplication. Let's analyze matrix multiplication on a CPU and a GPU.

Step 1: Getting Started with CUDA Programming
28 Nov 2024, 12 min read
Parallel matrix multiplication using CUDA C++.

Step 2: GPU Global Memory Coalescing
28 Nov 2024, 11 min read
Memory coalescing is the most crucial concept in GPU programming. With matrix multiplication, we can get upwards of 7x improvement.

Step 3: GPU Shared Memory
28 Nov 2024, 11 min read
Tiled matrix multiplication using GPU shared memory.
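
For readers new to the series, it may help to see the operation the posts above optimize. The following is a minimal CPU sketch, not code from the posts: it assumes the standard BLAS definition of SGEMM, C = alpha * A * B + beta * C, with row-major storage. The name `sgemm_reference` and its parameter order are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the posts): CPU reference for
// C = alpha * A * B + beta * C, using row-major storage.
// A is M x K, B is K x N, C is M x N.
void sgemm_reference(std::size_t M, std::size_t N, std::size_t K,
                     float alpha,
                     const std::vector<float>& A,
                     const std::vector<float>& B,
                     float beta,
                     std::vector<float>& C) {
    // Guard against mismatched buffer sizes.
    assert(A.size() == M * K);
    assert(B.size() == K * N);
    assert(C.size() == M * N);

    for (std::size_t row = 0; row < M; ++row) {
        for (std::size_t col = 0; col < N; ++col) {
            // Dot product of row `row` of A with column `col` of B.
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[row * K + k] * B[k * N + col];
            }
            // Scale the product and blend in the existing C value.
            C[row * N + col] = alpha * acc + beta * C[row * N + col];
        }
    }
}
```

Each (row, col) iteration of the two outer loops is independent of the others. That independence is what makes the computation a natural fit for a GPU, where a straightforward kernel could assign one output element to one thread.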