My One-Month Journey into GPU Programming

For the past month, I have been on an intense and rewarding journey into the world of GPU programming. Starting from scratch, I dedicated myself to learning the intricacies of parallel computation and to harnessing the power of GPUs for a wide range of applications. This blog post chronicles my experiences, challenges, and key takeaways from this exciting adventure.

The Initial Spark

My motivation stemmed from a desire to delve into the heart of modern computing—understanding how to leverage the massive parallelism offered by GPUs. With the guidance of my mentor (hkproj) and inspiration from fellow learners like 1y33, I committed to a 100-day challenge, pushing myself to code and learn something new every single day.

Learning CUDA and Expanding to AMD

While CUDA was my primary tool, I also focused on ensuring compatibility with AMD GPUs. This broadened my understanding of different GPU architectures and programming models. By Day 30, I had optimized multiple kernels for AMD hardware, testing them on an AMD MI250 with 128 cores per node and 1TB of memory.
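Writing kernels in plain CUDA C++ keeps them portable: HIP's tooling (hipify-perl or hipify-clang) translates the runtime API calls so the same source builds under ROCm for AMD hardware. As an illustrative sketch (this vector-add is my own minimal example, not one of the kernels from the challenge):

```cuda
#include <cstdio>
#include <cuda_runtime.h>  // hipify rewrites this to <hip/hip_runtime.h> for ROCm

// Element-wise vector add: the same source compiles with nvcc for NVIDIA,
// and hipify + hipcc handle the AMD side.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    vecAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```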

Some key kernel implementations included:

This hands-on experience provided valuable insights into optimizing performance across different GPU architectures.

Challenges and Triumphs

Like any learning journey, this one wasn’t without hurdles. Some of the biggest challenges included:

However, the triumphs made it all worthwhile:

Key Projects and Learnings

Throughout the month, I tackled a variety of projects that deepened my understanding of GPU programming:

Fundamental Algorithms

Convolutional Neural Networks (CNNs)
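The core GPU building block of a CNN is the convolution itself. A naive direct-convolution kernel, assuming a single channel and zero padding (a teaching sketch, far from what cuDNN does internally), looks like this:

```cuda
// Naive 2D convolution: each thread computes one output pixel.
// K is the (odd) filter width; out-of-bounds reads are treated as zero.
__global__ void conv2d(const float* in, const float* filt, float* out,
                       int H, int W, int K) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= W || y >= H) return;

    int r = K / 2;
    float acc = 0.0f;
    for (int fy = 0; fy < K; ++fy)
        for (int fx = 0; fx < K; ++fx) {
            int iy = y + fy - r, ix = x + fx - r;
            if (iy >= 0 && iy < H && ix >= 0 && ix < W)
                acc += in[iy * W + ix] * filt[fy * K + fx];
        }
    out[y * W + x] = acc;
}
```

Tiling the input through shared memory is the usual next optimization step, since neighboring output pixels reuse most of the same input values.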

Flash Attention

Sparse Matrix-Vector Multiplication (SpMV)
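For SpMV in the common CSR format, the simplest parallelization assigns one thread per matrix row. A minimal sketch (illustrative names, not my production kernel):

```cuda
// CSR sparse matrix-vector multiply: y = A * x, one thread per row.
// rowPtr has numRows+1 entries; colIdx and vals hold the nonzeros.
__global__ void spmvCsr(const int* rowPtr, const int* colIdx,
                        const float* vals, const float* x,
                        float* y, int numRows) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= numRows) return;

    float dot = 0.0f;
    for (int j = rowPtr[row]; j < rowPtr[row + 1]; ++j)
        dot += vals[j] * x[colIdx[j]];
    y[row] = dot;
}
```

This row-per-thread scheme suffers when row lengths vary widely; assigning a warp per row (with a warp-level reduction) is the standard fix for irregular sparsity patterns.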

Monte Carlo Tree Search (MCTS)

Other Implemented Algorithms

I also explored optimized libraries such as cuBLAS, rocBLAS, and cuDNN, which significantly simplified complex computations on NVIDIA and AMD architectures.
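For dense linear algebra, a single cuBLAS call replaces hundreds of lines of hand-tuned kernel code. A hedged sketch of a basic SGEMM wrapper (rocBLAS exposes a near-identical `rocblas_sgemm`):

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>

// C = A * B with column-major matrices, as cuBLAS expects.
// dA, dB, dC are device pointers; m, n, k are the usual GEMM dimensions.
void sgemm(cublasHandle_t handle, const float* dA, const float* dB,
           float* dC, int m, int n, int k) {
    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                m, n, k,
                &alpha,
                dA, m,   // lda: leading dimension of A (m x k)
                dB, k,   // ldb: leading dimension of B (k x n)
                &beta,
                dC, m);  // ldc: leading dimension of C (m x n)
}
```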

Tools and Resources

My learning was significantly aided by the following resources:

Looking Ahead

This first month has laid a strong foundation. Going forward, I plan to:

My journey is documented on GitHub (a-hamdi-cuda), where you can find all my code and track my progress. I also share my learnings on my blog (https://hamdi.bearblog.dev/), so feel free to check it out!

This is just the beginning of my GPU programming adventure—I’m eager to continue pushing the boundaries of parallel computation.

Social Media:

LinkedIn

Twitter