Overview
Practising GPU kernel development by solving CUDA challenges on LeetGPU, covering parallel algorithms and GPU memory optimisation.
Details
Ongoing self-directed learning of CUDA GPU programming through LeetGPU challenges. Joined February 2026.
Progress
- 10 / 87 problems solved
- Primary hardware: NVIDIA Tesla T4 · NVIDIA B200
Key Features
- Thread hierarchy - grids and blocks for configuring kernel launches, and warps as the unit the hardware schedules
- Memory management - global, shared, and constant memory access patterns
- Reduction kernels - parallel sum, max, and dot product implementations
- Warp-level optimisations - shuffle instructions to minimise shared memory usage
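As an illustration of how the items above fit together, here is a minimal sketch of a parallel sum that uses a grid-stride loop, warp shuffles, and a small shared-memory buffer. It is not taken from any LeetGPU solution. The names `warp_reduce_sum`, `block_sum_kernel`, and `device_sum` are hypothetical, the block size of 256 and the cap of 1024 blocks are arbitrary assumptions, and CUDA error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Sum one value per lane across a full warp using register shuffles.
// After the loop, lane 0 holds the warp total. Assumes all 32 lanes are active.
__inline__ __device__ float warp_reduce_sum(float val) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2) {
        val += __shfl_down_sync(0xffffffffu, val, offset);
    }
    return val;
}

// Each block reduces its share of the input and adds one partial sum to *out.
// Assumes blockDim.x is a multiple of 32 and at most 1024.
__global__ void block_sum_kernel(const float* in, float* out, int n) {
    __shared__ float warp_sums[32];  // one slot per warp, not one per thread

    int lane = threadIdx.x % warpSize;
    int warp = threadIdx.x / warpSize;

    // Grid-stride loop: each thread accumulates a private partial sum.
    float val = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        val += in[i];
    }

    // Stage 1: reduce within each warp, entirely in registers.
    val = warp_reduce_sum(val);
    if (lane == 0) warp_sums[warp] = val;
    __syncthreads();

    // Stage 2: the first warp reduces the per-warp partial sums.
    int num_warps = blockDim.x / warpSize;
    if (warp == 0) {
        val = (lane < num_warps) ? warp_sums[lane] : 0.0f;
        val = warp_reduce_sum(val);
        if (lane == 0) atomicAdd(out, val);  // combine block results
    }
}

// Hypothetical host wrapper: copies data to the GPU, launches the kernel,
// and returns the total. Error checking is omitted in this sketch.
float device_sum(const std::vector<float>& host) {
    int n = static_cast<int>(host.size());
    if (n == 0) return 0.0f;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(float));

    const int threads = 256;  // assumed block size, a multiple of 32
    int blocks = (n + threads - 1) / threads;
    if (blocks > 1024) blocks = 1024;  // assumed cap; the loop covers the rest

    block_sum_kernel<<<blocks, threads>>>(d_in, d_out, n);

    float result = 0.0f;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

In this sketch the shared buffer holds one float per warp instead of one per thread, which is the kind of saving the warp-level bullet refers to. Because block results are combined with `atomicAdd`, the floating-point summation order can vary between runs, so results for non-integer data may differ in the last bits.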