Lecturer: Dr Zoltán Juhász, associate professor
The goal of this course is to provide an in-depth overview of the architecture, programming and applications of modern GPU systems, and to explore advanced programming methods for creating high-performance implementations of scientific algorithms.
The main topics of the course are the following:
- the GPU architecture
- the CUDA programming model
- development of parallel kernels
- programming cooperative threads
- the CUDA memory model
- the pipeline execution model and GPU streams
- performance analysis and optimisation methods
- the Roofline performance model
- multi-GPU systems and their programming
- using parallel CUDA GPU libraries
- using OpenMP and OpenACC for generating parallel GPU code
- literature review of the use of GPUs in a selected scientific computing area
- case studies from mathematics
- hands-on case studies and practical work on EEG signal processing algorithms and processing pipelines
References:
1. David B. Kirk, Wen-mei W. Hwu: Programming Massively Parallel Processors, Morgan Kaufmann (2010), 279 pp.
2. Shane Cook: CUDA Programming, Morgan Kaufmann (2013), 591 pp.
3. John Cheng, Max Grossman, Ty McKercher: Professional CUDA C Programming, Wrox (2014), 527 pp.
4. NVIDIA: CUDA C Programming Guide, https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
5. Selected research papers