Instructor-Led Workshop: Fundamentals of Accelerated Computing with CUDA C/C++
NHR Workshop (Online)
Monday and Tuesday, 8 and 9 September 2025, 9:00 to 13:00 each day
Speaker: Markus Velten
This workshop teaches the fundamental tools and techniques for accelerating C/C++ applications to run on massively parallel GPUs with CUDA®. You’ll learn how to write GPU code, configure its parallelization with CUDA, and optimize memory migration between the CPU and the GPU accelerator. You’ll then apply this workflow to a new task: accelerating a fully functional, but CPU-only, particle simulator to achieve observable, massive performance gains. At the end of the workshop, you’ll have access to additional resources to create new GPU-accelerated applications on your own.
Agenda
- Accelerating Applications with CUDA C/C++
  - Learn the essential syntax and concepts to be able to write GPU-enabled C/C++ applications with CUDA:
    - Write, compile, and run GPU code.
    - Control parallel thread hierarchy.
    - Allocate and free memory for the GPU.
- Managing Accelerated Application Memory with CUDA C/C++
  - Learn the command-line profiler and CUDA-managed memory, focusing on observation-driven application improvements and a deep understanding of managed memory behavior:
    - Profile CUDA code with the command-line profiler.
    - Go deep on unified memory.
    - Optimize unified memory management.
- Asynchronous Streaming and Visual Profiling for Accelerated Applications with CUDA C/C++
  - Identify opportunities for improved memory management and instruction-level parallelism:
    - Profile CUDA code with NVIDIA Nsight Systems.
    - Use concurrent CUDA streams.
- Final Review
  - Review key learnings and wrap up questions.
  - Complete the assessment to earn a certificate.
  - Take the workshop survey.
Handouts
The course material (slides) will be made available to the class participants.
HPC-Certification Forum Links
Prerequisites
- Basic C/C++ competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations
- No previous knowledge of CUDA programming is assumed
Learning Objectives
At the conclusion of the workshop, you’ll have an understanding of the fundamental tools and techniques for GPU-accelerating C/C++ applications with CUDA and be able to:
- Write code to be executed by a GPU accelerator
- Expose and express data and instruction-level parallelism in C/C++ applications using CUDA (HPC Certification Forum skill: SD1.2.6 CUDA C/C++ Programming Fundamentals)
- Utilize CUDA-managed memory and optimize memory migration using asynchronous prefetching
- Leverage command-line and visual profilers to guide your work (HPC Certification Forum skill: PE2.3.7 NVIDIA Nsight Systems)
- Utilize concurrent streams for instruction-level parallelism
- Write GPU-accelerated CUDA C/C++ applications, or refactor existing CPU-only applications, using a profile-driven approach
Hardware Requirements
A desktop or laptop computer capable of running the latest version of Chrome or Firefox is required, along with a stable broadband internet connection. Each participant will be provided with dedicated access to a fully configured, GPU-accelerated server in the cloud.
Registration
Link: https://indico.ecap.work/event/156/
The NHR workshop is limited to 60 participants.
You will receive the access details by email at your registered address shortly before the event.
Further Information
Course language: English
Target group: HPC Dev
If you have any further questions, please contact Markus Velten (markus.velten@tu-dresden.de).