12.05.2020
Moren: A framework for automatic mapping of data-parallel tasks on heterogeneous platforms (status talk)
12.05.2020, 10:00
Invitation to the status talk in the doctoral procedure of Mr. Konrad Moren, M.Sc.
Topic: A framework for automatic mapping of data-parallel tasks on heterogeneous platforms
Supervisor: Prof. Dr. Diana Göhringer
Subject reviewer: Prof. Dr. Wolfgang E. Nagel
Abstract: Heterogeneous platforms are found almost everywhere in the modern computing landscape. Platforms consisting of multi-core CPUs and GPUs have emerged as mainstream computing systems. They are available in many configurations, ranging from supercomputers to embedded systems. They integrate processing units with distinct hardware characteristics and processing capabilities, and thus have the potential to improve application performance. However, this potential to accelerate processing comes with increased development complexity and development time. The application designer needs to apply architecture-specific optimization strategies for each processing unit. Moreover, for a given data-parallel application, achieving the best performance on a heterogeneous platform depends not only on exploiting the performance of a single processing unit, but also on carefully balancing the workload between all processing units of the platform. From the application developer's perspective, this is a time-consuming process, because several complex aspects have to be considered. Firstly, the computing units have different hardware architectures and memory interconnections. Secondly, desktop systems with multi-core CPUs and discrete GPUs expose different computing capabilities than low-power mobile systems with multi-core CPUs and integrated GPUs. Thirdly, correct synchronization and the coherence of partial results must be ensured. Finally, the common frameworks for heterogeneous programming still offer programming primitives aimed at machine experts. Many of them offer very similar low-level primitives that cover the same semantics and differ only in implementation and syntax. In our opinion, there is a need to raise the level of abstraction and reduce the development complexity. Currently, application developers still need to manually distribute and synchronize the workload execution between the CPU and GPU.
To fill this gap and enable simplified collaborative execution on CPU and GPU, we present the CoopCL framework. CoopCL includes an abstraction layer that simplifies and accelerates the development of heterogeneous applications. In this work, we also propose an approach that hides the complexity of workload distribution and synchronization. We have implemented a special GPU-centric data allocator, which provides a virtual shared-data view for GPU and CPU and thus enables collaborative CPU-GPU execution. Furthermore, we present an API to the underlying runtime design, which transparently handles workload splitting, distribution, and synchronization. We target the most popular hardware setting, a desktop CPU with discrete GPUs. In addition, we evaluate the mobile scenario of low-power SoC platforms with a CPU and an integrated GPU.
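The manual steps that the abstract refers to (splitting a data-parallel workload, dispatching the parts, and synchronizing the partial results) can be illustrated with a minimal sketch. This is not the CoopCL API: the names `split_range`, `run_split`, and `gpu_share` are hypothetical, the split ratio is chosen by hand, and two ordinary threads merely stand in for a GPU and a CPU.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, gpu_share):
    """Split the index range [0, n) into two contiguous chunks.

    gpu_share (hypothetical parameter) is the fraction of work items
    assigned to the first worker, between 0.0 and 1.0.
    """
    if not 0.0 <= gpu_share <= 1.0:
        raise ValueError("gpu_share must be between 0 and 1")
    cut = int(n * gpu_share)
    return (0, cut), (cut, n)


def vector_add_chunk(a, b, out, begin, end):
    """Kernel body applied to one contiguous chunk of work items."""
    for i in range(begin, end):
        out[i] = a[i] + b[i]


def run_split(a, b, gpu_share=0.75):
    """Run a vector addition with the work divided between two workers.

    Both workers are plain threads; they only stand in for a GPU and a CPU.
    """
    n = len(a)
    out = [0] * n  # single output buffer written by both workers
    first_chunk, second_chunk = split_range(n, gpu_share)
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(vector_add_chunk, a, b, out, *first_chunk),
            pool.submit(vector_add_chunk, a, b, out, *second_chunk),
        ]
        for future in futures:
            # Synchronization point: wait for each chunk and re-raise errors.
            future.result()
    return out
```

Even in this toy form, the developer has to pick the ratio, keep the chunks disjoint, and wait for both parts before reading the result. According to the abstract, these are the tasks the framework's runtime is meant to handle transparently.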