10.10.2018; Colloquium

ZIH Colloquium: Exploring Alternative Designs for HPC Interconnects and HPC Processors

Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
10:00 - 11:30
Willers-Bau A317
Willers-Bau, Wing A
Zellescher Weg 12
01069 Dresden
Jens Domke (Tokyo Institute of Technology)

Dr. Ralph Müller-Pfefferkorn

Head of Department, VDR


Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)


Work address


Falkenbrunnen, Room 240, Chemnitzer Str. 46b

01187 Dresden



This talk presents the current status and lessons learned from two work-in-progress HPC projects conducted at the Tokyo Institute of Technology.

The first part introduces our HyperX topology project. The HyperX topology was proposed by HP Labs in 2009. It is based on an n-dimensional mesh network: the base topology connects each switch only to its nearest neighbors in each dimension, and HyperX adds links so that every switch is connected to all other switches within each dimension. HyperX should perform similarly to a Clos network (a.k.a. fat-tree) with respect to bisection bandwidth and other related network metrics, while reducing the cost of building an equally sized supercomputer. However, as of today, no large-scale HPC installation uses this type of network topology. We, at Tokyo Tech, are in the process of building the first multi-Petaflop/s HyperX supercomputer from the remains of the TSUBAME2 system, which was recently replaced by TSUBAME3. The resulting system will allow a direct, real-world comparison between a fat-tree and a HyperX topology.

The second part covers our current efforts to analyze the bottlenecks in modern CPU architectures, specifically the Intel Xeon Phi family. Common wisdom in supercomputing, partially driven by the Top500 list, is that double-precision floating-point calculations are what matter, with respect to both application requirements and performance. Accordingly, chip vendors for HPC compute nodes have traditionally allocated a significant portion of chip area to double-precision FPUs. We conducted an exhaustive FPU-requirement and performance study using 22 HPC (proxy/mini) applications from various scientific domains, which account for the majority of CPU cycles in HPC and which have all been used in the USA and Japan to procure the current generation of supercomputers, such as Summit and Post-K. This study will give us and the rest of the community valuable insight into precision and functional-unit requirements and into the performance bottlenecks identified in modern HPC codes, to guide procurement towards more or less FP64, FP32, ..., faster or bigger memory and caches, or more cores instead.
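The HyperX construction described in the first part of the abstract can be illustrated with a small sketch. It is not taken from the talk. It only assumes that switches are addressed by integer coordinates and that two switches are linked when they differ in exactly one coordinate. The function names and the 4 x 4 example shape are hypothetical, chosen for illustration.

```python
from itertools import product


def mesh_neighbors(coord, shape):
    """Nearest neighbors of a switch in a plain n-dimensional mesh."""
    for dim, size in enumerate(shape):
        for step in (-1, 1):
            value = coord[dim] + step
            if 0 <= value < size:
                yield coord[:dim] + (value,) + coord[dim + 1:]


def hyperx_neighbors(coord, shape):
    """All switches that differ from `coord` in exactly one dimension."""
    for dim, size in enumerate(shape):
        for value in range(size):
            if value != coord[dim]:
                yield coord[:dim] + (value,) + coord[dim + 1:]


def count_links(shape, neighbors):
    """Count undirected switch-to-switch links for a neighbor rule."""
    switches = product(*(range(size) for size in shape))
    # Each link is seen from both of its endpoints, hence the division by 2.
    return sum(len(list(neighbors(sw, shape))) for sw in switches) // 2


shape = (4, 4)  # hypothetical example: 16 switches in a 2-D arrangement
mesh_links = count_links(shape, mesh_neighbors)
hyperx_links = count_links(shape, hyperx_neighbors)
print(f"mesh links: {mesh_links}, HyperX links: {hyperx_links}")
```

In this toy shape every HyperX switch reaches any other switch in at most two hops (one per dimension), at the price of more inter-switch links than the plain mesh.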
Jens Domke is a postdoctoral researcher at the GSIC, which hosts the TSUBAME3 supercomputer for the Tokyo Institute of Technology, Japan. He received his doctoral degree from the Technische Universität Dresden in 2017 for his work on HPC routing algorithms and interconnects. Jens contributed the DFSSSP and Nue routing algorithms to the subnet manager of InfiniBand. His research focuses on interconnects, topologies, and routing algorithms for HPC systems, as well as performance evaluation and optimization of parallel applications.

About this page

Ralph Müller-Pfefferkorn
Last modified: 06.09.2018