HPC Systems
ZIH operates a high performance computing system with more than 60,000 cores, 720 GPUs, and a flexible storage hierarchy with a total capacity of about 16 PB. The HPC system provides an optimal research environment, especially for data analytics and machine learning as well as for processing extremely large data sets. It is also an excellent platform for highly scalable, data-intensive, and compute-intensive applications.
With shared login nodes and file systems, our HPC system enables users to switch easily between the components, each specialized for different application scenarios. Pre-installed software environments allow for a quick start. To access our HPC resources, a short project application is required.
For Data-Intensive and Compute-Intensive HPC Applications
The High Performance Computing and Storage Complex (HRSK-II) by Bull/Atos provides the major part of the computing capacity available at ZIH, especially for highly parallel, data-intensive and compute-intensive HPC applications.
Typical applications: FEM simulations, CFD simulations with Ansys or OpenFOAM, molecular dynamics with GROMACS or NAMD, computations with MATLAB or R
- approx. 40,000 cores (Intel Haswell and Broadwell)
- Memory configuration: typically 2.6 GB/core, up to 36 GB/core
- 256 GPUs (Nvidia K80)
- Documentation
For HPC Data Analytics and Machine Learning
The extension High Performance Computing – Data Analytics (HPC-DA) is available to users throughout Germany. Its flexibility allows the different HPC-DA technologies to be combined into individual, efficient research infrastructures. For machine learning and deep learning applications in particular, 192 powerful Nvidia V100 GPUs are installed. Resources can also be used interactively, for example with Jupyter notebooks. For data analytics on CPUs, a cluster with high memory bandwidth is provided. For efficient access to large data sets, 2 petabytes of flash memory with a total bandwidth of about 2 terabytes/s are available. Additionally, 272 Nvidia A100 GPUs are provided especially for machine learning applications in ScaDS.AI.
Typical applications: training of neural networks with TensorFlow (deep learning), data analytics with Big Data frameworks such as Apache Spark (a minimal TensorFlow sketch follows the list below)
- 34 AMD Rome nodes, each with 8 Nvidia A100 GPUs, for machine learning (primarily for ScaDS.AI)
- 32 IBM Power 9 nodes, each with 6 Nvidia V100 GPUs, for machine learning
- 192 AMD Rome nodes, each with 128 cores and 512 GB RAM (400 GB/s memory bandwidth)
- 2 PB fast flash memory (NVMe)
- 10 PB archive with access via S3, Cinder, NFS, QXFS
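Such workloads can be started directly from a Jupyter notebook or a batch job. Below is a minimal sketch of a multi-GPU training run, assuming a pre-installed TensorFlow environment on one of the GPU nodes; the model and the randomly generated data are illustrative placeholders only, not a recommended setup.

```python
# Minimal sketch: check for allocated GPUs and train a small model on them.
# Assumes TensorFlow is available in the pre-installed software environment;
# the model and data are illustrative placeholders, not a real workload.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print(f"Visible GPUs: {len(gpus)}")  # e.g. the V100/A100 GPUs of the allocated node

# Distribute training across all GPUs visible to the job
# (falls back to a single device if no GPU is available).
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(64,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Random placeholder data instead of a real training set.
x = tf.random.normal((1024, 64))
y = tf.random.uniform((1024,), maxval=10, dtype=tf.int32)
model.fit(x, y, epochs=2, batch_size=128)
```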
For Processing of Extremely Large Data Sets
The shared memory system HPE Superdome Flex is especially well suited for data-intensive application scenarios, for example for processing extremely large data sets entirely in main memory or in very fast NVMe storage. As part of HPC-DA, the system is also available to users throughout Germany.
Typical applications: workloads that require very large shared memory, such as genome analysis (see the sketch following the list below)
- Shared memory system with 32 Intel Cascade Lake CPUs and a total of 896 cores
- 48 TB main memory in a shared address space
- 400 TB of NVMe memory cards as very fast local storage
- Documentation
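To illustrate the in-memory approach, the sketch below memory-maps a large binary data set from fast local storage and reduces it across its columns; the file path /nvme/dataset.bin, the array shape, and the data type are assumptions for illustration only, not actual locations on the system.

```python
# Minimal sketch: process a data set that fits entirely in the large shared
# memory (or on the fast local NVMe storage) of the Superdome Flex.
# The file path, shape, and dtype are hypothetical placeholders.
import numpy as np

# Memory-map a large array stored on the fast NVMe file system,
# so all cores of the shared-memory system can read it directly.
data = np.memmap("/nvme/dataset.bin", dtype=np.float32, mode="r",
                 shape=(2_000_000_000, 4))

# A simple reduction over the whole data set; with 48 TB of RAM the
# entire array could also be loaded with np.asarray(data) instead.
column_means = data.mean(axis=0)
print(column_means)
```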