History 2013
Past events
12 December 2013: additional colloquium with Andrew Grimshaw (University of Virginia, Charlottesville, USA): "The XSEDE Global Federated File System (GFFS) - Breaking Down Barriers to Secure Resource Sharing" (slides)
Lowering the barriers to collaboration and increasing access to high-end resources will accelerate the pace and productivity of science and engineering. Toward this end, the eXtreme Science and Engineering Discovery Environment (XSEDE) is a single virtual system that allows scientists to seamlessly and interactively share computing resources, data, and expertise. The XSEDE project will allow researchers to link and access resources at both domestic and foreign supercomputing centers as well as resources belonging to university campuses and research labs around the world.
The complexity of distributed systems creates obstacles for scientists who wish to share their resources with collaborators. Obstacles include: complex, unreliable, and unfamiliar tools and environments; multiple administrative domains, each with their own passwords and file systems; the need to keep track of which resources are on which machines; the need to manually copy files and applications from place to place; the need to monitor and interact with multiple execution services, each with their own idiosyncratic behavior; and the need to manage authorization, identities, and groups. The best way to manage complexity and make sharing data and resources possible on a large scale is to provide users with a familiar, easy-to-use tool that manages aspects of the collaboration on the user’s behalf.
The first principle of XSEDE’s approach to designing a collaborative interface is familiarity: give the user interaction paradigms and tools that are similar to those she already uses. XSEDE deploys what it calls the Global Federated File System (GFFS) in order to leverage the user’s familiarity with the directory-based paradigm. The GFFS is a global shared namespace designed so that the user can easily organize and interact with files, execution engines, identity services, running jobs, and much more. Many types of resources, such as compute clusters, directory trees in local file systems, and storage resources, can be linked into the GFFS directory structure by resource owners at centers, on campuses, and in individual research labs. GFFS resources can be accessed (subject to access control) in a variety of ways: from the command line (useful for scripting); via a GUI; or by being mapped directly into the local file system. When mapped into the local file system, remote resources can be accessed by existing applications as if they were local resources.
In this talk I will present the GFFS, its functionality, its motivation, as well as typical use cases. I will demonstrate many of its capabilities, including: how to secure shared data with collaborators; how to share storage with collaborators; how to access data at the centers from campus and vice versa; how to create shared compute queues with collaborators who can then schedule jobs on collaboration “owned” resources; how to create jobs and how to interact with them once started. I will present the GFFS’s various access mechanisms, i.e., the GUI and local file system mapping; if facilities permit, I will include this latter mechanism in the live demonstration.
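As a small illustration of the last point (the mount point and paths below are hypothetical, and the sketch assumes the GFFS has already been mapped into the local file system): once mapped, remote resources are reached through ordinary file operations, so unmodified applications and scripts work on them directly.

```python
# Minimal sketch: once a GFFS namespace is mapped into the local file
# system (e.g. via a FUSE-style mount), ordinary file operations suffice.
# The mount point "/gffs" and the paths below it are hypothetical.
import os

GFFS_MOUNT = "/gffs"                       # assumed local mount point
shared_dir = os.path.join(GFFS_MOUNT, "home/collab-project/results")

# List the remote directory exactly like a local one (subject to access control).
for name in os.listdir(shared_dir):
    print(name)

# Read a remote file with the standard file API; no GFFS-specific calls needed.
with open(os.path.join(shared_dir, "summary.txt"), "r") as f:
    print(f.read())
```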
7 November 2013: additional colloquium with Sascha Hunold (TU Wien): "Can I repeat your parallel computing experiment? Yes, you can’t" (slides)
Parallel computing as a sub-field of computer science is also an experimental science. Today’s parallel systems are in a total state of flux, and so are parallel programming frameworks and languages. Thus, experiments are necessary to substantiate, complement, and challenge theoretical modeling and analysis. As a consequence, experimental work is as important as advances in theory. Parallel computing contributions are therefore very often based on experimental data, with a substantial part devoted to presenting and discussing the experimental findings. As in all of experimental science, experiments must be presented in a way that makes reproduction by other researchers possible, in principle. Despite appearances to the contrary, we contend that reproducibility plays a small role and is typically not achieved. In practice, articles often provide an insufficiently detailed description of their experiments, and the software used to obtain the claimed results is unavailable. As a consequence, parallel computational results are most often impossible to reproduce, often questionable, and therefore of little or no scientific value. We believe that the description of how to reproduce findings should play an important part in every experiment-based parallel computing research article.
In this talk, I will discuss the reproducibility issue in parallel computing, and elaborate on the importance of reproducible research for (1) better and sounder technical/scientific articles, (2) a sounder and more efficient review process, and (3) more effective collective work. In addition, I will explore the origins of the reproducibility movement and why reproducible science is more relevant than ever, also in parallel computing. I will cover existing approaches (e.g., tools) that could help to solve the reproducibility problem and also name challenges for obtaining reproducible experiments. In particular, I will describe the special requirements of experimental research in parallel computing, which may complicate the reproduction of results.
24 October 2013: Friedel Hoßfeld (Jülich): "On John von Neumann's Work and Influence – Remarks on His 110th Birthday"
On 28 December 2013, John von Neumann's birthday comes around for the 110th time – an occasion to trace the lasting work and deep influence of one of the greatest scientific figures of the 20th century. John von Neumann left a distinctive mark on mathematics (with important contributions to its axiomatization and to David Hilbert's foundational program), on computer science (by laying the groundwork for computer architecture and the foundations of automata theory), on physics (with the mathematical foundation of quantum mechanics), and not least on economics (as a pioneer of game theory). Mythologization ("fastest brain") and demonization ("Dr. Strangelove") accompany his scientific reputation. John von Neumann died on 8 February 1957 – far too early!
14 October 2013, 15:00, Beyerbau 117Z: additional colloquium with Hermann Engesser (Springer Verlag): "The E-Wave - Changes in the Publishing World, Illustrated by Encyclopedias and Scientific Books and Journals" (slides)
Hermann Engesser, editorial director for computer science, IT, and electrical engineering and editor-in-chief of Informatik-Spektrum at SpringerVieweg, describes in his talk the changes within the scientific publishing industry. He looks at the publisher's response from two angles: on the one hand, the upheavals in the encyclopedia market driven by Wikipedia and other electronic media; on the other hand, the general changes within scientific publishers, which, by reshaping their process chains, are increasingly turning into global "media houses". Engesser pays particular attention to e-books, e-journals, and print on demand.
26 September 2013: David Böhme (GRS Aachen): "Analysis of Load Imbalance in Massively Parallel Programs" (slides)
On modern supercomputers, some with several million processor cores, load or communication imbalances in application programs lead to considerable performance losses. Common performance analysis tools, however, are often unable to detect complex patterns of load imbalance and to assess their impact on execution speed. This talk presents two new methods that identify load imbalances in previously recorded event traces and intuitively guide the user to the bottlenecks with the greatest optimization potential. The first method examines the impact of imbalances on wait times at subsequent synchronization points. This makes it possible to locate the causes of these wait times and to quantify the cost of imbalances in terms of the resulting wait times. The second method examines the impact of imbalances on program runtime by analyzing the critical path, which makes even complex phenomena easy to recognize. The scalability and utility of both methods are demonstrated with a number of examples using real HPC applications on up to 262,144 processor cores.
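As a minimal illustration of the first idea (not the analysis tool itself, which operates on recorded event traces), the sketch below computes the wait time that a load imbalance causes at a barrier-like synchronization point; the per-process compute times are hypothetical.

```python
# Illustrative sketch: at a barrier-like synchronization point every process
# waits for the slowest one, so per-process compute times directly determine
# the wait times caused by a load imbalance.
compute_time = [4.1, 4.0, 6.3, 4.2]   # hypothetical seconds per process

slowest = max(compute_time)
wait_time = [slowest - t for t in compute_time]   # time lost waiting at the barrier

print("wait time per process:", wait_time)
print("total imbalance cost :", sum(wait_time), "CPU seconds")
# The methods described in the talk attribute this cost back to the processes
# and call paths that caused the delay, based on the recorded event traces.
```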
13 September 2013, 10:00, WIL C207: additional colloquium with Anthony A. Maciejewski (Colorado State University) - "Bi-Objective Optimization for Scheduling in Parallel Computing Systems" (slides)
Most challenging engineering problems consider domains where there exist multiple objectives. Often the different objectives will conflict with each other, and these conflicts make it difficult to determine performance trade-offs between one objective and another. Pareto optimality is a useful tool to analyze these trade-offs between the two objectives. To demonstrate this, we explore how Pareto optimality can be used to analyze the trade-offs between makespan and energy consumption in scheduling problems for heterogeneous parallel computing systems. We have adapted a multi-objective genetic algorithm from the literature for use within the scheduling domain to find Pareto optimal solutions. These solutions reinforce that consuming more energy results in a lower makespan, while consuming less energy results in a higher makespan. More interestingly, by examining specific solution points within the Pareto optimal set, we are able to perform a deeper analysis of the scheduling decisions of the system. These insights into balancing makespan and energy allow system administrators to efficiently operate their parallel systems based on the needs of their environment. The bi-objective optimization approach presented can be used with a variety of performance metrics, including operating cost, reliability, throughput, energy consumption, and makespan.
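Below is a minimal sketch of the non-dominated filtering step behind such an analysis: given candidate schedules evaluated by makespan and energy, it keeps only the Pareto-optimal points. The candidate values are made up for illustration; the talk obtains such solution sets with a multi-objective genetic algorithm, which is not reproduced here.

```python
# Minimal sketch: extract the Pareto-optimal (non-dominated) points from a
# set of candidate schedules, each evaluated by (makespan, energy).
# The candidate values below are hypothetical.
candidates = [
    (120.0, 900.0),   # (makespan in s, energy in kJ)
    (100.0, 1100.0),
    (130.0, 850.0),
    (105.0, 1050.0),
    (125.0, 1000.0),  # dominated by (120.0, 900.0)
]

def dominates(a, b):
    """a dominates b if it is no worse in both objectives and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

pareto_front = [p for p in candidates
                if not any(dominates(q, p) for q in candidates)]

# Sorting by makespan makes the trade-off visible: lower makespan costs more energy.
for makespan, energy in sorted(pareto_front):
    print(f"makespan {makespan:6.1f} s  energy {energy:6.1f} kJ")
```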
13 September 2013, 11:00, WIL C207: additional colloquium with Howard Jay Siegel (Colorado State University) - "Energy-Aware Robust Resource Management for Parallel Computing Systems" (slides)
Heterogeneous parallel and distributed computing systems consist of a collection of different interconnected machines. A critical research problem is the energy-aware allocation of resources to tasks to optimize some performance objective, possibly under a given constraint. Often, these allocation decisions must be made when there is uncertainty in relevant system parameters, such as the data-dependent execution time of a given task on a given machine. It is important for system performance to be robust against such uncertainty. We have designed models for deriving the degree of robustness of a resource allocation using history-based stochastic (probabilistic) information. Energy-aware robust resource allocation heuristics for two example environments will be presented. The first involves static heuristics, which are executed off-line for production “bag-of-tasks” environments. The goal is to minimize energy given a robustness constraint based on a common deadline. The second environment involves dynamic heuristics, which are executed on-line for situations where tasks must be assigned resources as they arrive in the system. The goal is to complete as many tasks as possible by their individual deadlines, with a constraint on total energy consumption. The energy-aware resource management approaches presented can be applied to a variety of computing and communication system environments.
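One simple way to make the notion of robustness concrete (this is an illustration, not the authors' models or heuristics) is to estimate, by Monte Carlo sampling over history-based stochastic execution times, the probability that a given allocation completes all tasks before the common deadline; all numbers below are made up.

```python
# Illustrative sketch: estimate the robustness of one candidate allocation as
# the probability that every machine finishes its assigned tasks before a
# common deadline, using (made-up) normally distributed task execution times.
import random

random.seed(0)

# allocation[machine] = list of (mean, stddev) execution times for its tasks
allocation = {
    "m0": [(10.0, 2.0), (12.0, 3.0)],
    "m1": [(20.0, 4.0)],
    "m2": [(8.0, 1.0), (7.0, 1.5), (6.0, 1.0)],
}
DEADLINE = 30.0   # hypothetical common deadline
TRIALS = 10_000

def sample_makespan():
    # Each machine runs its tasks back to back; the makespan is set by the slowest machine.
    return max(
        sum(max(0.0, random.gauss(mu, sigma)) for mu, sigma in tasks)
        for tasks in allocation.values()
    )

hits = sum(sample_makespan() <= DEADLINE for _ in range(TRIALS))
print(f"estimated robustness P(makespan <= deadline) = {hits / TRIALS:.3f}")
```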
5 July 2013, 10:00, WIL C207: additional colloquium with Ioana Banicescu (Mississippi State University): "Towards a Technology for Robust and Cost-Effective Autonomic Execution of Scientific Applications"
Computational science and engineering research communities are continuously interested in solving problems of increasing complexity. The rapid development of computing technology has increased both the complexity of computational systems and the ability to solve larger and more complex scientific problems.
Over the years, computing technology has benefited from many research advances in architecture, hardware platforms and software environments, programming models, and algorithms, as well as from the many tools and techniques that evolved from these advances.
Many scientific problems are intractable (very large, complex), often exhibit irregular and stochastic behavior (data or time-dependent), and therefore require adaptive algorithms. The resulting scientific applications run on heterogeneous environments (clusters, grids, clouds) which often are expected to offer an efficient, robust, and cost-effective (high utility and green) execution to multiple applications.
A number of solutions involving adaptivity have been proposed and implemented at the application and system levels. They often rely on adaptive algorithms and optimization techniques, which may use probabilistic analyses or other approaches such as queuing theory, control theory, machine learning, and biologically inspired methods. During the last decade, an autonomic computing approach has been proposed as a solution to system complexity, owing to the self-management capabilities known to be exhibited by autonomic computing systems.
In this talk, she will present the challenges with which the research community is confronted in addressing these issues at the application and system levels, and she will focus on a few recent steps taken towards a technology that would enable a robust, cost-effective execution of scientific applications simultaneously at both levels, using an autonomic computing approach.
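As a generic illustration of the self-management idea underlying autonomic computing (not the specific technology discussed in the talk), the sketch below runs a minimal monitor-analyze-plan-execute loop; the metric, threshold, and action are hypothetical.

```python
# Generic sketch of an autonomic monitor-analyze-plan-execute (MAPE) loop,
# the self-management pattern referred to above; the fake sensor, the
# threshold, and the "rebalance" action are hypothetical.
import random

random.seed(1)

def monitor():
    # Stand-in for measuring the running application (e.g. its load imbalance).
    return {"imbalance": random.uniform(0.0, 0.5)}

def analyze(metrics):
    # Symptom detection against a (hypothetical) target threshold.
    return metrics["imbalance"] > 0.3

def plan(metrics):
    # Pick a corrective action; here only one action exists.
    return {"action": "rebalance", "severity": metrics["imbalance"]}

def execute(change):
    print(f"applying {change['action']} (severity {change['severity']:.2f})")

# One pass of the loop per control interval.
for _ in range(5):
    m = monitor()
    if analyze(m):
        execute(plan(m))
    else:
        print("within target range, no adaptation needed")
```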
13 June 2013: additional colloquium with Greg Stuart and Scott Michael (SciApT, Indiana University): "The Power of User Statistics: Using analytics to improve service to end users"
In this talk I will present an overview of the statistics tracking system developed for use on supercomputing platforms by the Research Technologies division at Indiana University (IU). Combining and viewing statistical data from many sources, including multiple compute platforms, high-performance file systems, and archive systems, can be a daunting task. Finding useful and actionable information in this data presents an even greater challenge. However, by aggregating data from multiple resources we have been able to gain a more comprehensive view of user activity and identify users whose workflows could be improved. In addition to describing the basic design and infrastructure of the system, I will present several use cases where we have been able to improve the user experience and increase the usage efficiency of IU supercomputing resources.
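The following sketch illustrates only the aggregation idea (the record layout, data sources, and efficiency threshold are made up, not the system described in the talk): per-user records from several systems are merged, and users with low core utilization are flagged for follow-up.

```python
# Illustrative sketch of the aggregation idea only: merge per-user job and
# storage records from several (made-up) sources and flag users with low
# core utilization as candidates for workflow consultation.
from collections import defaultdict

scheduler_records = [   # (user, cores_allocated, core_hours_used, core_hours_allocated)
    ("alice", 128, 110.0, 128.0),
    ("bob",   256,  60.0, 256.0),
]
filesystem_records = [  # (user, terabytes_stored)
    ("alice", 4.2),
    ("bob",  18.0),
]

summary = defaultdict(lambda: {"used": 0.0, "allocated": 0.0, "tb": 0.0})
for user, _cores, used, allocated in scheduler_records:
    summary[user]["used"] += used
    summary[user]["allocated"] += allocated
for user, tb in filesystem_records:
    summary[user]["tb"] += tb

for user, s in summary.items():
    efficiency = s["used"] / s["allocated"]
    note = "candidate for workflow consultation" if efficiency < 0.5 else "ok"
    print(f"{user}: {efficiency:.0%} core utilization, {s['tb']} TB stored -> {note}")
```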
23 May 2013: Horst Malchow (Universität Osnabrück, Institut für Umweltsystemforschung): "Pattern formation in non-equilibrium systems"
The exploration of pattern formation mechanisms in nonlinear complex systems is one of the central scientific problems. The development of the theory of self-organized temporal, spatial or functional structuring of nonlinear systems far from equilibrium has been one of the milestones of structure research. The occurrence of multiple steady states and transitions from one to another after critical fluctuations, the phenomena of excitability, oscillations and waves and, in general, the emergence of macroscopic order from microscopic interactions in various nonlinear nonequilibrium systems in nature and society have required and stimulated many theoretical and, where possible, experimental studies.
Mathematical and computational modelling has turned out to be one of the useful methods to improve the understanding of such structure generating mechanisms. After a more general introduction, examples from population dynamics are presented.
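As a minimal illustration of this kind of modelling (a generic 1-D reaction-diffusion equation with logistic growth, not the specific population-dynamics models of the talk; all parameter values are made up), consider the following sketch.

```python
# Minimal sketch of a generic 1-D reaction-diffusion model (logistic growth
# plus diffusion): explicit Euler in time, central differences in space,
# periodic boundaries. Parameter values are hypothetical.
import numpy as np

n, dx, dt = 200, 1.0, 0.1
D, r = 1.0, 0.5                      # diffusion coefficient, growth rate
rng = np.random.default_rng(0)

u = 0.01 * rng.random(n)             # small random initial population density
for _ in range(2000):
    lap = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / dx**2
    u = u + dt * (D * lap + r * u * (1.0 - u))

print(f"min {u.min():.3f}, max {u.max():.3f}")   # density approaches the carrying capacity 1
```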
25 April 2013: Wolfgang Frings (FZ Jülich): "SIONlib: Scalable Massively Parallel I/O to Task-Local Files"
Parallel applications often store data in multiple task-local files, for example, to create checkpoints, to circumvent memory limitations, or to record performance data. When operating at very large processor configurations, such applications often experience scalability limitations when the simultaneous creation of thousands of files causes metadata-server contention. Furthermore, large file counts complicate file management, and operations on those files can even destabilize the file system. Even if a parallel I/O library is used and task-local files are replaced by a single shared file, new metadata bottlenecks will be observed, especially at very large scale. This talk will introduce SIONlib, a parallel I/O library that also addresses these new bottlenecks by transparently mapping a large number of task-local files onto a small number of physical files via internal metadata handling, and by block alignment to ensure high performance. By discussing the design principles of SIONlib, we also address the challenges that will arise on upcoming exascale systems and show how they could be solved with a software-only approach.
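The core idea behind the file mapping can be illustrated with a short offset calculation (this is an illustration only, not SIONlib's API): each task receives a chunk in the shared file whose start offset is rounded up to the file-system block size, so no two tasks ever write to the same block.

```python
# Illustrative sketch of the block-alignment idea only (not SIONlib's API):
# give every task a chunk in one shared file whose start offset is aligned
# to the file-system block size, so no two tasks ever touch the same block.
FS_BLOCK_SIZE = 4 * 1024 * 1024            # e.g. 4 MiB, hypothetical

def align_up(nbytes, block=FS_BLOCK_SIZE):
    return ((nbytes + block - 1) // block) * block

chunk_requests = [3_000_000, 9_500_000, 1_200_000, 4_194_304]   # bytes per task

offset, layout = 0, []
for task, nbytes in enumerate(chunk_requests):
    layout.append((task, offset, nbytes))
    offset += align_up(nbytes)             # next task starts on a fresh block

for task, off, nbytes in layout:
    print(f"task {task}: offset {off:>12,d}, {nbytes:,d} bytes reserved")
```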
28 March 2013: Ulrich Kähler (DFN): "The Authentication and Authorization Infrastructure of the German Research Network (DFN-AAI) - Secure and Simple Access to Protected Resources"
The DFN-Verein operates an authentication and authorization infrastructure (DFN-AAI) that gives users at academic and research institutions (participants) access, via the German research network, to protected resources (e.g. scientific publications, licensed software, supercomputers, grid resources) offered by providers. Users who want to access protected resources can authenticate at their home institution and, after the attributes required for authorization have been transferred, gain access to the resources.
In consultation with participants and providers, the DFN-Verein coordinates the modalities and policies for communication within the DFN-AAI and adapts them to technological progress.
28 February 2013: Andreas Dress (Bielefeld): "Modelling, Simulating and Analysing Structure Formation in Tissue and in Cellular Automata"
Structure formation has fascinated mankind ever since antiquity. And, after the introduction of infinitesimal calculus, partial differential equations appeared for a long time to constitute the only proper mathematical methodology for generating and analysing spatio-temporal models of structure formation. So, it came as a slight surprise that even rather coarse-grained and simplistic CA-type models could capture essential features of structure formation processes. In the lecture, I will first recall how we learned about this almost 30 years ago when, jointly with Martin Gerhardt, Nils Jaeger, Peter Plath, and Heike Schuster, we tried to analyse heterocatalytic processes on metal surfaces and, in this way, discovered that such processes might potentially form interesting spatio-temporal patterns -- a finding that was confirmed only later by Ronald Imbihl (now in Hannover, then working at Gerhard Ertl's lab in Berlin) in a number of beautiful experiments. I will then go on to present various CA-type models that were later developed jointly with Peter Serocka at Bielefeld in the 1990s and that clearly exhibited striking phase transitions in "pattern space", going e.g. from patterns of ever-turning spirals to patterns of "growing, intermingling, and solidifying empires" upon a slight change of parameters applied for only a single time step -- thus furnishing intriguing metaphors for the onset and establishment of, e.g., cancer in healthy tissue, caused (perhaps) by a slight disturbance of cell metabolism. Finally, I will turn to a mathematical analysis of certain CA dynamics and discuss mathematical tools such as discrete Fourier transforms which, as was recently shown by LIN Wei from Fudan University, can be used to treat at least some aspects of these models in proper mathematical terms.
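For readers unfamiliar with CA-type models, the sketch below implements a generic excitable-medium cellular automaton (a Greenberg-Hastings-style rule), which produces rotating spiral patterns of the kind mentioned above; it is an illustration only, not one of the models developed with Peter Serocka, and all parameters are made up.

```python
# Minimal sketch of a generic excitable-medium cellular automaton
# (Greenberg-Hastings-style rule). States: 0 = resting, 1 = excited,
# 2..R = refractory. A resting cell fires if a neighbor is excited.
import numpy as np

N, R, STEPS = 64, 4, 50
rng = np.random.default_rng(42)
grid = rng.integers(0, R + 1, size=(N, N))      # random initial state

def step(g):
    excited_nbr = np.zeros_like(g, dtype=bool)
    for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # von Neumann neighborhood
        excited_nbr |= np.roll(g, shift, axis=(0, 1)) == 1
    return np.where(g == 0,
                    np.where(excited_nbr, 1, 0),        # resting: excite if a neighbor fires
                    np.where(g < R, g + 1, 0))          # excited/refractory: advance, then rest

for _ in range(STEPS):
    grid = step(grid)

print("excited cells after", STEPS, "steps:", int((grid == 1).sum()))
```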
Additional colloquium on 7 February 2013: Ivo Sbalzarini (Max-Planck-Institut für Molekulare Zellbiologie und Genetik, Zentrum für Systembiologie Dresden) - "The Parallel Particle Mesh (PPM) library and a domain-specific language for particle methods"
Particle methods provide a unifying framework for simulations of both discrete and continuous models. We present the PPM library, a scalable and transparent middleware for hybrid particle-mesh simulations on distributed-memory parallel computers. We discuss recent progress in hybrid multi-threading/multi-processing and adaptive-resolution simulations. In addition, we present a domain-specific language for parallel particle-mesh methods and the associated compiler and programming environment.
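As a minimal illustration of the particle-mesh paradigm (not of the PPM library's API), the sketch below performs a 1-D cloud-in-cell interpolation of particle masses onto a uniform mesh; the particle data are randomly generated.

```python
# Minimal sketch of the central particle-mesh operation: 1-D cloud-in-cell
# (linear) interpolation of particle masses onto a uniform periodic mesh.
# This only illustrates the paradigm; it is not the PPM library's API.
import numpy as np

L, n_cells = 1.0, 16
h = L / n_cells
rng = np.random.default_rng(3)

x = rng.random(1000) * L            # particle positions in [0, L)
m = np.full_like(x, 1.0 / x.size)   # particle masses (total mass 1)

left = np.floor(x / h).astype(int)  # index of the mesh node left of each particle
frac = x / h - left                 # fractional position inside that cell

density = np.zeros(n_cells)
np.add.at(density, left % n_cells, m * (1.0 - frac) / h)      # share to left node
np.add.at(density, (left + 1) % n_cells, m * frac / h)        # share to right node

print("total mass on mesh:", density.sum() * h)               # ~1.0 (mass is conserved)
```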
24 January 2013: Andre Brinkmann (Johannes Gutenberg-Universität Mainz): "HPC Storage: Challenges and Opportunities of new Storage Technologies" (slides)
Storage systems have been regarded as a necessary, but mostly uninteresting component of high performance computers. Several trends are currently changing this role.
First of all, the amount of data written to and read from HPC storage is increasing at an incredible speed, even in the domain of traditional HPC. While checkpointing, for example, has been a common but feasible task in mid-sized cluster environments, it becomes extremely costly and frequent in multi-petabyte installations. A second trend is that several new data-centric applications from the life sciences and physics are appearing, and that HPC is widening its realm to include these applications from the field of big data and data analytics. This new role of storage is reinforced by a widening performance gap between processors and traditional hard disks.
This talk will cover several techniques for using existing HPC storage more efficiently and for including big data solutions within an HPC center, as well as approaches to integrating new storage technologies, such as flash and phase-change memory, into the HPC stack.
Past years:
History 2012
History 2011
History 2010
History 2009
History 2008
History 1998 - 2007