Apr 04, 2019; Colloquium
ZIH Colloquium: “High performance computing using low performance computers: How to perform advanced pattern detection in π using minimal resources”
For the past two decades, the big data concept has been invading our lives. A huge variety of electronic devices produces enormous amounts of data every day. A complex and diverse universe of mobile, stationary, wearable and other kinds of devices and sensors floods our computers with vast datasets, defined by the three, five or seven Vs of big data. However, another parallel, quiet, very beautiful and infinite universe is waiting to surprise us with its simplicity and potential in big data science: mathematics. Number theory provides us with the biggest dataset we could ever imagine having access to. With 31.4 trillion computed decimal digits, or 28.57 TB (Emma Haruka Iwao, 2019), π is probably the biggest, most intrinsic and most fascinating single piece of information we know so far, and it keeps getting bigger. Remove, alter or insert just one digit and we have something completely different. Either we can analyze π as it is, or we cannot. In this lecture we will examine how π, and irrational numbers in general, can be used to define the problem of detecting all repeated patterns that exist in a string. Novel, problem-specific pattern detection data structures and algorithms will be presented, which allow us to obtain remarkable results very fast using limited resources. We will examine how methodologies for big data analytics, built on top of these data structures and algorithms, can be used on supercomputers to deliver near-instant results by maximizing parallelization. Furthermore, we will investigate the importance of these methodologies and how the detection of all repeated patterns can form the basis for solving problems in many other artificial intelligence and data analytics fields, such as text mining, detection of frequent and non-frequent sequential itemsets, network security, anomaly detection, bioinformatics, forecasting, recommendation systems, image analysis and many more.
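To make the problem statement concrete, the following is a minimal Python sketch of what "detecting all repeated patterns in a string" means. It is a naive baseline that counts every substring, not the data structures or algorithms presented in the lecture, and it only scales to short inputs. The function name `repeated_patterns` and its parameters are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def repeated_patterns(text, min_len=1, max_len=None):
    """Return {pattern: [start positions]} for every substring of `text`
    that occurs at least twice (occurrences may overlap).

    Naive illustration only: time and memory grow roughly quadratically
    with len(text), so this is unsuitable for large inputs.
    """
    n = len(text)
    if max_len is None:
        max_len = n - 1
    found = {}
    for length in range(min_len, max_len + 1):
        # Group the start positions of all substrings of this length.
        positions = defaultdict(list)
        for start in range(n - length + 1):
            positions[text[start:start + length]].append(start)
        repeats = {p: pos for p, pos in positions.items() if len(pos) >= 2}
        if not repeats:
            # If nothing of this length repeats, nothing longer can repeat,
            # because every repeated pattern has a repeated prefix.
            break
        found.update(repeats)
    return found


if __name__ == "__main__":
    # First 50 digits of pi (integer part followed by 49 decimals).
    digits = "31415926535897932384626433832795028841971693993751"
    for pattern, starts in sorted(repeated_patterns(digits, min_len=2).items()):
        print(pattern, starts)
```

Running the script lists every digit sequence of length two or more that appears at least twice in the sample, together with its start positions. The early exit relies on the fact that every repeated pattern has a repeated prefix, which is the kind of structural property that more compact index structures exploit.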
Konstantinos holds a PhD in computer science from the University of Calgary, AB, Canada. He has professional experience in software engineering and financial analysis and currently holds a position as a big data scientist in the powertrain industry. He specializes in data structures and algorithms for detecting single, multiple and all repeated patterns in single, multivariate or multi-dimensional sequences, with simultaneous optimization of space and time complexity. He has several publications with applications in many diverse fields, such as data/text mining, time series analysis, web/network analytics and security, sequential itemsets detection and many more.