Research Topics
[BA] Implementation and Analysis of a Software Framework for the Evaluation of Machine Learning required Data Transposition Technology Evaluation
With regard to the emerging technologies of Machine Learning (ML) and the challenge of profiting from heterogeneous and varying Big Data, it is essential to define and develop efficient interfaces between techniques or platforms to embed them. In particular, the prerequisites that ML techniques place on data w.r.t. structure, dimensionality, scaling, format heterogeneity <6> and value domains require expensive preparation steps and impede both the exploration of the technologies' suitability for specific use cases and the efficiency of the entire algorithm pipeline. Several techniques, such as Principal Component Analysis <5, 1>, Multi-Dimensional Scaling <2> or the t-SNE algorithm <3>, and corresponding frameworks exist to treat these problems. However, systematic reviews and investigations that compare their performance w.r.t. transposition quality and run-time scaling are lacking. To make progress in this scientific area, this thesis should investigate how an automated approach to test a set of these data transformation techniques can be designed, implemented and tested. As no large-scale data sets are available, a generic approach should be developed <4>.

<1> Rasmus Bro and Age K. Smilde. Principal component analysis. Anal. Methods, 6:2812-2831, 2014.
<2> Stephen L. France and J. Douglas Carroll. Two-way multidimensional scaling: A review. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 41(5):644-661, 2011.
<3> Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579-2605, 2008.
<4> Karsten Wendt. Multi-Objective Optimization Utilizing Cluster Analysis Applied to Dimensional Transposed Problems. PhD thesis, Technische Universität Dresden, 2016.
<5> Svante Wold, Kim Esbensen, and Paul Geladi. Principal component analysis. Chemometrics and Intelligent Laboratory Systems, 2(1-3):37-52, 1987.
<6> Lina Zhou, Shimei Pan, Jianwu Wang, and Athanasios V. Vasilakos. Machine learning on big data: Opportunities and challenges. Neurocomputing, 237:350-361, 2017.
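
As a rough illustration of what an automated test of such transformation techniques might look like, the following minimal sketch times one technique on synthetic data sets of growing size and records a simple quality score. Everything in it is an assumption made for illustration and not part of the topic description: the function names (make_synthetic, pca_transform, retained_variance, benchmark), the use of retained variance as a stand-in for "transposition quality", and the choice of a plain NumPy SVD-based PCA as the technique under test. MDS or t-SNE could be plugged in through the same transform interface.

```python
import time
import numpy as np


def make_synthetic(n_samples, n_features, latent_dim, noise=0.01, seed=0):
    """Generate data lying near a latent_dim-dimensional linear subspace.

    Hypothetical generator: stands in for the generic data approach,
    since no large-scale data sets are assumed to be available.
    """
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, latent_dim))
    mixing = rng.normal(size=(latent_dim, n_features))
    return latent @ mixing + noise * rng.normal(size=(n_samples, n_features))


def pca_transform(X, n_components):
    """Project X onto its first n_components principal components via SVD."""
    mean = X.mean(axis=0)
    Xc = X - mean  # centre the data before decomposing
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:n_components]
    return Xc @ components.T, components, mean


def retained_variance(X, Z):
    """Fraction of the total variance of X kept in the projection Z.

    Assumed quality measure; the topic text does not define one.
    """
    return Z.var(axis=0).sum() / X.var(axis=0).sum()


def benchmark(transform, sizes, n_features=50, latent_dim=3):
    """Run transform on data sets of growing size; record time and quality."""
    results = []
    for n in sizes:
        X = make_synthetic(n, n_features, latent_dim)
        start = time.perf_counter()
        Z, _, _ = transform(X, latent_dim)
        elapsed = time.perf_counter() - start
        results.append({"n_samples": n,
                        "seconds": elapsed,
                        "quality": retained_variance(X, Z)})
    return results


if __name__ == "__main__":
    for row in benchmark(pca_transform, sizes=[100, 1000, 10000]):
        print(row)
```

Passing the technique as a function keeps the harness independent of any single method, so run-time scaling and quality can be compared across techniques on identical inputs. A real study would likely need quality measures that also suit non-linear methods.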
Supervisor: Karsten Wendt