# Komplexpraktikum Theoretische Informatik

### Beschreibung

Im Master-Praktikum (Modul *INF-MA-PR*) sowie im Profilprojekt (Modul *INF-PM-FPG*) wird eine wissenschaftliche Aufgabenstellung unter Anleitung eines Mitarbeiters des Lehrstuhls für Automatentheorie bearbeitet. Das Ziel ist es, mit den Grundlagen des wissenschaftlichen Arbeitens vertraut zu werden. Dazu gehören, je nach Thema, die Literaturrecherche, Erarbeitung eigenständiger Forschungsergebnisse, sowie Implementierungsarbeiten. Falls das Thema eine Implementierung beinhaltet, soll diese gut strukturiert und dokumentiert werden. Zum erfolgreichen Abschluss des Praktikums gehört eine schriftliche *Dokumentation* (ca. 15 Seiten) sowie eine mündliche *Präsentation* (30 Minuten) vor Mitarbeitern des Lehrstuhls.

### SWS/Module

Das Praktikum wird für folgende Module angeboten:

*INF-PM-FPG*(Profilprojekt Grundlagenforschung) in den Studiengängen Master Informatik und Diplom Informatik

Im Rahmen dieses Moduls umfasst das Praktikum -/-/8 SWS.

Koordination: Prof. Franz Baader*INF-MA-PR*(Forschung und Entwicklung in der Informatik) im Studiengang Master Informatik

In diesem Fall kann die Bearbeitung des Praktikums im Umfang von -/-/4 SWS oder -/-/8 SWS erfolgen. Es muss im Vorfeld mit dem Betreuer abgesprochen werden, welche Option gewählt wird.

Koordination: Dr. Stefan Borgwardt

### Themen

Die folgenden Themen werden am **18.04.2018** um **11:10** im Raum **APB/3027** genauer vorgestellt (auf Englisch).

It is well-known that regular languages are recognised by finite automata and are closed under union, intersection, complement, and also language operations, such as concatenation and quotient. However, some operations, like complement or intersection of linearly many automata, require exponentially larger automata.

Tree-like automata were considered in [1] in order to represent languages of the form \(K\cup L \Sigma^*\) for finite languages \(K,L\) and they demonstrate better closure properties. The aim of this project is to develop the theory of tree-like automata: after formally introduce them, prove their closure properties (or, rather, of the languages they accept) providing suitable constructions, and compute the complexity of such constructions. Furthermore, the study can consider extensions and limitations of tree-like automata.

This project requires basic knowledge of automata theory.

*Literature:*

- F Baader, R Küsters, A Borgida, DL McGuinness; Matching in description logics, Journal of Logic and Computation, Volume 9, Issue 3, 1 June 1999, Pages 411–447, https://doi.org/10.1093/logcom/9.3.411

*Tutor:* Pavlos Marantidis

For lightweight description logics such as \(\mathcal{EL}\), which is often used to formulate medical ontologies, subsumption algorithms additionally compute a so-called canonical model, which satisfies exactly those subsumption relationships that follow from the ontology. Such a model can be viewed as a graph, in which nodes and edges are labeled by concepts and relations. The goal of this project is to visualize properties of this model. The student should 1) transform the internal representation of the canonical model from \(\mathcal{EL}\)-reasoners (e.g., ELK https://www.cs.ox.ac.uk/isg/tools/ELK/) into a form suitable for visualization; 2) display the canonical model, even for large ontologies with more than 300,000 concepts; 3) provide functions to visualize properties of the model (e.g., color all elements that belong to concept \(C\)).

This project will be co-supervised by the Chair of Multimedia-Technology.

*Tutor*: Stefan Borgwardt

An important operation for ontologies is to compute the subsumption hierarchy, i.e., all subconcept-superconcept relationships between the concepts defined by the ontology. The goal of this project is to develop suitable visualisation approaches for this hierarchy. The student should 1) transform the internal representation of the subsumption hierarchy (http://owlcs.github.io/owlapi/) into a form suitable for visualization; 2) display the subsumption graph, even for large ontologies with more than 300,000 concepts; 3) provide functions to zoom into the neighborhood of a concept and to mark all sub- or superconcepts.

This project will be co-supervised by the Chair of Multimedia-Technology.

*Tutor*: Stefan Borgwardt

Ontologies are often used for integrated querying of heterogeneous data sources, such as classical databases or RDF triple stores. Query rewriting is a key technique for efficient query answering. The aim of the project is to adapt an existing query rewriting algorithm for Datalog+/- ontologies [1] to the case of so-called multi-linear Datalog+/- rules, prove its correctness, and consider the impact of standard optimization techniques.

The project requires basic knowledge of first-order logic and proof techniques.

*Literature*:

- Georg Gottlob, Giorgio Orsi, and Andreas Pieris. Query rewriting and optimization for ontological databases. ACM Transactions on Database Systems, 39(3), pp. 25:1–25:46, 2014.

*Tutor*: Stefan Borgwardt

Ontologies formulated in description logics are used to define terminological knowledge in a range of areas such as medicine, biology, the semantic web, and others. Modern ontologies often cover large vocabularies, such as the SNOMED CT ontology used in many national healthcare services, which defines over 300,000 concepts.

Maintaining and understanding these ontologies is hindered by their complexity, especially since a lot of information is implicit. A technique to help with this is uniform interpolation [2]. Given a vocabulary of concepts and relations, uniform interpolation computes a restricted view of the ontology, that uses only terms from this vocabulary. As such, it allows to get insights about relations between concepts of interest, which helps in understanding and debugging of complex ontologies.

While there are implemented systems for uniform interpolation, they are currently not integrated into common ontology editors. The goal of this project is to solve this issue, by developing a plugin for the popular ontology editor Protégé, using the uniform interpolation library LETHE [1]. The plugin will allow ontology users and developers to select a vocabulary, and then show a view of the uniform interpolant, which has to be represented in a convenient way.

The project requires experience in Java programming as well as some foundational knowledge on description logics.

*Literature*:

- Patrick Koopmann, Renate A. Schmidt. LETHE: Saturation-based reasoning for non-standard reasoning tasks. In Proceedings of the 4th OWL Reasoner Evaluation Workshop (ORE), pp. 23–30, 2015.
- Patrick Koopmann, Renate A. Schmidt: Forgetting concept and role symbols in ALCH-ontologies. In Proceedings of the 19th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (LPAR), pp. 552–567, 2013.

*Tutor*: Patrick Koopmann

The description logic \(\mathcal{EL}\) is used to express biomedical knowledge in many important Semantic Web ontologies, such as SNOMED CT or the Gene Ontology. Standard and non-standard reasoning tasks can help with building and maintaining these ontologies. For example, computing the difference between two concepts is useful in many ways: it can explain the concept hierarchy of the ontology ("What makes A a subconcept of B?") or refine answers to relaxed queries ("Under which assumptions do the instances of B become instances of A?"). However, existing definitions of concept difference operators [1] do not adequately deal with the structure of \(\mathcal{EL}\) concepts.

The goal of this project is to investigate a new definition for computing the concept difference, and determine its properties, such as existence, uniqueness, and computational complexity.

This project requires basic knowledge of logic-based knowledge representation.

*Literature*:

- Gunnar Teege. Making the difference: A subtraction operation for description logics. In Proceedings of the 4th International Conference on Principles of Knowledge Representation and Reasoning (KR), pp. 540–550, 1994.

*Tutor*: Oliver Fernández Gil

The light-weight description logic \(\mathcal{EL}\) has gained serious attraction for knowledge representation and reasoning, mainly due to its tractable reasoning problems and its easy readability. For instance, the Web Ontology Language (OWL) includes the profile "EL", which is based on \(\mathcal{EL}\). In this logic, the common reasoning problems still have polynomial time complexity, e.g., the problem of subsumption between two concept descriptions with respect to a TBox.

In the past, several non-standard reasoning tasks in \(\mathcal{EL}\) have been defined and their complexities have been investigated. The goal of this project is to consider a new non-standard reasoning task, namely that of the neighborhood of \(\mathcal{EL}\) concept descriptions. In particular, we say that an \(\mathcal{EL}\) concept description \(C\) is a lower neighbor of an \(\mathcal{EL}\) concept description \(D\) with respect to an \(\mathcal{EL}\) TBox \(\mathcal{T}\) if \(C\) is strictly subsumed by \(D\) w.r.t. \(\mathcal{T}\) and furthermore "there is no \(\mathcal{EL}\) concept description in between", i.e., each \(\mathcal{EL}\) concept description \(E\) with \(\mathcal{T}\models C\sqsubseteq E\sqsubseteq D\) is either equivalent to \(C\) or equivalent to \(D\). This topic can be investigated in various directions:

- \(\mathcal{T}\) can be empty.
- Determining the complexity of deciding neighborhood.
- Design and implementation of an efficient procedure for deciding neighborhood.
- Design and implementation of an efficient procedure for computing all upper/lower neighbors.
- More expressive description logics, e.g., \(\mathcal{ALC}\), or \(\mathcal{EL}\) with greatest fixed-point semantics.
- etc.

This project requires basic knowledge about description logics.

*Literature*:

- Franz Baader, Ian Horrocks, Carsten Lutz, Ulrike Sattler. An Introduction to Description Logic. Cambridge University Press, 2017.
- Francesco M. Donini, Simona Colucci, Tommaso Di Noia, Eugenio Di Sciascio: A Tableaux-Based Method for Computing Least Common Subsumers for Expressive Description Logics. In Proceedings of the 21st International Joint Conferences on Artificial Intelligence (IJCAI), pp. 739-745, 2009.

*Tutor*: Francesco Kriegel

Formal Concept Analysis (FCA) was developed as an attempt to formalize the philosophical notion of a “concept”. In particular, it has a strong correspondence to propositional logic, and provides methods for handling implicational knowledge. As a famous example, the canonical base of a given formal context \(\mathbb{K}\) is a minimal implication set which is sound and complete for the implicational knowledge of \(\mathbb{K}\). For the entailment relation between implication sets, both a semantic and several equivalent syntactic characterizations exist.

Recently in [2], a probabilistic extension of FCA was investigated, and a technique for the axiomatization of implications over probabilistic attributes was introduced. However, the implicational entailment was only described semantically, and no sound and complete syntactic deduction method exists. The goal of this project is to find such a syntactic characterization, and to provide an appropriate implementation. Since the logic used in [2] can be embedded in the logic from [1], it may be helpful to consider [1] for this task.

*Literature*:

- Ronald Fagin, Joseph Y. Halpern, Nimrod Megiddo. A logic for reasoning about probabilities. Information and Computation 87(1/2), pp. 78–128, 1990.
- Francesco Kriegel. Implications over probabilistic attributes. In Proceedings of the 14th International Conference on Formal Concept Analysis (ICFCA), pp. 168–183, 2017.

*Tutor*: Francesco Kriegel

The light-weight description logic \(\mathcal{EL}\) has gained serious attraction for knowledge representation and reasoning, mainly due to its tractable reasoning problems and its easy readability. For instance, the Web Ontology Language (OWL) includes the profile "EL", which is based on \(\mathcal{EL}\). In this logic, the common reasoning problems still have polynomial time complexity, e.g., the problem of subsumption between two concept descriptions with respect to a TBox.

Knowledge bases can not only be built manually, but there are also means for an automatic construction. In particular, in [1] it is described how the theory of concept inclusions valid in a given interpretation can be automatically axiomatized in a sound and complete manner. A weakness of the approach is that no existing knowledge in form of a TBox can be incorporated in the construction. Later, [2] provided an appropriate extension of the axiomatization procedure which took both an interpretation and a TBox as input. This construction makes heavy use of so-called model-based most specific concept descriptions relative to a TBox. So far, it was not investigated how to check their existence, and how to compute them effectively; both questions shall be addressed in this project. Ideas from [3] are helpful for answering these questions.

This project requires basic knowledge about description logics.

*Literature*:

- Pages 60–68 in: Felix Distel. Learning description logic knowledge bases from data using methods from formal concept analysis. PhD thesis, Technische Universität Dresden, 2011.
- Francesco Kriegel. Incremental learning of TBoxes from interpretation sequences with methods of Formal Concept Analysis. In Proceedings of the 28th International Workshop on Description Logics (DL), pp. 452–464, 2015.
- Benjamin Zarrieß, Anni-Yasmin Turhan. Most specific generalizations w.r.t. general \(\mathcal{EL}\)-TBoxes. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI), pp. 1191–1197, 2013.

*Tutor*: Francesco Kriegel

Formal Concept Analysis (FCA) was developed as an attempt to formalize the philosophical notion of a “concept”. In particular, it has a strong correspondence to propositional logic, and provides methods for handling implicational knowledge. As a famous example, the canonical base of a given formal context \(\mathbb{K}\) is a minimal implication set which is sound and complete for the implicational knowledge of \(\mathbb{K}\). If the context describing the domain of interest is not complete, but experts for this domain are available, then a technique called “Attribute Exploration” [1], can be used to construct, in interaction with the experts, a minimal set of implications that describes the considered domain in a sound and complete manner.

A disadvantage of the existing variants of Attribute Exploration is that in each step only one question can be posed to the experts, and the algorithm has to wait for the answer before it can continue; this renders the method impractical for larger domains. In [2], a variation was introduced that allows for parallel interaction with the experts. The goal of this project is to design and implement a platform for collaborative knowledge acquisition, which utilizes the parallel Attribute Exploration at its core. The implementation shall have a client-server architecture, and an appropriate use case shall be sought and tested.

*Literature*:

- Bernhard Ganter. Two basic algorithms in concept analysis. In Proceedings of the 8th International Conference on Formal Concept Analysis (ICFCA), pp. 312–340, 2010.
- Francesco Kriegel. Parallel attribute exploration. In Proceedings of the 22nd International Conference on Conceptual Structures (ICCS), pp. 91–106, 2016.

*Tutor*: Francesco Kriegel

For classical Descritpion Logics (DLs) it is often insufficient to reason with monotone semantics, especially when aspects of human cognition such as belief revision are to be modelled. A recent approach to extend DLs by nonmonotonic semantics is using so-called typicality models [1], which extend classical canonical models for the description logic \(\mathcal{EL}_{\bot}\). The goal of this project is to extend the completion procedure that is used to construct classical canonical models in \(\mathcal{EL}_{\bot}\) [2], in order to allow for defeasible knowledge bases and construct minimal and maximal typicality models.

This project requires basic knowledge about description logics.

*Literature*:

- Maximilian Pensel, Anni-Yasmin Turhan. Including quantification in defeasible reasoning for the description logic \(\mathcal{EL}_{\bot}\). In Proceedings of the 14th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR), pp. 78–84, 2017.
- Franz Baader, Sebastian Brandt, Carsten Lutz. Pushing the \(\mathcal{EL}\) envelope. In Proceedings of the 19th International Joint Conferences on Artificial Intelligence (IJCAI), pp. 364–369, 2005.

*Tutor*: Maximilian Pensel

In case where data is noisy and can contain errors, knowledge bases can easily become inconsistent. In such settings inconsistency-tolerant semantics for reasoning can be employed. If data is collected over a time span, the resulting data is often time-stamped admitting queries that refer to different points in time over a temporal knowledge base.

In this project the temporal instance queries (i.e. queries that retrieve instances of a concept) over a temporal knowledge base shall be investigated in combination with inconsistency-tolerant semantics. More precisely, instance queries combined with LTL (linear temporal logic) operators shall be investigated in regard of computational complexity and/or an algorithm for one of the simpler settings shall be devised and shown to be correct (and perhaps even implemented).

This project requires basic knowledge about description logics.

*Literature*:

- Rafael Penaloza. Inconsistency-Tolerant Instance Checking in Tractable Description Logics. International Joint Conference on Rules and Reasoning, 2017.
- Camille Bourgaux and Anni-Yasmin Turhan: Temporal Query Answering in DL-Lite over Inconsistent Data. In Proceedings of the 16th International Semantic Web Conference (ISWC 2017), LNCS 2017.

*Tutor*: Anni-Yasmin Turhan

*Eigene Themenvorschläge* sind auch sehr willkommen.

### Organisation

Interessierte Studierende nehmen bitte bis zum *31.04.2018* Kontakt zu Prof. Franz Baader, Dr. Stefan Borgwardt, oder direkt zum jeweiligen Betreuer auf (siehe Themenliste).

Falls das Praktikum bis zum Ende des Semesters (*30.09.2018*) nicht abgeschlossen ist, kann die Leistung nicht anerkannt werden. Es liegt in der Verantwortung des Studierenden, die Arbeit rechtzeitig zu beginnen, und regelmäßige Treffen mit dem Betreuer zu vereinbaren.