Extension of a Datalog Reasoner with Top-down Evaluation

Type:

Bachelor

Student name:

Christop Fuchs

Download:

Attachment	Size
christoph-fuchs-thesis-final-09-2008.pdf	1.54 MB

Datalog [1] is a simple yet universal knowledge representation formalism. Its main characteristic is that it is a rule-based formalism, where one can express simple if-then rules to describe derivations of knowledge. Further, datalog underlies many industrial data management systems, i.e. it is the core (and extends) the well-known relational datamodel and SQL as the respective industrial query language.
Models in datalog are expressed as so-called programs, i.e. a set of simple rules. These rules capture schematic knowledge about the models. An example is the following rules describing reachability in a graph.
(1) reachable(?x,?y) <= edge(?x, ?y)
(2) reachable(?x,?y) <= reachable (?x, ?z) and edge(?z, ?y)

Further, a datalog program contains so-called facts, which is assertional knowledge or a database respectively.
In our graph example, this could be the edges of a specific graph:
edge(bregenz, innsbruck)
edge(innsbruck, salzburg)
edge(innsbruck, brenner)
edge(innsbruck, muenchen)
edge(salzburg, linz)
edge(linz, steyr)
edge(linz, wien)
edge(muenchen, wien)
edge(wien, muenchen)
edge(muenchen, augsburg)
edge(augsburg, ulm)
located-in(muenchen, germany)
located-in(augsburg, germany)
located-in(ulm, germany)
located-in(innsbruck, austria)
...
The fundamental task to be solved by a datalog engine is query answering, which is database-style queries. An example is to ask what cities in Germany can be reached from Innsbruck in our specific model (i.e. the graph described above):
?- reachable(innsbruck, ?destination) and located-in(?destination, germany)
which would result in a range of answers (bindings for the variable ?destiantion in the query): muenchen, augsburg, ulm ...
There are various algorithms how to compute the result for a given query. All these algorithms have their own strengths and weaknesses. One specific class
of evaluation algorithms are so-called top down evaluation strategies [1,2,3].
We are currently developing a datalog engine called IRIS [4] which implements a couple of standard evaluation strategies on top of common datastructures. So far, the implementation of IRIS supports only bottom-up evaluation strategies. The aim of this project is to implement top-down evaluation strategies within the IRIS system and to evaluate the method wrt. to other algorithms.

Implementation will be done in Java.

The project provides some insight in some interesting topic in the intersection of knowledge representation and databases, allows you to learn about a specific elegant evaluation strategy for queries over datalog programs and to learn how to integrate it in an existing system. The focus of the project is design and implementation of a component providing a specific functionality, as well as to develop sufficient understanding of the theory and techniques for top-down evaluation strategies. Finally, one might want to think about optimizations of the implementation.

References
[1] J.D. Ullman. Principles of Database and Knowledge-Base Systems. Volume 1. Computer Science Press, 1995.
[2] A.B. Cremers, U. Griefahn, R. Hinze: Deduktive Datenbanken, Vieweg, 1994.
[3] S. Ceri, G. Gottlob, L. Tanca: Logic Programming and Databases, Springer-Verlag, 1990.
[4] IRIS Reasoner:
Project Website