257 Pages
English

Xcerpt: a rule-based query and transformation language for the web [Elektronische Ressource] / von Sebastian Schaffert

-

Gain access to the library to view online
Learn more

Description

Xcerpt: A Rule-Based Query and Transformation Languagefor the WebDissertationzur Erlangung des akademischen Grades desDoktors der Naturwissenschaften¨ ¨an der Fakultat fur Mathematik, Informatik und Statistik¨ ¨der Ludwig-Maximilians-Universitat MunchenvonSebastian SchaffertOktober 2004Erstgutachter: Prof. Dr. Franc¸ois Bry (Universita¨t Mu¨nchen)Zweitgutachter: Prof. Dr. Georg Gottlob (TU Wien)Tag der mu¨ndlichen Pru¨fung: 13. Dezember 2004AbstractThis thesis investigates querying the Web and the Semantic Web. It proposes a new rule-based query language called Xcerpt. Xcerpt differs from other query languages in that it usespatterns instead of paths for the selection of data, and in that it supports both rule chainingand recursion. Rule chaining serves for structuring large queries, as well as for designingcomplex query programs (e.g. involving queries to the Semantic Web), and for modellinginference rules. Query patterns may contain special constructs like partial subqueries, optionalsubqueries, or negated subqueries that account for the particularly flexible structure of data onthe Web.Furthermore, this thesis introduces the syntax of the language Xcerpt, which is illustratedon a large collection of use cases both from the conventional Web and the Semantic Web.In addition, a declarative semantics in form of a Tarski-style model theory is described, andan algorithm is proposed that performs a backward chaining evaluation of Xcerpt programs.

Subjects

Informations

Published by
Published 01 January 2004
Reads 27
Language English
Document size 2 MB

Xcerpt: A Rule-Based Query and Transformation Language
for the Web
Dissertation
zur Erlangung des akademischen Grades des
Doktors der Naturwissenschaften
¨ ¨an der Fakultat fur Mathematik, Informatik und Statistik
¨ ¨der Ludwig-Maximilians-Universitat Munchen
von
Sebastian Schaffert
Oktober 2004Erstgutachter: Prof. Dr. Franc¸ois Bry (Universita¨t Mu¨nchen)
Zweitgutachter: Prof. Dr. Georg Gottlob (TU Wien)
Tag der mu¨ndlichen Pru¨fung: 13. Dezember 2004Abstract
This thesis investigates querying the Web and the Semantic Web. It proposes a new rule-
based query language called Xcerpt. Xcerpt differs from other query languages in that it uses
patterns instead of paths for the selection of data, and in that it supports both rule chaining
and recursion. Rule chaining serves for structuring large queries, as well as for designing
complex query programs (e.g. involving queries to the Semantic Web), and for modelling
inference rules. Query patterns may contain special constructs like partial subqueries, optional
subqueries, or negated subqueries that account for the particularly flexible structure of data on
the Web.
Furthermore, this thesis introduces the syntax of the language Xcerpt, which is illustrated
on a large collection of use cases both from the conventional Web and the Semantic Web.
In addition, a declarative semantics in form of a Tarski-style model theory is described, and
an algorithm is proposed that performs a backward chaining evaluation of Xcerpt programs.
This algorithm has also been implemented (partly) in a prototypical runtime system. A salient
aspect of this algorithm is the specification of a non-standard unification algorithm called
simulation unification that supports the new query constructs described above. This unification
is symmetric in the sense that variables in both terms can be bound. On the other hand it is in
contrast to standard unification assymmetric in the sense that the unification determines that
the one term is a subterm of the other term.
Zusammenfassung
Diese Arbeit untersucht das Anfragen des Webs und des Semantischen Webs. Sie stellt
eine neue regel-basierte Anfragesprache namens Xcerpt vor. Xcerpt unterscheidet sich von an-
deren Anfragesprachen insofern, als dass es zur Selektion von Daten sog. Pattern (,,Muster”)
verwendet und sowohl Regelschliessen als auch Rekursion unterstu¨tzt, was sowohl zur Struk-
turierung gro¨ßerer Anfragen als auch zur Erstellung komplexer Anfrageprogramme, und zur
Modellierung von Inferenzregeln dient. Anfrage-Pattern ko¨nnen spezielle Konstrukte, wie
partielle Teilanfragen, optionale Teilanfragen, oder negierte Teilanfragen, enthalten, die der
besonders flexiblen Struktur von Daten im Web genu¨gen.
In dieser Arbeit wird weiterhin die Syntax von Xcerpt eingefu¨hrt, und mit Hilfe mehrerer
Anwendungsszenarien sowohl aus dem konventionellen als auch aus dem semantischen Web
erla¨utert. Ausserdem wird eine deklarative Semantik im Stil von Tarski’s Modelltheorie
beschrieben und ein Algorithmus vorgeschlagen, der eine ru¨ckwa¨rtsschliessende Auswer-
tung von Xcerpt durchfu¨hrt und in einem prototypischen Laufzeitsystem implementiert wurde.
Wesentlicher Bestandteil des Ru¨ckwa¨rtsschliessens ist die Spezifikation eines nicht-standard
Unifikations-Algorithmus, der die oben genannten speziellen Xcerpt-Konstrukte beru¨ck-
sichtigt. Diese Unifikation ist symmetrisch in dem Sinne, dass sie Variablen in beiden
angeglichenen (,,unifizierten”) Termen binden kann. Andererseits ist sie im Gegensatz zur
Standardunifikation assymmetrisch in dem Sinne, dass der dadurch geleistete Angleich den
einen Term als ,,Teilterm” des anderen erkennt.
Sebastian Schaffert III“At times our own light goes out and is rekindled by a spark from another person. Each of us
has cause to think with deep gratitude of those who have lighted the flame within us.”
— Albert Schweitzer
Acknowledgements
The language Xcerpt presented in this thesis would not exist today without the continuous support and
contribution of many fellow researchers and students at the University of Munich, at the University of
Linko¨ping (Sweden), and elsewhere. Of those, particular gratitude goes to the following colleagues:
• Franc¸ois Bry, University of Munich, with whom I had many – sometimes heated but always fruitful
– discussions on almost all issues concerning Xcerpt.
• Włodzimierz Drabent, Polish Academy of Sciences, Warszawa, for discussing and proof-reading,
and hinting to some important mistakes.
• Norbert Eisinger, University of Munich, for spending much time with me discussing syntax and
semantics of the language, and for being (almost) always there when I needed him.
• Georg Gottlob, Technical University of Vienna, for being willing to take the burden of being second
supervisor of this thesis
• Jan Małuszyn´ski, University of Linko¨ping, for giving me the chance to present my work and for
many interesting discussions and ideas.
Also, I thank Tim Furche (University of Munich), Paula-Lavinia Pa˘traˆnjan (University of Munich), and
Artur Wilk (University of Linko¨ping) for working with me on several aspects of Xcerpt.
Furthermore, several graduate students, who worked on diploma and project theses during the develop-
ment of Xcerpt, deserve to be mentioned separately:
• Sacha Berger, who is now a fellow researcher, developed in his diploma thesis the visual language
visXcerpt which builds upon Xcerpt and has received a lot of attention in the research community. A
short introduction into this work can be found near the end of this thesis.
• Oliver Bolzer currently investigates Semantic Web querying with Xcerpt as part of his diploma thesis
and contributed many improvements to the source code of the prototypical runtime system presented
in this thesis.
• Sebastian Kraus worked extensively with the language Xcerpt during his diploma thesis, in which
he developed a comprehensive set of use cases for Xcerpt, some of which also appear in this thesis.
His work had much influence on the development of appropriate language constructs for Xcerpt.
• Andreas Schroeder developed in his project thesis several different approaches to backward chain-
ing in Xcerpt. Discussions and work with him ultimately lead to the operational semantics as it is
presented in this thesis.
. . . and last but not least: most gratitude goes to my family for supporting me patiently during the course
of this thesis.
This research has been partly funded by the European Commission and by the Swiss Federal Office
for Education and Science within the 6th Framework Programme project REWERSE number 506779 (cf.
http://rewerse.net).
Traunstein and Munich, October 2004
IV Sebastian SchaffertCONTENTS
I Introduction and Motivation 1
1 Introduction 3
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Outline of this Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Design Principles of Xcerpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Referential Transparency and Answer Closedness . . . . . . . . . . . . . . . . . . 5
1.3.2 Answers as Arbitrary XML Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.3 Pattern-Based Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.4 Incomplete Specification of Query Patterns . . . . . . . . . . . . . . . . . . . . . 9
1.3.5 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.6 Forward and Backward Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3.7 Separation of Querying and Construction . . . . . . . . . . . . . . . . . . . . . . 12
1.3.8 Reasoning Capabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2 Data Representation on the Web 15
2.1 Semistructured Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.1 Traditional Database Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.2 Semistructured Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.3 Other Languages for Representing Semistructured Data . . . . . . . . . . . . . . . 18
2.2 XML – the Extensible Markup Language . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.1 Markup Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.2 A Generic Markup Language for the Web . . . . . . . . . . . . . . . . . . . . . . 21
2.2.3 Anatomy of an XML Document . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.4 XML Schema Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2.5 XML References: ID/IDREF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2.6 XML Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.3 XML, Semistructured Expressions and Semistructured Data . . . . . . . . . . . . . . . . 31
2.4 Three Scenarios for Querying Semistructured Data . . . . . . . . . . . . . . . . . . . . . 32
2.4.1 Student Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4.2 Bookstore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.4.3 Document-Centric: PhD Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.5 Graph Representation of Semistructured Data . . . . . . . . . . . . . . . . . . . . . . . . 38
2.6 Rooted Graph Simulation – A Similarity Relation for Rooted Graphs . . . . . . . . . . . . 39
3 Web Query Languages 43
3.1 Database vs. Web Query Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Desirable Characteristics of Web Query Languages . . . . . . . . . . . . . . . . . . . . . 44
3.3 Existing Web Query Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3.1 XPath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3.2 XSL/XSLT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
VCONTENTS
3.3.3 XQuery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.3.4 Survey over other Web Query Languages . . . . . . . . . . . . . . . . . . . . . . 56
II The Language Xcerpt 61
4 Xcerpt 63
4.1 Two Syntaxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2 Data Terms: An Abstraction for Data on the Web . . . . . . . . . . . . . . . . . . . . . . 64
4.2.1 Term Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.2.2 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.2.3 Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.2.4 Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 Query Terms: Patterns for Selecting Data . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.3.1 Incompleteness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.3.2 Term Variables, Label/Namespace Variables and the→-Construct . . . . . . . . . 71
4.3.3 Position Specification and Positional Variables . . . . . . . . . . . . . . . . . . . 73
4.3.4 Subterm Negation: without . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.3.5 Regular Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.4 Query Evaluation: Ground Query Term Simulation . . . . . . . . . . . . . . . . . . . . . 78
4.4.1 Ground Query Terms and Ground Query Term Graphs . . . . . . . . . . . . . . . 79
4.4.2 Term Sequences and Successors . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.4.3 Ground Query Term Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.4.4 Simulation Order and Simulation Equivalence . . . . . . . . . . . . . . . . . . . . 87
4.5 Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.5.1 Resource Declarations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.5.2 Conjunctions and Disjunctions of Queries . . . . . . . . . . . . . . . . . . . . . . 88
4.5.3 Query Negation: not . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.5.4 Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.6 Construct Terms: Patterns for Constructing Data . . . . . . . . . . . . . . . . . . . . . . . 92
4.6.1 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.6.2 Grouping and Sorting: all and some . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.6.3 Functions and Aggregations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.6.4 Optional Subterms: optional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.7 Construct-Query Rules (or Views) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.7.1 Rule Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.7.2 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5 Xcerpt Use Cases 107
5.1 Restructuring Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.1.1 List of Authors vs. List of Titles . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.1.2 Resolving ID/IDREF references . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.1.3 Completing an HTML table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.1.4 List of Students . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.1.5 Separation of Concerns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.2 Querying the Web . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.2.1 Personal Portal Page: News and Weather . . . . . . . . . . . . . . . . . . . . . . 116
5.2.2 Web Crawler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
5.3 Semantic Web Reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.3.1 Clique of Friends . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.3.2 Ontology Reasoning: The Book Ontology . . . . . . . . . . . . . . . . . . . . . . 126
VI Sebastian SchaffertCONTENTS
6 Range Restrictedness, Standardisation Apart, and Stratification 133
6.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
6.2 Range Restrictedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.2.1 Polarity of Subterms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.2.2 Range Restrictedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
6.3 Standardisation Apart (or Rectification) . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
6.4 Stratification for Grouping Constructs and Negation . . . . . . . . . . . . . . . . . . . . . 137
6.4.1 Grouping Stratification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
6.4.2 Negation Stratification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
6.4.3 Full Stratification: Combining Grouping Stratification and Negation Stratification . 142
7 Declarative Semantics 143
7.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
7.2 Terms as Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
7.2.1 Term Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
7.2.2 Xcerpt Programs as Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
7.3 Substitutions and Substitution Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.3.1 Preliminary Notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.3.2 Application to Query Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
7.3.3 Application to Construct Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7.3.4 Application to Query Term Formulas . . . . . . . . . . . . . . . . . . . . . . . . 150
7.4 Interpretations and Entailment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
7.4.1 Interpretations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.4.2 Satisfaction and Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.5 Fixpoint Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.6 Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8 Operational Semantics 157
8.1 A Simple Constraint Solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.1.1 Data Structures and Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
8.1.2 Solution Set of a Constraint Store . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.1.3 Constraint Simplification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.1.4 Consistency Verification Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.1.5 Constraint Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
8.1.6 Program Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.2 Simulation Unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
8.2.1 Simulation Unifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
8.2.2 Decomposition Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.2.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.2.4 Soundness and Completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
8.3 Backward Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
8.3.1 Dependency Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
8.3.2 Query Unfolding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
8.3.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
8.3.4 Soundness and Completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
III Conclusion 189
9 Perspectives 191
9.1 Advanced Query Constructs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
9.1.1 Advanced Text Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
9.1.2 Duplicate Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
9.1.3 Advanced Filter and Exclusion Mechanisms . . . . . . . . . . . . . . . . . . . . . 192
Sebastian Schaffert VIICONTENTS
9.1.4 Advanced Constraint Solving . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
9.2 Support for Special Theories and Reasoners . . . . . . . . . . . . . . . . . . . . . . . . . 194
9.3 Meta-Programming and Meta-Querying . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
9.3.1 Meta-Programming on the Web . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
9.3.2 Supporting Meta-Programming in Xcerpt . . . . . . . . . . . . . . . . . . . . . . 196
9.4 Distributed and Peer-to-Peer Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
9.4.1 Distributed Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
9.4.2 Peer-to-Peer Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
9.5 Optimised Evaluation and Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 198
9.5.1 Identifying Complexity of Language Parts . . . . . . . . . . . . . . . . . . . . . . 198
9.5.2 Simulation Unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
9.5.3 Rule Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
9.5.4 Constraint Solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
9.5.5 Virtual Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
9.6 Term Formulas as Integrity Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
10 Conclusion 203
IV Appendix 205
A A Prototypical Runtime System 207
A.1 Usage of the Prototype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
A.1.1 Command Line Switches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
A.2 Overall Structure of the Source Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
A.3 Module Xcerpt.Data: Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
A.3.1 Term.hs: Data Structures for Terms . . . . . . . . . . . . . . . . . . . . . . . . . 211
A.3.2 Program.hs: Data Structures for Programs . . . . . . . . . . . . . . . . . . . . . . 213
A.4 Module Xcerpt.IO: Input/Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
A.5 Module Xcerpt.Parser: Parser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
A.5.1 Xcerpt.Parser.Xcerpt: Xcerpt V1 and V2 Parser . . . . . . . . . . . . . . . . . . . 215
A.5.2 Xcerpt.Parser.XML: XML parser . . . . . . . . . . . . . . . . . . . . . . . . . . 216
A.5.3 Xcerpt.Parser.HTML: HTML parser . . . . . . . . . . . . . . . . . . . . . . . . . 216
A.6 Module Xcerpt.Show: Output Formatting . . . . . . . . . . . . . . . . . . . . . . . . . . 217
A.7 Module Xcerpt.EngineNG: Program Evaluation . . . . . . . . . . . . . . . . . . . . . . . 218
A.7.1 Constraint Solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
A.7.2 Unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
A.7.3 Backward Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
A.8 Module Xcerpt.Methods: User-Defined Functions . . . . . . . . . . . . . . . . . . . . . . 225
B Proofs 227
B.1 Proof of Theorem 4.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
B.2 Proof of Theorem 8.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
B.3 Proof of Lemma 8.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
List of Examples 237
Index 239
Bibliography 241
Curriculum Vitae 248
VIII Sebastian SchaffertPart I
Introduction and Motivation
1