Neural-Symbolic Integration [electronic resource] / submitted by Sebastian Bader

English
196 Pages

Published 01 January 2009
Document size 3 MB

Neural-Symbolic Integration

Dissertation
for the academic degree of
Doktor rerum naturalium (Dr. rer. nat.)
submitted to the
Technische Universität Dresden
Faculty of Computer Science
by
Dipl.-Inf. Sebastian Bader
born 22 May 1977 in Rostock
Dresden, October 2009

Reviewers: Prof. Dr. rer. nat. habil. Steffen Hölldobler
(Technische Universität Dresden)
Prof. Dr. rer. nat. habil. Barbara Hammer
(Technische Universität Clausthal)
Defence: 5 October 2009

Neural-Symbolic Integration
Sebastian Bader
Dresden, October 2009

To my family.

Contents
1 Introduction and Motivation
1.1 Motivation for the Study of Neural-Symbolic Integration
1.2 Related Work
1.3 A Classification Scheme for Neural-Symbolic Systems
1.4 Challenge Problems
1.5 Structure of this Thesis
2 Preliminaries
2.1 General Notions and Notations
2.2 Metric Spaces, Contractive Functions and Iterated Function Systems
2.3 Logic Programs
2.4 Binary Decision Diagrams
2.5 Connectionist Systems
3 Embedding Propositional Rules into Connectionist Systems
3.1 Embedding Semantic Operators into Threshold Networks
3.2 Embedding Semantic Operators into Sigmoidal Networks
3.3 Iterating the Computation of the Embedded Operators
3.4 Summary
4 An Application to Part of Speech Tagging
4.1 Part of Speech Tagging
4.2 System Architecture
4.3 Experimental Evaluation
4.4 Summary
5 Connectionist Learning and Propositional Background Knowledge
5.1 Integrating Background Knowledge into the Training Process
5.2 A Simple Classification Task as Case Study
5.3 Evaluation
5.4 Summary
6 Extracting Propositional Rules from Connectionist Systems
6.1 The Rule Extraction Problem for Feed-Forward Networks
6.2 CoOp - A New Decompositional Approach
6.3 Decomposition of Feed-Forward Networks
6.4 Computing Minimal Coalitions and Oppositions
6.5 Composition of Intermediate Results
6.6 Incorporation of Integrity Constraints
6.7 Extraction of Propositional Logic Programs
6.8 Evaluation
6.9 Summary
7 Embedding First-Order Rules into Connectionist Systems
7.1 Feasibility of the Core Method
7.2 Embedding Interpretations into the Real Numbers
7.3 Embedding the Consequence Operator into the Real Numbers
7.4 Approximating the Embedded Operator using Connectionist Systems
7.5 Iterating the Approximation
7.6 Vector-Based Learning on Embedded Interpretations
7.7 Summary
8 Conclusions
8.1 Summary
8.2 Challenge Problems Revised
8.3 Further Work
8.4 Final Remarks

List of Figures
1.1 The Neural-Symbolic Cycle
1.2 The idea behind the Core Method
1.3 The three main dimensions of our classification scheme
1.4 The details of the interrelation dimension
1.5 The details of the language dimension
1.6 The details of the usage dimension
2.1 Operators defined to obtain a more compact source code and some "syntactic sugar" yielding more readable source code
2.2 The definition for the "evaluate-to" operator
2.3 The definition for the "evaluate-to" operator, ctd.
2.4 Implementation to construct an empty BDD and to access the ingredients of a BDD
2.5 Implementation to construct a BDD from two BDDs
2.6 Base cases for the construction of a BDD from two BDDs using disjunction and conjunction
2.7 The construction of a BDD from a given variable and two BDDs using the if-then-else construct
2.8 Implementation to create nodes while constructing a reduced BDD
3.1 Activation functions tanh and sigm together with their threshold counterparts
3.2 Implementation to construct a behaviour-equivalent artificial neural network for a given propositional logic program
3.3 Implementation for the construction of a behaviour-equivalent network with units computing the hyperbolic tangent
4.1 General architecture of a neural-symbolic part-of-speech tagger
4.2 Network architecture for part-of-speech tagging
4.3 Comparison of MSE after training an initialised and a purely randomised network
4.4 The evolution of the error on training and validation set
5.1 The rule-insertion cycle for the integration of symbolic knowledge into the connectionist training process
5.2 A 3-layer fully connected feed-forward network to classify Tic-Tac-Toe boards
5.3 The embedding of a Tic-Tac-Toe rule
5.4 The development of the mean squared error over time for different embedding factors ω
6.1 Implementation to compute the set of perceptrons for a given feed-forward network
6.2 Implementation to construct a search tree to guide the extraction process
6.3 Implementation to construct the pruned search tree
6.4 Implementation to construct the reduced ordered binary decision diagram representing the set of minimal coalitions for a given perceptron P
6.5 Implementation to construct the pruned search trees for coalitions and oppositions
6.6 Implementation to construct the reduced ordered BDD for coalitions and the reduced ordered BDD for oppositions for a given perceptron P
6.7 Implementation to construct the positive form of a given perceptron
6.8 Implementation to construct the negative form of a given perceptron
6.9 The expansion of some node o into a BDD
6.10 The extraction of a reduced ordered BDD representing the coalitions of all output nodes wrt. the input nodes for a given network N
6.11 The full extraction into a single reduced ordered BDD including the incorporation of integrity constraints for all output nodes
6.12 Resulting BDD sizes of the extraction for different max_n-integrity constraints
7.1 Implementation of the embedding function for the 1-dimensional case
7.2 Implementation of the embedding function for multi-dimensional embeddings
7.3 Two sets of embedded interpretations
7.4 Implementation to compute a set of constant pieces which approximate a given T_P-operator up to level N
7.5 Implementation to compute a set of approximating step functions for a given T_P-operator up to level N
7.6 Three sigmoidal approximations of the step function
7.7 Implementation to compute a set of approximating sigmoidal functions for a given T_P-operator up to level N
7.8 Implementation to construct an approximating sigmoidal network for a given T_P-operator up to level N
7.9 Implementation to compute the set of approximating raised cosine functions for a given T_P-operator up to level N
7.10 Implementation to construct an approximating raised cosine network for a given T_P-operator up to level N
7.11 Implementation to compute the set of approximating reference vectors for a given T_P-operator up to level N
7.12 Implementation to construct an approximating vector-based network for a given T_P-operator up to level N
7.13 Approximations based on constant pieces are not necessarily contractive even if the underlying f_P is contractive
7.14 The adaptation of the input weights for a given input i
7.15 Adding a new unit
7.16 Removing an inutile unit
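Chapter 3 and Figure 3.2 concern translating a propositional logic program into a behaviour-equivalent threshold network. As a pointer to the idea, here is a minimal sketch of the classic construction behind the Core Method (Hölldobler and Kalinke): one hidden threshold unit per clause, so that a single forward pass computes the immediate-consequence operator T_P. This is not the thesis's own code; the example program and all names are illustrative.

```python
def step(x, theta):
    """Heaviside threshold unit: fires (1) iff its input reaches theta."""
    return 1 if x >= theta else 0

def tp_network(program, interpretation):
    """One forward pass through the threshold network = one application of T_P.

    program: list of clauses (head, positive_body_atoms, negative_body_atoms).
    interpretation: set of atoms currently taken to be true.
    """
    # Hidden layer: one unit per clause, weight +1 from each positive body
    # atom, -1 from each negative one; threshold = #positives - 0.5, so the
    # unit fires iff all positive literals are true and no negative one is.
    hidden = []
    for head, pos, neg in program:
        net_in = sum(+1 for a in pos if a in interpretation) + \
                 sum(-1 for a in neg if a in interpretation)
        hidden.append((head, step(net_in, len(pos) - 0.5)))
    # Output layer: one unit per head atom, an OR (threshold 0.5) over the
    # hidden units of the clauses with that head.
    heads = {h for h, _, _ in program}
    return {a for a in heads
            if step(sum(fires for h, fires in hidden if h == a), 0.5)}

# Illustrative program:  q.   p <- q, not s.
program = [("q", [], []), ("p", ["q"], ["s"])]
m = set()
while True:                      # iterate T_P to its least fixed point
    nxt = tp_network(program, m)
    if nxt == m:
        break
    m = nxt
print(sorted(m))                 # -> ['p', 'q']
```

Iterating the forward pass from the empty interpretation reaches the least fixed point of T_P, which is the theme of Section 3.3 for the propositional case.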