
Extensible Dependency Grammar: A New Methodology

Ralph Debusmann
Programming Systems Lab, Saarland University
Postfach 15 11 50, 66041 Saarbrücken, Germany
rade@ps.uni-sb.de

Denys Duchier
Équipe Calligramme, LORIA – UMR 7503
Campus Scientifique, B. P. 239, 54506 Vandœuvre lès Nancy CEDEX, France
duchier@loria.fr

Geert-Jan M. Kruijff
Computational Linguistics, Saarland University
Postfach 15 11 50, 66041 Saarbrücken, Germany
gj@coli.uni-sb.de
Abstract

This paper introduces the new grammar formalism of Extensible Dependency Grammar (XDG), and emphasizes the benefits of its methodology of explaining complex phenomena by interaction of simple principles on multiple dimensions of linguistic description. This has the potential to increase modularity with respect to linguistic description and grammar engineering, and to facilitate concurrent processing and the treatment of ambiguity.

1 Introduction

We introduce the new grammar formalism of Extensible Dependency Grammar (XDG). In XDG, complex phenomena arise out of the interaction of simple principles on multiple dimensions of linguistic description. In this paper, we point out how this novel methodology positions XDG in between multi-stratal approaches like LFG and MTT, and mono-stratal ones like HPSG, attempting to combine their benefits and avoid their problems.

It is the division of linguistic analyses into different dimensions which makes XDG multi-stratal. On the other hand, XDG is mono-stratal in that its principles interact to constrain all dimensions simultaneously. XDG combines the benefits of these two positions, and attempts to circumvent their problems.

From multi-stratal approaches, XDG adopts a high degree of modularity, both with respect to linguistic description and with respect to grammar engineering. This also facilitates the statement of cross-linguistic generalizations. XDG avoids the problem of placing too high a burden on the interfaces, and allows interactions between all dimensions, not only adjacent ones. From mono-stratal approaches, XDG adopts a high degree of integration, facilitating concurrent processing and the treatment of ambiguity. At the same time, XDG does not lose its modularity.

XDG is a descendant of Topological Dependency Grammar (TDG) (Duchier and Debusmann, 2001), pushing the underlying methodology further by generalizing it in two aspects:

• number of dimensions: two in TDG (ID and LP), arbitrarily many in XDG

• set of principles: fixed in TDG, extensible principle library in XDG

The structure of this paper is as follows: In §2, we introduce XDG formally, and also the XDG solver used for parsing and generation. In §3, we introduce a number of XDG principles informally, before making use of them in an idealized example grammar in §4. In §5 we argue why XDG has the potential to be an improvement over multi-stratal and mono-stratal approaches, before we conclude in §6.

2 Extensible Dependency Grammar

In this section, we introduce XDG formally and mention briefly the constraint-based XDG solver for parsing and generation.

2.1 Formalization

Formally, an XDG grammar is built up of dimensions, a lexicon and principles, and characterizes a set of well-formed analyses.

A dimension is a tuple $D = (Lab, Fea, Val, Pri)$ of a set $Lab$ of edge labels, a set $Fea$ of features, a set $Val$ of feature values, and a set of one-dimensional principles $Pri$. A lexicon for the dimension $D$ is a set $Lex \subseteq Fea \to Val$ of total feature assignments called lexical entries. An analysis on dimension $D$ is a triple $(V, E, F)$ of a set $V$ of nodes, a set $E \subseteq V \times V \times Lab$ of directed labeled edges, and an assignment $F : V \to (Fea \to Val)$ of lexical entries to nodes. $V$ and $E$ form a graph. We write $Ana_D$ for the set of all possible analyses on dimension $D$. The principles characterize subsets of $Ana_D$. We assume that the elements of $Pri$ are finite representations of such subsets.

An XDG grammar $((Lab_i, Fea_i, Val_i, Pri_i)_{i=1}^{n}, Pri, Lex)$ consists of $n$ dimensions, multi-dimensional principles $Pri$, and a lexicon $Lex$. An XDG analysis $(V, E_i, F_i)_{i=1}^{n}$ is an element of $Ana = Ana_1 \times \dots \times Ana_n$ where all dimensions share the same set of nodes $V$. We call a dimension of a grammar a grammar dimension.

Multi-dimensional principles specify subsets of $Ana$, i.e. of tuples of analyses for the individual dimensions. The lexicon $Lex \subseteq Lex_1 \times \dots \times Lex_n$ constrains all dimensions at once, thereby synchronizing them. An XDG analysis is licensed by $Lex$ iff $(F_1(v), \dots, F_n(v)) \in Lex$ for every node $v \in V$.

In order to compute analyses for a given input, we employ a set of input constraints ($Inp$), which again specify a subset of $Ana$. XDG solving then amounts to finding elements of $Ana$ that are licensed by $Lex$, and consistent with $Inp$ and $Pri$. The input constraints determine whether XDG solving is to be used for parsing or generation. For parsing, they specify a sequence of words, and for generation, a multiset of semantic literals.

2.2 Solver

XDG solving has a natural reading as a constraint satisfaction problem (CSP) on finite sets of integers, where well-formed analyses correspond to the solutions of the CSP (Duchier, 2003). We have implemented an XDG solver using the Mozart-Oz programming system (Mozart Consortium, 2004).

XDG solving operates on all dimensions concurrently. This means that the solver can infer information about one dimension from information on another, if there is either a multi-dimensional principle linking the two dimensions, or by the synchronization induced by the lexical entries. For instance, not only can syntactic information trigger inferences in semantics, but also vice versa.

Because XDG allows us to write grammars with completely free word order, XDG solving is an NP-complete problem (Koller and Striegnitz, 2002). This means that the worst-case complexity of the solver is exponential. The average-case complexity of many smaller-scale grammars that we have experimented with seems polynomial, but it remains to be seen whether we can scale this up to large-scale grammars.

3 Principles

The well-formedness conditions of XDG analyses are stipulated by principles. Principles are parametrizable, e.g. by the dimensions on which they are applied, or by lexical features. They can be lexicalized or non-lexicalized, and can be one-dimensional or multi-dimensional. Principles are taken from an extensible principle library. So far, the set of possible principles is unrestricted, and to find restrictions for them is a topic for future research. In the following subsections, we introduce some of the most important one-dimensional and multi-dimensional principles.

3.1 Tree principle

tree($i$): The analysis on dimension $i$ must be a tree.

The tree principle is non-lexicalized and parametrized by the dimension $i$.

3.2 Dag principle

dag($i$): The analysis on dimension $i$ must be a directed acyclic graph.

The dag principle is non-lexicalized and parametrized by the dimension $i$.

3.3 Valency principle

valency($i$, $in_i$, $out_i$): All nodes on dimension $i$ must satisfy their in and out specifications.

The valency principle is lexicalized and serves to lexically describe dependency graphs. It is parametrized by the dimension $i$, the in specification $in_i$ and the out specification $out_i$. For each node, $in_i$ stipulates the licensed incoming edges, and $out_i$ the licensed outgoing edges.

In the example grammar lexicon part in Figure 1 below, $in_{ID}$ is the in specification and $out_{ID}$ is the out specification on the ID dimension. For the common noun Roman, the in specification licenses zero or one incoming edges labeled subj (subj?), and zero or one incoming edges labeled obj (obj?). The out specification requires precisely one outgoing edge labeled det (det!).

3.4 Government principle

government($i$, $cases_i$, $govern_i$): All edges in dimension $i$ must satisfy the government specification of the mother.

The government principle is lexicalized. Its purpose is to constrain the case feature of a dependent.[1] It is parametrized by the dimension $i$, the cases specification $cases_i$ and the government specification $govern_i$. $cases_i$ assigns to each word a set of possible cases, and $govern_i$ a mapping from labels to sets of cases.

In Figure 1, the cases specification for the determiner den is {acc} (i.e. it can only be accusative). By its government specification, the finite verb versucht requires its subject to exhibit nominative case (subj → {nom}).

[1] We restrict ourselves to the case feature only for simplicity. In a fully-fledged grammar, the government principle would be used to constrain also other morphological aspects like number, person and gender.
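The tree and valency principles of §3.1 and §3.3 can be illustrated with a small Python sketch. It is not the constraint-based Mozart-Oz solver described in §2.2: it only checks an analysis that is already given. The representation (edges as `(mother, daughter, label)` triples, specifications as dictionaries from labels to the markers `!`, `?`, `*`) and all function names are assumptions made for this illustration, as is the reading of "tree" as "exactly one root, one mother per other node, no cycles". The lexical values for den and Roman are the ones quoted in §3.3.

```python
from collections import Counter

def licensed(spec, labels):
    """True iff a list of edge labels satisfies a valency specification.

    spec maps a label to "!" (exactly one), "?" (at most one) or "*" (any
    number); labels missing from spec are not licensed at all.
    """
    counts = Counter(labels)
    if any(label not in spec for label in counts):
        return False
    check = {"!": lambda n: n == 1, "?": lambda n: n <= 1, "*": lambda n: True}
    return all(check[mark](counts[label]) for label, mark in spec.items())

def valency(nodes, edges, in_spec, out_spec):
    """valency(i, in_i, out_i): every node satisfies its in and out specs."""
    for v in nodes:
        incoming = [lab for (_, d, lab) in edges if d == v]
        outgoing = [lab for (m, _, lab) in edges if m == v]
        if not (licensed(in_spec[v], incoming) and licensed(out_spec[v], outgoing)):
            return False
    return True

def tree(nodes, edges):
    """tree(i), read here as: one root, one mother per other node, no cycles."""
    mother = {}
    for m, d, _ in edges:
        if d in mother:          # a second mother: not a tree
            return False
        mother[d] = m
    if sum(1 for v in nodes if v not in mother) != 1:
        return False
    for start in nodes:          # walk upwards; revisiting a node means a cycle
        seen, v = set(), start
        while v in mother:
            if v in seen:
                return False
            seen.add(v)
            v = mother[v]
    return True

# Entries for "den" and "Roman" on the ID dimension, as quoted in the text.
NODES = ["den", "Roman"]
IN_ID = {"den": {"det": "?"}, "Roman": {"subj": "?", "obj": "?"}}
OUT_ID = {"den": {}, "Roman": {"det": "!"}}
EDGES = [("Roman", "den", "det")]

print(tree(NODES, EDGES), valency(NODES, EDGES, IN_ID, OUT_ID))  # True True
print(valency(NODES, [], IN_ID, OUT_ID))  # False: Roman lacks its det! edge
```

Checking a finished analysis is of course much weaker than solving: a solver has to search the space of analyses, whereas this sketch only tests membership in the subset that a principle characterizes.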
3.5 Agreement principle

agreement($i$, $cases_i$, $agree_i$): All edges in dimension $i$ must satisfy the agreement specification of the mother.

The agreement principle is lexicalized. Its purpose is to enforce the case agreement of a daughter.[2] It is parametrized by dimension $i$, the lexical cases specification $cases_i$, assigning to each word a set of possible cases, and the agreement specification $agree_i$, assigning to each word a set of labels.

As an example, in Figure 1, the agreement specification for the common noun Roman is {det}, i.e. the case of the common noun must agree with its determiner.

[2] Again, we restrict ourselves to case for simplicity.

3.6 Order principle

order($i$, $on_i$, $\prec_i$): On dimension $i$, 1) each node must satisfy its node labels specification, 2) the order of the daughters of each node must be compatible with $\prec_i$, and 3) the node itself must be ordered correctly with respect to its daughters (using its node label).

The order principle is lexicalized. It is parametrized by the dimension $i$, the node labels specification $on_i$ mapping each node to a set of labels from $Lab_i$, and the total order $\prec_i$ on $Lab_i$.

Assuming the node labels specification given in Figure 2, and the total order in (5), the tree in (11) satisfies the order principle. For instance for the node versucht: 1) The node label of versucht is lbf, satisfying the node labels specification. 2) The order of the daughters Roman (under the edge labeled vf), Peter (mf) and lesen (rbf) is compatible with the total order prescribing vf ≺ mf ≺ rbf. 3) The node versucht itself is ordered correctly with respect to its daughters (the total order prescribes vf ≺ lbf ≺ mf).

3.7 Projectivity principle

projectivity($i$): The analysis on dimension $i$ must be projective.

The projectivity principle is non-lexicalized. Its purpose is to exclude non-projective analyses.[3] It is parametrized by dimension $i$.

[3] The projectivity principle of course only makes sense in combination with the order principle.

3.8 Climbing principle

climbing($i$, $j$): The graph on dimension $i$ must be flatter than the graph on dimension $j$.

The climbing principle is non-lexicalized and two-dimensional. It is parametrized by the two dimensions $i$ and $j$.

For instance, the tree in (11) is flatter than the corresponding tree in (10).

3.9 Linking principle

linking($i$, $j$, $link_{i,j}$): All edges on dimension $i$ must satisfy the linking specification of the mother.

The linking principle is lexicalized and two-dimensional. It is parametrized by the two dimensions $i$ and $j$, and by the linking specification $link_{i,j}$, mapping labels from $Lab_i$ to sets of labels from $Lab_j$. Its purpose is to specify how dependents on dimension $i$ are realized by (or linked to) dependents on dimension $j$.

In the lexicon part in Figure 3, the linking specification for the transitive verb lesen requires that its agent on the PA dimension must be realized by a subject (ag → {subj}), and the patient by an object (pat → {obj}).

4 Example grammar

In this section, we elucidate XDG with an example grammar fragment for German. With it, we demonstrate three aspects of the methodology of XDG:

• How complex phenomena such as topicalization and control arise by the interaction of simple principles on different dimensions of linguistic description.

• How the high degree of integration helps to reduce ambiguity.

• How the high degree of modularity facilitates the statement of cross-linguistic generalizations.

Note that this grammar fragment is an idealized example, and does not make any claims about XDG as a grammar theory. Its purpose is solely to substantiate our points about XDG as a framework.

4.1 Dimensions

The grammar fragment makes use of two dimensions: Immediate Dominance (ID) and Linear Precedence (LP). The models on the ID dimension are unordered, syntactic dependency trees whose edge labels correspond to syntactic functions like subject and object. On the LP dimension, the models are ordered, projective topological dependency trees whose edge labels are topological fields like Vorfeld and Mittelfeld.

4.2 Labels

The set $Lab_{ID}$ of labels on the ID dimension is:

$Lab_{ID} = \{det, subj, obj, vinf, part\}$   (1)

These correspond resp. to determiner, subject, object, infinitive verbal complement, and particle.

The set $Lab_{LP}$ of labels on the LP dimension is:

$Lab_{LP} = \{detf, nounf, vf, lbf, mf, partf, rbf\}$   (2)
The labels in (2) correspond, respectively, to determiner field, noun field, Vorfeld, left bracket field, Mittelfeld, particle field, and right bracket field.

4.3 Principles

On the ID dimension, we make use of the following one-dimensional principles:

tree(ID)
valency(ID, $in_{ID}$, $out_{ID}$)
government(ID, $cases_{ID}$, $govern_{ID}$)
agreement(ID, $cases_{ID}$, $agree_{ID}$)   (3)

The LP dimension uses the following principles:

tree(LP)
valency(LP, $in_{LP}$, $out_{LP}$)
order(LP, $on_{LP}$, $\prec_{LP}$)
projectivity(LP)   (4)

where the total order $\prec_{LP}$ is defined as:

$detf \prec nounf \prec vf \prec lbf \prec mf \prec partf \prec rbf$   (5)

We make use of the following multi-dimensional principles:

climbing(LP, ID)
linking(LP, ID)   (6)

4.4 Lexicon

We split the lexicon into two parts. The ID and LP parts are displayed resp. in Figure 1 and Figure 2.[4] The LP part includes also the linking specification for the LP,ID-application of the linking principle.[5]

[4] Here, _ stands for “don’t care”; this means e.g. for the verb versucht that it has unspecified case.

[5] We do not make use of the linking specification for the German grammar fragment (the mappings are all empty), but we will do so as we switch to Dutch in §4.8 below.

4.5 Government and agreement

Our first example is the following sentence:

Peter versucht einen Roman zu lesen.
Peter tries a novel to read.
“Peter tries to read a novel.”   (7)

We display the ID analysis of the sentence below:

[ID tree of “Peter versucht einen Roman zu lesen”; edge labels: subj, vinf, part, obj, det]   (8)

Here, Peter is the subject of versucht. lesen is the infinitival verbal complement of versucht, zu the particle of lesen, and Roman the object of lesen. Finally, einen is the determiner of Roman.

Under our example grammar, the sentence is unambiguous, i.e. the given ID tree is the only possible one. Other ID trees are ruled out by the interaction of the principles on the ID dimension. For instance, the government and agreement principles conspire to rule out the reading where Roman is the subject of versucht (and Peter the object). How? By the agreement principle, Roman must be accusative, since it agrees with its accusative determiner einen. By the government principle, the subject of versucht must be nominative, and the object of lesen accusative. Thus Roman, by virtue of being accusative, cannot become the subject of versucht. The only other option for it is to become the object of lesen. Consequently, Peter, which is unspecified for case, must become the subject of versucht (versucht must have a subject by the valency principle).

4.6 Topicalization

Our second example is a case of topicalization, where the object has moved into the Vorfeld, to the left of the finite verb:

Einen Roman versucht Peter zu lesen.   (9)

Here is the ID tree and the LP tree analysis:

[ID tree of “Einen Roman versucht Peter zu lesen”; edge labels: subj, vinf, part, obj, det]   (10)

[LP tree of “Einen Roman versucht Peter zu lesen”; edge labels: vf, mf, rbf, detf, partf; node labels: detf, nounf, lbf, nounf, partf, rbf]   (11)

The ID tree analysis is the same as before, except that the words are shown in different positions. In the LP tree, Roman is in the Vorfeld of versucht, Peter in the Mittelfeld, and lesen in the right bracket field. versucht itself is (by its node label) in the left bracket field. Moreover, Einen is in the determiner field of Roman, and zu in the particle field of lesen.

Again, this is an example demonstrating how complex phenomena (here: topicalization) are explained by the interaction of simple principles. Topicalization does not have to be explicitly taken care of; it is rather a consequence of the interacting principles. Here, the valency, projectivity and climbing principles conspire to bring about the “climbing up” of the NP Einen Roman from being the daughter of lesen in the ID tree to being the daughter of versucht in the LP tree: The out specification of lesen on the LP dimension does not license any outgoing edge.
           $in_{ID}$        $out_{ID}$        $cases_{ID}$       $govern_{ID}$      $agree_{ID}$
den        {det?}           {}                {acc}              {}                 {}
Roman      {subj?, obj?}    {det!}            {nom, dat, acc}    {}                 {det}
Peter      {subj?, obj?}    {}                {nom, dat, acc}    {}                 {}
versucht   {}               {subj!, vinf!}    _                  {subj → {nom}}     {}
zu         {part?}          {}                _                  {}                 {}
lesen      {vinf?}          {obj!}            _                  {obj → {acc}}      {}

Figure 1: Lexicon for the example grammar fragment, ID part
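The argument of §4.5, that government and agreement together exclude the reading with Roman as subject, can be replayed with a brute-force Python sketch over the values in Figure 1. Everything about the encoding is an assumption made for illustration: edges are `(mother, daughter, label)` triples, words with unspecified case get the single dummy value `None`, and the determiner einen of sentence (7) is given the entry that Figure 1 lists for den. Enumerating case assignments stands in for the constraint propagation a real XDG solver would perform.

```python
from itertools import product

# cases_ID, govern_ID and agree_ID from Figure 1. "einen" reuses the entry
# of "den" (assumption); None encodes an unspecified ("don't care") case.
CASES = {
    "einen": {"acc"},
    "Roman": {"nom", "dat", "acc"},
    "Peter": {"nom", "dat", "acc"},
    "versucht": {None},
    "zu": {None},
    "lesen": {None},
}
GOVERN = {"versucht": {"subj": {"nom"}}, "lesen": {"obj": {"acc"}}}
AGREE = {"Roman": {"det"}}

def governed(case, edges):
    """Government: a dependent's case must be one the mother's label allows."""
    for m, d, lab in edges:
        allowed = GOVERN.get(m, {}).get(lab)
        if allowed is not None and case[d] not in allowed:
            return False
    return True

def agreeing(case, edges):
    """Agreement: mother and daughter share their case on listed labels."""
    return all(case[m] == case[d]
               for m, d, lab in edges if lab in AGREE.get(m, set()))

def admissible(edges):
    """True iff some case assignment satisfies both principles."""
    words = sorted(CASES)
    for choice in product(*(CASES[w] for w in words)):
        case = dict(zip(words, choice))
        if governed(case, edges) and agreeing(case, edges):
            return True
    return False

# The ID tree (8), and the competing reading with subject and object swapped.
INTENDED = [("versucht", "Peter", "subj"), ("versucht", "lesen", "vinf"),
            ("lesen", "zu", "part"), ("lesen", "Roman", "obj"),
            ("Roman", "einen", "det")]
SWAPPED = [("versucht", "Roman", "subj"), ("versucht", "lesen", "vinf"),
           ("lesen", "zu", "part"), ("lesen", "Peter", "obj"),
           ("Roman", "einen", "det")]

print(admissible(INTENDED))  # True
print(admissible(SWAPPED))   # False: Roman cannot be both nom and acc
```

In the swapped reading, government demands nominative for Roman while agreement with einen demands accusative, so no case assignment survives.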
           $in_{LP}$       $out_{LP}$            $on_{LP}$    $link_{LP,ID}$
den        {detf?}         {}                    {detf}       {}
Roman      {vf?, mf?}      {detf!}               {nounf}      {}
Peter      {vf?, mf?}      {}                    {nounf}      {}
versucht   {}              {vf?, mf∗, rbf?}      {lbf}        {}
zu         {partf?}        {}                    {partf}      {}
lesen      {rbf?}          {}                    {rbf}        {}

Figure 2: Lexicon for the example grammar fragment, LP part
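How the order principle and the LP valency of versucht interact can be sketched in Python using the total order (5) and the entries of Figure 2. The sketch is a simplification under stated assumptions: it looks only at the daughters of versucht, represents each daughter by the position of its head word, and enumerates edge labels by brute force instead of propagating constraints. The function name `labelings` and the data layout are invented for this illustration.

```python
from collections import Counter
from itertools import product

ORDER = ["detf", "nounf", "vf", "lbf", "mf", "partf", "rbf"]  # total order (5)
RANK = {label: k for k, label in enumerate(ORDER)}

# Excerpt of Figure 2: labels each daughter accepts, what versucht offers,
# and the node label of versucht.
IN_LP = {"Roman": {"vf", "mf"}, "Peter": {"vf", "mf"}, "lesen": {"rbf"}}
OUT_LP = {"versucht": {"vf": "?", "mf": "*", "rbf": "?"}}
ON_LP = {"versucht": "lbf"}

def labelings(sentence, verb, daughters):
    """All ways to attach the daughters to the verb that respect valency
    cardinalities ("?" means at most one) and the order principle."""
    pos = {word: k for k, word in enumerate(sentence)}
    options = [sorted(IN_LP[d] & set(OUT_LP[verb])) for d in daughters]
    found = []
    for labels in product(*options):
        counts = Counter(labels)
        if any(OUT_LP[verb][lab] == "?" and n > 1 for lab, n in counts.items()):
            continue
        # Verb (with its node label) and daughters (with their edge labels),
        # sorted by surface position: the label ranks must never decrease.
        items = sorted([(pos[verb], ON_LP[verb])] +
                       [(pos[d], lab) for d, lab in zip(daughters, labels)])
        ranks = [RANK[lab] for _, lab in items]
        if ranks == sorted(ranks):
            found.append(dict(zip(daughters, labels)))
    return found

TOPICALIZED = ["Einen", "Roman", "versucht", "Peter", "zu", "lesen"]    # (9)
UNGRAMMATICAL = ["Peter", "einen", "Roman", "versucht", "zu", "lesen"]  # (12)
DAUGHTERS = ["Roman", "Peter", "lesen"]

print(labelings(TOPICALIZED, "versucht", DAUGHTERS))
# [{'Roman': 'vf', 'Peter': 'mf', 'lesen': 'rbf'}]
print(labelings(UNGRAMMATICAL, "versucht", DAUGHTERS))
# []
```

For the topicalized sentence (9) exactly the labeling of tree (11) survives. For sentence (12) both Peter and Roman precede versucht, so both would need the only label ordered before lbf that versucht offers, namely vf, which vf? permits at most once.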
Hence, Roman must become the daughter of another node. The only possibility is versucht. The determiner Einen must then also “climb up” because Roman is its only possible mother. The result is an LP tree which is flatter with respect to the ID tree. The LP tree is also projective. If it were not flatter, then it would be non-projective, and ruled out by the projectivity principle.

4.7 Negative example

Our third example is a negative example, i.e. an ungrammatical sentence:

∗Peter einen Roman versucht zu lesen.   (12)

This example is perfectly legal on the unordered ID dimension, but has no model on the LP dimension. Why? Because by its LP out specification, the finite verb versucht allows only one dependent to the left of it (in its Vorfeld), and here we have two. The interesting aspect of this example is that although we can find a well-formed ID tree for it, this ID tree is never actually generated. The interactions of the principles, viz. here of the principles on the LP dimension, rule out the sentence before any full ID analysis has been found.

4.8 From German to Dutch

For the fourth example, we switch from German to Dutch. We will show how to use the lexicon to concisely capture an important cross-linguistic generalization. We keep the same grammar as before, but with two changes, arising from the lesser degree of inflection and the higher reliance on word order in Dutch:

• The determiner een is not case-marked but can be either nominative, dative or accusative: $cases_{ID} = \{nom, dat, acc\}$.

• The Vorfeld of the finite verb probeert cannot be occupied by an object (but only by a subject): $link_{LP,ID} = \{vf \to \{subj\}\}$.

Now to the example, a Dutch translation of (7):

Peter probeert een roman te lezen.
Peter tries a novel to read.
“Peter tries to read a novel.”   (13)

We get only one analysis on the ID dimension, where Peter is the subject and roman the object. An analysis where Peter is the object of lezen and roman the subject of probeert is impossible, as in the German example. The difference is, however, how this analysis is excluded. In German, the accusative inflection of the determiner einen triggered the agreement and the government principle to rule it out. In Dutch, the determiner is not inflected. The unwanted analysis is excluded on the grounds of word order instead: By the linking principle, the Vorfeld of probeert must be filled by a subject, and not by an object. That means that Peter in the Vorfeld (to the left of probeert) must be a subject, and consequently, the only other choice for roman is that it becomes the object of lezen.

4.9 Predicate-Argument Structure

So far, our example grammar fragment was confined to syntax. In this section, we emphasize the extensibility aspect of XDG by showing how it allows us to extend the grammar with another dimension, Predicate-Argument Structure (PA). The models on the PA dimension are not trees but directed acyclic graphs (dags), to model re-entrancies e.g. caused by control constructions. Thanks to the modularity of XDG, the PA part of the grammar is the same for German and Dutch.

The set $Lab_{PA}$ of labels on the PA dimension is:

$Lab_{PA} = \{ag, pat, prop\}$   (14)
The labels in (14) correspond, respectively, to agent, patient and proposition.

The PA dimension uses the following one-dimensional principles:

dag(PA)
valency(PA, $in_{PA}$, $out_{PA}$)   (15)

Note that we re-use the valency principle again, as we did on the ID and LP dimensions.

And also the following multi-dimensional principles:

climbing(ID, PA)
linking(PA, ID)   (16)

Here, we re-use the climbing and linking principles. That is, we state that the ID tree is flatter than the corresponding PA dag. This captures raising and control, where arguments of embedded infinitival verbs can “climb up” and become arguments of a raising or control verb, in the same way as syntactic arguments can “climb up” from ID to LP. We use the linking principle to specify how semantic arguments are to be realized syntactically (e.g. the agent as a subject etc.).

We display the PA part of the lexicon in Figure 3. Here is an example PA dag analysis of example sentence (7):

[PA dag of “Peter versucht einen Roman zu lesen”; edge labels: ag, prop, ag, pat]   (17)

Here, Peter is the agent of versucht, and also the agent of lesen. Furthermore, lesen is a proposition dependent of versucht, and Roman is the patient of lesen.

Notice that the PA dag is indeed a dag and not a tree since Peter has two incoming edges: It is simultaneously the agent of versucht and of lesen. This is enforced by the valency principle: Both versucht and lesen require an agent. Peter is the only word which can be the agent of both, because it is a subject and the agents of versucht and lesen must be subjects by the linking principle.[6] The climbing principle ensures that predicate arguments can be “raised” on the ID structure with respect to the PA structure. Again, this example demonstrates that XDG is able to reduce a complex phenomenon such as control to the interaction of per se fairly simple principles such as valency, climbing and linking.

[6] Note that we would have to extend the linking principle in order to account e.g. for object raising.

4.10 Scope structure

Debusmann et al. (2004a) present a syntax-semantics interface for XDG which additionally introduces a dimension to model quantifier scope. For lack of space, we omit the discussion of it in this paper, but we mention it here to emphasize the extensibility of the framework.

5 Comparison

This section includes a more in-depth comparison of XDG with purely multi- and mono-stratal approaches.

Contrary to multi-stratal approaches like LFG or MTT, XDG is more integrated. For one, it places a lighter burden on the interfaces between the dimensions. In LFG for instance, the φ-mapping from c-structure to f-structure is rather specific, and has to be specifically adapted to new c-structures, e.g. in order to handle a new construction with a different word order. That is, not only the grammar rules for the c-structure need to be adapted, but also the interface between c- and f-structure. As already stressed several times, in XDG, complex phenomena arise out of the interaction of simple, maximally general principles. Hence to accommodate the new construction, the grammar would ideally only need to be adapted on the word order dimension, leaving the principles in place.

Furthermore, XDG allows interactions of relational constraints between all dimensions, not only between adjacent ones (like c- and f-structure), and in all directions. For one, this gets us bi-directionality (parsing and generation with the same grammar) for free. Secondly, the interactions of XDG have the potential to help greatly in reducing ambiguity. In multi-stratal approaches, ambiguity must be duplicated throughout the system. E.g. suppose there are two candidate c-structures in LFG parsing, but one is ill-formed semantically. Then it can only be ruled out after duplicating the ambiguity on the f-structure, and then filtering out the ill-formed structure on the semantic σ-structure. In XDG on the other hand, the semantic principles can rule out the ill-formed analysis much earlier, typically on the basis of a partial syntactic analysis. Thus, ill-formed analyses are never duplicated; in fact, they are not even produced.

Contrary to mono-stratal ones, XDG is more modular. For one, as (Oliva et al., 1999) note, mono-stratal approaches like HPSG usually give precedence to the syntactic tree structure, while putting the description of other aspects of the analysis on the secondary level only, by means of features spread over the nodes of the tree.
           $in_{PA}$       $out_{PA}$        $link_{PA,ID}$
den        {}              {}                {}
Roman      {ag?, pat?}     {}                {}
Peter      {ag?, pat?}     {}                {}
versucht   {}              {ag!, prop!}      {ag → {subj}, prop → {vinf}}
zu         {}              {}                {}
lesen      {prop?}         {ag!, pat!}       {ag → {subj}, pat → {obj}}

Figure 3: Lexicon of the example grammar fragment, PA part
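The linking specifications of Figure 3 and the dag requirement on the PA dimension can be checked against the analyses (8) and (17) with a short Python sketch. It rests on one interpretive assumption: a PA edge labeled l from mother m to daughter d satisfies the linking principle if d has some incoming ID edge whose label is in the set that m's linking specification assigns to l (the ID mother may differ from m, which is what control requires). The function names and the edge encoding are invented for this illustration; `graphlib` is part of the Python standard library from version 3.9.

```python
from graphlib import CycleError, TopologicalSorter

WORDS = ["Peter", "versucht", "einen", "Roman", "zu", "lesen"]

# ID tree (8) and PA dag (17) as (mother, daughter, label) triples.
ID_EDGES = [("versucht", "Peter", "subj"), ("versucht", "lesen", "vinf"),
            ("lesen", "zu", "part"), ("lesen", "Roman", "obj"),
            ("Roman", "einen", "det")]
PA_EDGES = [("versucht", "Peter", "ag"), ("versucht", "lesen", "prop"),
            ("lesen", "Peter", "ag"), ("lesen", "Roman", "pat")]

# link_PA,ID from Figure 3 (empty mappings omitted).
LINK = {"versucht": {"ag": {"subj"}, "prop": {"vinf"}},
        "lesen": {"ag": {"subj"}, "pat": {"obj"}}}

def is_dag(nodes, edges):
    """dag(i): the labeled graph must not contain a cycle."""
    predecessors = {v: set() for v in nodes}
    for m, d, _ in edges:
        predecessors[d].add(m)
    try:
        TopologicalSorter(predecessors).prepare()  # raises on cycles
    except CycleError:
        return False
    return True

def linking(edges_i, edges_j, link):
    """linking(i, j, link): each dependent on dimension i is realized on
    dimension j by an incoming edge with one of the licensed labels."""
    incoming_j = {}
    for _, d, lab in edges_j:
        incoming_j.setdefault(d, set()).add(lab)
    for m, d, lab in edges_i:
        allowed = link.get(m, {}).get(lab)
        if allowed is not None and not (incoming_j.get(d, set()) & allowed):
            return False
    return True

print(is_dag(WORDS, PA_EDGES))             # True
print(linking(PA_EDGES, ID_EDGES, LINK))   # True

# Making Roman the agent of lesen violates ag -> {subj}: Roman is an object.
WRONG = [("versucht", "Peter", "ag"), ("versucht", "lesen", "prop"),
         ("lesen", "Roman", "ag")]
print(linking(WRONG, ID_EDGES, LINK))      # False
```

Peter is the only word with an incoming subj edge in (8), which is why both agent edges of (17) end in Peter and the PA analysis is a dag and not a tree.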
As a result, it becomes a hard task to modularize grammars, e.g. into parts for syntax and semantics. Because syntax is privileged, the phenomena pertaining to semantics cannot be described independently, and whenever the syntax part of the grammar changes, the semantics part needs to be adapted. In XDG, no dimension is in any way privileged over another. Semantic phenomena can be described much more independently from syntax. This facilitates grammar engineering, and also the statement of cross-linguistic generalizations. Assuming that the semantics part of a grammar stays invariant for most natural languages, in order to accommodate a new language, ideally only the syntactic parts would need to be changed, leaving the semantics parts intact. We gave an example of this in §4.

6 Conclusion

In this paper, we introduced the XDG grammar framework, and emphasized that its new methodology places it in between the extremes of multi- and mono-stratal approaches. By means of an idealized example grammar, we demonstrated how complex phenomena can be explained as arising from the interaction of simple principles on numerous dimensions of linguistic description. On the one hand, this methodology has the potential to modularize linguistic description and grammar engineering, and to facilitate the statement of linguistic generalizations. On the other hand, as XDG is an inherently concurrent architecture, inferences from any dimension can help reduce the ambiguity on others. These inferences need not only stem from hard constraints, but can also be preferences to guide the search for solutions.

There are plenty of avenues for future research. Firstly, we plan to continue work on XDG as a framework. Here, one important goal is to find out what criteria we can give to restrict the number of principles. Secondly, we need to evolve the XDG grammar theory, and in particular the XDG syntax-semantics interface (Debusmann et al., 2004a). Thirdly, for practical use, we need to improve our knowledge about XDG solving (i.e. parsing and generation). So far, we could only obtain good results for smaller-scale handwritten grammars, but not for larger-scale grammars induced from treebanks (NEGRA, PDT) or converted from other grammar formalisms (XTAG). Here, we plan to continue research on using XDG to parse and generate with TAG grammars (Koller and Striegnitz, 2002), (Debusmann et al., 2004b). A last goal is to integrate XDG with statistics, e.g. to guide the search for solutions, in the vein of (Dienes et al., 2003).

References

Ralph Debusmann, Denys Duchier, Alexander Koller, Marco Kuhlmann, Gert Smolka, and Stefan Thater. 2004a. A relational syntax-semantics interface based on dependency grammar.

Ralph Debusmann, Denys Duchier, Marco Kuhlmann, and Stefan Thater. 2004b. TAG as dependency grammar. In Proceedings of TAG+7, Vancouver/CAN.

Peter Dienes, Alexander Koller, and Marco Kuhlmann. 2003. Statistical A* Dependency Parsing. In Prospects and Advances in the Syntax/Semantics Interface, Nancy/FRA.

Denys Duchier and Ralph Debusmann. 2001. Topological dependency trees: A constraint-based account of linear precedence. In Proceedings of ACL 2001, Toulouse/FRA.

Denys Duchier. 2003. Configuration of labeled trees under lexicalized constraints and principles. Research on Language and Computation, 1(3–4):307–336.

Alexander Koller and Kristina Striegnitz. 2002. Generation as dependency parsing. In Proceedings of ACL 2002, Philadelphia/USA.

Mozart Consortium. 2004. The Mozart-Oz website. http://www.mozart-oz.org/.

Karel Oliva, M. Andrew Moshier, and Sabine Lehmann. 1999. Grammar engineering for the next millennium. In Proceedings of the 5th Natural Language Processing Pacific Rim Symposium 1999 “Closing the Millennium”, Beijing/CHI. Tsinghua University Press.