Documentation for the PARC 700 Dependency Bank

Introduction

This document provides basic documentation for the dependency bank structures created at PARC from 700 randomly selected sentences from section 23 of the WSJ Penn Treebank.

There are several things to note in reading the dependency bank structures:

Redundancy/Doubling: The dependency bank is not the minimal set of features needed to describe a structure. For example, imperatives have both stmt_type imperative and mood imperative:

sentence: Jump.
stmt_type(jump~0, imperative)
mood(jump~0, imperative)

For certain applications, such features may be considered redundant and may cause features to be counted twice (e.g., if the stmt_type of an imperative is matched correctly, the mood is also matched). For such applications, we provide a structure pruning tool. This tool lets the user prune the dependency bank structures to remove undesired attributes.
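
The effect of such pruning can be pictured with a short Python sketch. This is not the PARC pruning tool; the (feature, head, value) triple representation and the function name prune are assumptions made here for illustration only.

```python
# Hypothetical representation: each dependency is a (feature, head, value) triple.
deps = [
    ("stmt_type", "jump~0", "imperative"),
    ("mood", "jump~0", "imperative"),
]

def prune(dependencies, unwanted):
    """Return the dependencies whose feature name is not listed in `unwanted`."""
    unwanted = set(unwanted)
    return [dep for dep in dependencies if dep[0] not in unwanted]

# Dropping mood leaves a single feature for the imperative, so a match
# on stmt_type is no longer counted a second time through mood.
pruned = prune(deps, {"mood"})
```

Which features count as redundant depends on the application, so the set passed to prune is left to the caller.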

Methodology

The dependency bank was produced as follows. First, the sentences were parsed with the PARC LFG English grammar. The parse closest to the correct one was saved; this saved parse was sometimes modified by hand. These LFG parses were then manipulated to produce the dependency structures. This manipulation included a flattening of the structure, the elimination of certain grammar-internal features, and the renaming of some features to be more universal. Each converted structure was then checked and modified by hand by two people.

Dependency Format

A sample dependency format is shown below:

sentence(
  id(wsj_2356.19, parc_23.34)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(The device was replaced.)
structure(
  mood(replace~0, indicative)
  tense(replace~0, past)
  passive(replace~0, +)
  stmt_type(replace~0, declarative)
  subj(replace~0, device~1)
  vtype(replace~0, main)
  det_form(device~1, the)
  det_type(device~1, def)
  num(device~1, sg)
  pers(device~1, 3))
)
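
A record laid out like the sample above can be split into its top-level fields by tracking parenthesis depth. The following Python sketch is one way to do that. The function name split_fields is hypothetical, and the sketch assumes that any literal parenthesis inside a field is escaped with a backslash.

```python
def split_fields(record):
    """Map each top-level field name of one sentence( ... ) record to its raw contents."""
    fields = {}
    name, content = [], []
    depth = 0
    i = 0
    while i < len(record):
        ch = record[i]
        if ch == "\\" and i + 1 < len(record):
            if depth >= 2:
                content.append(record[i:i + 2])  # keep escape sequences untouched
            i += 2
            continue
        if ch == "(":
            depth += 1
            if depth == 1:
                name = []                        # discard the leading "sentence"
            elif depth > 2:
                content.append(ch)               # parenthesis nested inside a field
        elif ch == ")":
            depth -= 1
            if depth == 1:                       # a top-level field has just closed
                fields["".join(name).strip()] = "".join(content).strip()
                name, content = [], []
            elif depth >= 2:
                content.append(ch)
        elif depth == 1:
            name.append(ch)                      # characters of a field name
        elif depth >= 2:
            content.append(ch)                   # characters of a field body
        i += 1
    return fields

record = """sentence(
  id(wsj_2356.19, parc_23.34)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(The device was replaced.)
structure(
  subj(replace~0, device~1)
  pers(device~1, 3))
)"""

fields = split_fields(record)
```

The structure field is returned as raw text, one dependency per line, so that it can be handed to a separate per-line parser.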

Each dependency begins with sentence( and ends with ). It contains five fields. In order, as the sample above shows, these are id, date, validators, sentence_form, and structure.

Finally, note that the escape character is \ (backslash). The escape character must be used for literal commas and parentheses; a literal backslash is represented as \\. There are relatively few of these in the PARC 700. An example:

   conj(coord~0, Fulton Prebon \(U.S.A.\) Inc.~2)
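
The escaping rule matters when a dependency line is split into its parts. The Python sketch below shows one way to read a single line while honoring the backslash; the function name parse_dependency is an assumption of this sketch and is not part of any PARC tooling.

```python
def parse_dependency(line):
    """Split 'feature(arg1, arg2)' into a tuple, honoring backslash escapes."""
    line = line.strip()
    open_idx = line.index("(")           # feature names contain no parentheses
    feature = line[:open_idx]
    body = line[open_idx + 1:]
    args, current = [], []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            current.append(body[i + 1])  # keep the escaped character literally
            i += 2
            continue
        if ch == ",":                    # an unescaped comma separates arguments
            args.append("".join(current).strip())
            current = []
        elif ch == ")":                  # an unescaped ')' closes the dependency
            args.append("".join(current).strip())
            break
        else:
            current.append(ch)
        i += 1
    return tuple([feature] + args)

# A raw string keeps the backslashes exactly as they appear in the bank.
example = parse_dependency(r"conj(coord~0, Fulton Prebon \(U.S.A.\) Inc.~2)")
```

Because parsing stops at the first unescaped closing parenthesis, the extra parenthesis that closes structure( on the last line of a record is ignored.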


Grammatical Functions

There are a number of grammatical functions used in the dependency treebank. These do not have a separate syntax from other features, but are discussed separately because they provide the core information that anyone using the dependency bank is likely to need.

Each function is discussed briefly below and an example from the dependency bank is provided. Some of the examples are relatively large because they have not been altered for the purposes of this document.

List of Grammatical Functions

subj

This is the subject. All verbs have subjects in the dependency bank. Other items may also have subjects, e.g., small clauses in constructions like I consider him a nuisance. The subject in the example below is device~1, which in turn has a number of attributes (det_form, det_type, num, pers).
sentence(
  id(wsj_2356.19, parc_23.34)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(The device was replaced.)
structure(
  mood(replace~0, indicative)
  tense(replace~0, past)
  passive(replace~0, +)
  stmt_type(replace~0, declarative)
  subj(replace~0, device~1)
  vtype(replace~0, main)
  det_form(device~1, the)
  det_type(device~1, def)
  num(device~1, sg)
  pers(device~1, 3))
)


obj

This is the object. Verbs can have objects. Prepositions also have objects. The object of the verb pursue~0 in the example below is evidence~2, which in turn has a number of attributes.
sentence(
  id(wsj_2377.18, parc_23.183)
  date(2002.6.12)
  validators(M. Dalrymple, T.H. King)
sentence_form(Right now they're pursuing evidence.)
structure(
   adjunct(pursue~0, now~7)
   mood(pursue~0, indicative)
   obj(pursue~0, evidence~2)
   prog(pursue~0, +)
   stmt_type(pursue~0, declarative)
   subj(pursue~0, pro~1)
   tense(pursue~0, pres)
   vtype(pursue~0, main)
   case(pro~1, nom)
   num(pro~1, pl)
   pers(pro~1, 3)
   pron_form(pro~1, they)
   pron_type(pro~1, pers)
   num(evidence~2, sg)
   pers(evidence~2, 3)
   adegree(right~6, positive)
   adv_type(right~6, advmod)
   adegree(now~7, positive)
   adjunct(now~7, right~6)
   adv_type(now~7, sadv))
)


obj_theta

obj_theta are secondary objects (the theta stands for thematic in LFG's linking theory). They are found with verbs which have two object-like arguments. The canonical example is give in structures like I gave the boy the box. In the example below, the obj_theta of pay~0 is premium~3, while its obj is RTC~2. Note that obj_theta is relatively rare.
sentence(
  id(wsj_2348.18, parc_23.498)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(NCNB Texas National Bank will pay the RTC a premium of $129 million for $3.5 billion in deposits.)
structure(
   adjunct(pay~0, for~6)
   mood(pay~0, indicative)
   obj(pay~0, RTC~2)
   obj_theta(pay~0, premium~3)
   stmt_type(pay~0, declarative)
   subj(pay~0, NCNB Texas National Bank~1)
   tense(pay~0, fut)
   vtype(pay~0, main)
   num(NCNB Texas National Bank~1, sg)
   pers(NCNB Texas National Bank~1, 3)
   proper(NCNB Texas National Bank~1, misc)
   det_form(RTC~2, the)
   det_type(RTC~2, def)
   num(RTC~2, sg)
   pers(RTC~2, 3)
   proper(RTC~2, misc)
   adjunct(premium~3, of~8)
   det_form(premium~3, a)
   det_type(premium~3, indef)
   num(premium~3, sg)
   pers(premium~3, 3)
   adv_type(for~6, vpadv)
   obj(for~6, $~9)
   ptype(for~6, semantic)
   adjunct_type(in~7, nominal)
   obj(in~7, deposit~13)
   ptype(in~7, semantic)
   adjunct_type(of~8, nominal)
   obj(of~8, $~25)
   ptype(of~8, semantic)
   adjunct($~9, in~7)
   num($~9, pl)
   number($~9, billion~15)
   pers($~9, 3)
   num(deposit~13, pl)
   pers(deposit~13, 3)
   adjunct(billion~15, 3.5~17)
   number_type(billion~15, cardinal)
   number_type(3.5~17, cardinal)
   num($~25, pl)
   number($~25, million~29)
   pers($~25, 3)
   adjunct(million~29, 129~31)
   number_type(million~29, cardinal)
   number_type(129~31, cardinal))
)


comp

comps are closed complement clauses, i.e., clauses with the subject expressed internally. They correspond primarily to that and whether clauses, as in I said that it appeared. However, there is no requirement that comps be finite (tensed). In the example below, the comp of say~0 is try~2.
sentence(
  id(wsj_2360.5, parc_23.50)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(He said the thrift will try to get regulators to reverse the decision.)
structure(
   comp(say~0, try~2)
   mood(say~0, indicative)
   stmt_type(say~0, declarative)
   subj(say~0, pro~1)
   tense(say~0, past)
   vtype(say~0, main)
   case(pro~1, nom)
   gend_sem(pro~1, male)
   num(pro~1, sg)
   pers(pro~1, 3)
   pron_form(pro~1, he)
   pron_type(pro~1, pers)
   mood(try~2, indicative)
   stmt_type(try~2, declarative)
   subj(try~2, thrift~11)
   subord_form(try~2, null)
   tense(try~2, fut)
   vtype(try~2, main)
   xcomp(try~2, get~5)
   inf_form(get~5, to)
   obj(get~5, regulator~9)
   subj(get~5, thrift~11)
   vtype(get~5, main)
   xcomp(get~5, reverse~6)
   inf_form(reverse~6, to)
   obj(reverse~6, decision~10)
   subj(reverse~6, regulator~9)
   vtype(reverse~6, main)
   num(regulator~9, pl)
   pers(regulator~9, 3)
   det_form(decision~10, the)
   det_type(decision~10, def)
   num(decision~10, sg)
   pers(decision~10, 3)
   det_form(thrift~11, the)
   det_type(thrift~11, def)
   num(thrift~11, sg)
   pers(thrift~11, 3))
)

There are two unusual uses of comps in the dependency bank. The first is for direct speech. With direct speech, what is said is analyzed as the main predicate of the sentence. In the example below, the top level predicate is give~0. The verb of saying is treated as a parenthetical adjunct whose comp is the main predicate. This creates a circular structure.

sentence(
  id(wsj_2350.10, parc_23.9)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(``Giveaways just give people the wrong image\,'' said Mr. Heinemann.)
structure(
   adjunct(give~0, just~7)
   adjunct(give~0, say~8)
   mood(give~0, indicative)
   obj(give~0, people~2)
   obj_theta(give~0, image~3)
   stmt_type(give~0, declarative)
   subj(give~0, giveaway~1)
   tense(give~0, pres)
   vtype(give~0, main)
   num(giveaway~1, pl)
   pers(giveaway~1, 3)
   num(people~2, pl)
   pers(people~2, 3)
   adjunct(image~3, wrong~25)
   det_form(image~3, the)
   det_type(image~3, def)
   num(image~3, sg)
   pers(image~3, 3)
   adegree(just~7, positive)
   adv_type(just~7, sadv)
   adjunct_type(say~8, quote-paren)
   comp(say~8, give~0)
   mood(say~8, indicative)
   subj(say~8, Mr. Heinemann~18)
   tense(say~8, past)
   vtype(say~8, main)
   num(Mr. Heinemann~18, sg)
   pers(Mr. Heinemann~18, 3)
   proper(Mr. Heinemann~18, name)
   adegree(wrong~25, positive)
   adjunct_type(wrong~25, nominal)
   atype(wrong~25, attributive))
)
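
Because of this circularity, code that walks the structure from the top-level predicate needs to remember which predicates it has already visited. The Python sketch below illustrates this on a subset of the dependencies above; the triple representation and the function name reachable are assumptions of the sketch.

```python
# Predicate-valued dependencies taken from the example above.
deps = [
    ("adjunct", "give~0", "say~8"),
    ("comp", "say~8", "give~0"),        # points back at the main predicate
    ("subj", "give~0", "giveaway~1"),
    ("subj", "say~8", "Mr. Heinemann~18"),
]

def reachable(dependencies, start):
    """Collect every item reachable from `start`, visiting each one only once."""
    children = {}
    for _feature, head, value in dependencies:
        children.setdefault(head, []).append(value)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:                # already visited: this stops the cycle
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen

nodes = reachable(deps, "give~0")
```

Without the seen set, the traversal would alternate between give~0 and say~8 indefinitely.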

The second unusual use of comp is with certain copular (linking be) constructions. When a copular verb has as its second argument something which has a subject of its own, a dummy xcomp is built which takes a comp. Note that this is a reflex of the current LFG grammar, in which copular be always takes an xcomp as an argument; we do not feel that this is the optimal analysis of this construction. The basic structure is outlined below:

sentence_form(Consensus is that John grew.)
structure(
  subj(be~0, consensus~2)
  vtype(be~0, copular)
  xcomp(be~0, pro~1)
  comp(pro~1, grow~11)
  subj(grow~11, John~12))

In the example below, the xcomp of be~0 is pro~1, which is a dummy predicate that takes grow~11 as its comp.

sentence(
  id(wsj_2397.35, parc_23.321)
  date(2002.6.12)
  validators(M. Dalrymple, T.H. King)
sentence_form(The consensus among economists is that it grew a much more sluggish 2.3% in the third quarter of 1989\, which ended two weeks ago.)
structure(
   mood(be~0, indicative)
   stmt_type(be~0, declarative)
   subj(be~0, consensus~2)
   tense(be~0, pres)
   vtype(be~0, copular)
   xcomp(be~0, pro~1)
   comp(pro~1, grow~11)
   subj(pro~1, consensus~2)
   adjunct(consensus~2, among~7)
   det_form(consensus~2, the)
   det_type(consensus~2, def)
   num(consensus~2, sg)
   pers(consensus~2, 3)
   adjunct_type(among~7, nominal)
   obj(among~7, economist~8)
   ptype(among~7, semantic)
   num(economist~8, pl)
   pers(economist~8, 3)
   adjunct(grow~11, end~17)
   adjunct(grow~11, in~16)
   mood(grow~11, indicative)
   obj(grow~11, percent~13)
   stmt_type(grow~11, declarative)
   subj(grow~11, pro~12)
   subord_form(grow~11, that)
   tense(grow~11, past)
   vtype(grow~11, main)
   case(pro~12, nom)
   gend_sem(pro~12, nonhuman)
   num(pro~12, sg)
   pers(pro~12, 3)
   pron_form(pro~12, it)
   pron_type(pro~12, pers)
   adjunct(percent~13, sluggish~41)
   det_form(percent~13, a)
   det_type(percent~13, indef)
   num(percent~13, sg)
   number(percent~13, 2.3~44)
   pers(percent~13, 3)
   adv_type(in~16, vpadv)
   obj(in~16, quarter~18)
   ptype(in~16, semantic)
   adjunct(end~17, week~32)
   adjunct_type(end~17, relative)
   mood(end~17, indicative)
   pron_rel(end~17, pro~29)
   stmt_type(end~17, declarative)
   subj(end~17, pro~29)
   tense(end~17, past)
   topic_rel(end~17, pro~29)
   vtype(end~17, main)
   adjunct(quarter~18, of~22)
   det_form(quarter~18, the)
   det_type(quarter~18, def)
   num(quarter~18, sg)
   number(quarter~18, three~28)
   pers(quarter~18, 3)
   adjunct_type(of~22, nominal)
   obj(of~22, 1989~23)
   ptype(of~22, semantic)
   number_type(1989~23, cardinal)
   pers(1989~23, 3)
   number_type(three~28, ordinal)
   case(pro~29, nom)
   pron_form(pro~29, which)
   pron_type(pro~29, relative)
   adjunct(week~32, ago~36)
   adv_type(week~32, sadv)
   num(week~32, pl)
   number(week~32, two~37)
   pers(week~32, 3)
   adegree(ago~36, positive)
   adv_type(ago~36, npadv)
   number_type(two~37, cardinal)
   adeg_dim(sluggish~41, positive)
   adegree(sluggish~41, comparative)
   adjunct(sluggish~41, more~50)
   adjunct_type(sluggish~41, nominal)
   atype(sluggish~41, attributive)
   number_type(2.3~44, cardinal)
   adjunct(more~50, much~51)
   adjunct_type(more~50, degree)
   adv_type(much~51, advmod))
)
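
An application that prefers to see the embedded clause directly can skip over the dummy predicate. The Python sketch below shows one possible way; the function name copular_complement is hypothetical, and the test for a dummy (an xcomp of the form pro~# that has a comp of its own) is an assumption based on the examples in this section.

```python
dummy_case = [
    ("vtype", "be~0", "copular"),
    ("xcomp", "be~0", "pro~1"),
    ("comp", "pro~1", "grow~11"),
]
plain_case = [
    ("vtype", "be~0", "copular"),
    ("xcomp", "be~0", "skeptical~1"),
]

def copular_complement(dependencies, verb):
    """Return the xcomp of `verb`, looking through a dummy pro~# that carries a comp."""
    # Assumes at most one value per (feature, head) pair for xcomp and comp.
    lookup = {(feature, head): value for feature, head, value in dependencies}
    xcomp = lookup.get(("xcomp", verb))
    if xcomp is not None and xcomp.startswith("pro~") and ("comp", xcomp) in lookup:
        return lookup[("comp", xcomp)]
    return xcomp

through_dummy = copular_complement(dummy_case, "be~0")
direct = copular_complement(plain_case, "be~0")
```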


xcomp

xcomps are open complements (in contrast to comps, which are closed complements). Being open means that they share their subject with some other predicate (there are a few situations in the dependency bank where this is not the case). Infinitival arguments of verbs are always treated as xcomps here. In addition, the second argument of a copular verb is an xcomp. In the example below, skeptical~1 is the xcomp of the copular verb be~0; note that the subject of skeptical~1 is some~2, which is the same as the subject of be~0.
sentence(
  id(wsj_2351.42, parc_23.20)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(Some in the industry are skeptical.)
structure(
   mood(be~0, indicative)
   stmt_type(be~0, declarative)
   subj(be~0, some~2)
   tense(be~0, pres)
   vtype(be~0, copular)
   xcomp(be~0, skeptical~1)
   adegree(skeptical~1, positive)
   atype(skeptical~1, predicative)
   subj(skeptical~1, some~2)
   adjunct(some~2, in~7)
   num(some~2, pl)
   pers(some~2, 3)
   adjunct_type(in~7, nominal)
   obj(in~7, industry~8)
   ptype(in~7, semantic)
   det_form(industry~8, the)
   det_type(industry~8, def)
   num(industry~8, sg)
   pers(industry~8, 3))
)

The example below contains an infinitival xcomp. expect~2 takes complete~10 as its xcomp; note that the subject of both predicates is pro~9.

sentence(
  id(wsj_2329.8, parc_23.414)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(Ideal said it expects to complete the transaction early next year.)
structure(
   comp(say~0, expect~2)
   mood(say~0, indicative)
   stmt_type(say~0, declarative)
   subj(say~0, Ideal~1)
   tense(say~0, past)
   vtype(say~0, main)
   num(Ideal~1, sg)
   pers(Ideal~1, 3)
   proper(Ideal~1, misc)
   mood(expect~2, indicative)
   stmt_type(expect~2, declarative)
   subj(expect~2, pro~9)
   subord_form(expect~2, null)
   tense(expect~2, pres)
   vtype(expect~2, main)
   xcomp(expect~2, complete~10)
   adjunct(year~5, early~15)
   adjunct(year~5, next~8)
   adv_type(year~5, sadv)
   num(year~5, sg)
   pers(year~5, 3)
   adegree(next~8, positive)
   adjunct_type(next~8, nominal)
   atype(next~8, attributive)
   case(pro~9, nom)
   gend_sem(pro~9, nonhuman)
   num(pro~9, sg)
   pers(pro~9, 3)
   pron_form(pro~9, it)
   pron_type(pro~9, pers)
   adjunct(complete~10, year~5)
   inf_form(complete~10, to)
   obj(complete~10, transaction~13)
   subj(complete~10, pro~9)
   vtype(complete~10, main)
   det_form(transaction~13, the)
   det_type(transaction~13, def)
   num(transaction~13, sg)
   pers(transaction~13, 3)
   adegree(early~15, positive)
   adjunct_type(early~15, nominal)
   atype(early~15, attributive))
)
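
The subject sharing of xcomps can be checked mechanically. The Python sketch below reports which argument of the governing predicate is also the subject of its xcomp; the function name controller_of and the triple representation are assumptions of this sketch. The data come from the example above and from the comp example earlier in this document (get regulators to reverse the decision).

```python
deps = [
    ("subj", "expect~2", "pro~9"),
    ("xcomp", "expect~2", "complete~10"),
    ("subj", "complete~10", "pro~9"),
    ("obj", "get~5", "regulator~9"),
    ("subj", "get~5", "thrift~11"),
    ("xcomp", "get~5", "reverse~6"),
    ("subj", "reverse~6", "regulator~9"),
]

def controller_of(dependencies, governor, complement):
    """Return the governor's features whose value is also the complement's subj."""
    comp_subjects = {
        value for feature, head, value in dependencies
        if feature == "subj" and head == complement
    }
    return sorted(
        feature for feature, head, value in dependencies
        if head == governor and feature != "xcomp" and value in comp_subjects
    )

shared_with_expect = controller_of(deps, "expect~2", "complete~10")
shared_with_get = controller_of(deps, "get~5", "reverse~6")
```

An empty result would indicate one of the few cases in which the xcomp does not share its subject with the governing predicate.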


obl, obl_ag, and obl_compar

obl are oblique arguments of verbs, i.e., ones headed by prepositions. It is extremely difficult to tell an oblique argument from an adjunct in many instances. In general, if the grammar offered the choice between an analysis with an obl and one with an adjunct, we chose the obl; thus the choice is largely a reflection of the subcategorization frames in our verb lexicon. In the example below, to~11 is an obl of the verb go~9.

sentence(
  id(wsj_2376.12, parc_23.154)
  date(2002.6.12)
  validators(M. Dalrymple, T.H. King)
sentence_form(I see a possibility of going to 2200 this month.'')
structure(
   mood(see~0, indicative)
   obj(see~0, possibility~2)
   stmt_type(see~0, declarative)
   subj(see~0, pro~1)
   tense(see~0, pres)
   vtype(see~0, main)
   case(pro~1, nom)
   num(pro~1, sg)
   pers(pro~1, 1)
   pron_form(pro~1, I)
   pron_type(pro~1, pers)
   adjunct(possibility~2, of~8)
   det_form(possibility~2, a)
   det_type(possibility~2, indef)
   num(possibility~2, sg)
   pers(possibility~2, 3)
   adjunct_type(of~8, nominal)
   obj(of~8, go~9)
   ptype(of~8, semantic)
   adjunct(go~9, month~15)
   gerund(go~9, +)
   num(go~9, sg)
   obl(go~9, to~11)
   pers(go~9, 3)
   prog(go~9, +)
   subj(go~9, pro~10)
   vtype(go~9, main)
   pron_type(pro~10, null)
   obj(to~11, 2200~19)
   ptype(to~11, semantic)
   adv_type(month~15, sadv)
   deixis(month~15, proximal)
   det_form(month~15, this)
   det_type(month~15, demon)
   num(month~15, sg)
   pers(month~15, 3)
   number_type(2200~19, cardinal)
   pers(2200~19, 3))
)

Agents of passive verbs (i.e., by phrases) are specially encoded as obl_ag. These also get a pcase feature with the value by. The by does not have its own predicate taking an obj, unlike regular obl. In the example below, General Motors Acceptance Corp.~5 is the obl_ag of place~4.

sentence(
  id(wsj_2380.16, parc_23.220)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(COMMERCIAL PAPER placed directly by General Motors Acceptance Corp.:)
structure(
   adjunct(Commercial Paper~0, place~4)
   num(Commercial Paper~0, sg)
   pers(Commercial Paper~0, 3)
   proper(Commercial Paper~0, misc)
   stmt_type(Commercial Paper~0, header)
   adjunct(place~4, directly~8)
   adjunct_type(place~4, nominal)
   obl_ag(place~4, General Motors Acceptance Corp.~5)
   passive(place~4, +)
   subj(place~4, pro~6)
   vtype(place~4, main)
   num(General Motors Acceptance Corp.~5, sg)
   pcase(General Motors Acceptance Corp.~5, by)
   pers(General Motors Acceptance Corp.~5, 3)
   proper(General Motors Acceptance Corp.~5, misc)
   ptype(General Motors Acceptance Corp.~5, nonsemantic)
   pron_type(pro~6, null)
   adegree(directly~8, positive)
   adv_type(directly~8, vpadv))
)

A final specialized obl is the obl_compar, which is the than or as phrase associated with a comparative or equative adjective. The obl_compar is a dependent of the adjective. In the example below, than~17 is the obl_compar of the comparative adjective high~1, and normal~16 is the obj of than~17.

sentence(
  id(wsj_2306.37, parc_23.159)
  date(2002.6.12)
  validators(M. Dalrymple, T.H. King)
sentence_form(Trading volume was only modestly higher than normal.)
structure(
   adjunct(be~0, only~6)
   mood(be~0, indicative)
   stmt_type(be~0, declarative)
   subj(be~0, volume~2)
   tense(be~0, past)
   vtype(be~0, copular)
   xcomp(be~0, high~1)
   adeg_dim(high~1, positive)
   adegree(high~1, comparative)
   adjunct(high~1, null~30)
   atype(high~1, predicative)
   obl_compar(high~1, than~17)
   subj(high~1, volume~2)
   mod(volume~2, trading~5)
   num(volume~2, sg)
   pers(volume~2, 3)
   gerund(trading~5, +)
   num(trading~5, sg)
   pers(trading~5, 3)
   adegree(only~6, positive)
   adv_type(only~6, sadv)
   adegree(normal~16, positive)
   atype(normal~16, attributive)
   obj(than~17, normal~16)
   adegree(modestly~19, positive)
   adv_type(modestly~19, advmod)
   adjunct(null~30, modestly~19)
   adjunct_type(null~30, degree))
)
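
In the structures of this section, obl and obl_compar point to a predicate (to~11, than~17) that takes its own obj, while obl_ag points directly at the agent and records by as pcase. The Python sketch below normalizes the three encodings into (head, marker, content) triples; the function name oblique_contents is hypothetical and the sketch covers only the patterns shown in these examples.

```python
deps = [
    ("obl", "go~9", "to~11"),
    ("obj", "to~11", "2200~19"),
    ("obl_ag", "place~4", "General Motors Acceptance Corp.~5"),
    ("pcase", "General Motors Acceptance Corp.~5", "by"),
    ("obl_compar", "high~1", "than~17"),
    ("obj", "than~17", "normal~16"),
]

OBLIQUE_FEATURES = {"obl", "obl_ag", "obl_compar"}

def oblique_contents(dependencies):
    """Return (head, marker, content) for every oblique dependency."""
    # Assumes at most one obj and one pcase per item.
    lookup = {(feature, head): value for feature, head, value in dependencies}
    results = []
    for feature, head, value in dependencies:
        if feature not in OBLIQUE_FEATURES:
            continue
        if feature == "obl_ag":
            # The agent is the dependent itself; the preposition is stored as pcase.
            results.append((head, lookup.get(("pcase", value)), value))
        else:
            # The dependent is the preposition-like predicate; its obj is the content.
            marker = value.rsplit("~", 1)[0]
            results.append((head, marker, lookup.get(("obj", value))))
    return results

obliques = oblique_contents(deps)
```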


adjunct

adjuncts are the default type of unsubcategorized material. Adjectives in noun phrases are adjuncts, as are adverbial modifiers of verbs. Note that not all unsubcategorized modifiers are adjuncts. For example, noun modifiers in noun-noun compounds are mod, while name modifiers of complex names are app. Most adjuncts have either an adjunct_type or an adv_type in the dependency bank.

In the example below, main~7 is an adjectival adjunct of the noun reason~2.

sentence(
  id(wsj_2325.30, parc_23.401)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(The main reason remains weather.)
structure(
   mood(remain~0, indicative)
   stmt_type(remain~0, declarative)
   subj(remain~0, reason~2)
   tense(remain~0, pres)
   vtype(remain~0, main)
   xcomp(remain~0, weather~1)
   num(weather~1, sg)
   pers(weather~1, 3)
   subj(weather~1, reason~2)
   adjunct(reason~2, main~7)
   det_form(reason~2, the)
   det_type(reason~2, def)
   num(reason~2, sg)
   pers(reason~2, 3)
   adegree(main~7, positive)
   adjunct_type(main~7, nominal)
   atype(main~7, attributive))
)

Similarly, in the example below, but~4 is an adverbial adjunct of the verb worry~0.

sentence(
  id(wsj_2379.41, parc_23.209)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(But they are worried.)
structure(
   adjunct(worry~0, but~4)
   mood(worry~0, indicative)
   passive(worry~0, +)
   stmt_type(worry~0, declarative)
   subj(worry~0, pro~1)
   tense(worry~0, pres)
   vtype(worry~0, main)
   case(pro~1, nom)
   num(pro~1, pl)
   pers(pro~1, 3)
   pron_form(pro~1, they)
   pron_type(pro~1, pers)
   adegree(but~4, positive)
   adv_type(but~4, initadv))
)

One special case of adjuncts is sentential negation. Sentential negation is treated as an adjunct of the verb, but is assigned a special adjunct_type, negative. In the example below, not~6 is an adjunct of can~0.

sentence(
  id(wsj_2386.33, parc_23.277)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(Merrill Lynch can't survive without the little guy.'')
structure(
   adjunct(can~0, not~6)
   mood(can~0, indicative)
   stmt_type(can~0, declarative)
   subj(can~0, Merrill Lynch~2)
   tense(can~0, pres)
   vtype(can~0, modal)
   xcomp(can~0, survive~1)
   adjunct(survive~1, without~11)
   subj(survive~1, Merrill Lynch~2)
   vtype(survive~1, main)
   num(Merrill Lynch~2, sg)
   pers(Merrill Lynch~2, 3)
   proper(Merrill Lynch~2, misc)
   adjunct_type(not~6, negative)
   adjunct(guy~10, little~17)
   det_form(guy~10, the)
   det_type(guy~10, def)
   num(guy~10, sg)
   pers(guy~10, 3)
   adv_type(without~11, vpadv)
   obj(without~11, guy~10)
   ptype(without~11, semantic)
   adegree(little~17, positive)
   adjunct_type(little~17, nominal)
   atype(little~17, attributive))
)
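
Since negation is encoded as an ordinary adjunct with a special adjunct_type, finding negated predicates takes two steps: collect the items marked negative, then find what they are adjuncts of. The Python sketch below shows this on a subset of the example above; the function name negated_predicates is an assumption of the sketch.

```python
deps = [
    ("adjunct", "can~0", "not~6"),
    ("adjunct_type", "not~6", "negative"),
    ("xcomp", "can~0", "survive~1"),
    ("adjunct", "survive~1", "without~11"),
    ("adv_type", "without~11", "vpadv"),
]

def negated_predicates(dependencies):
    """Return the predicates that have an adjunct whose adjunct_type is negative."""
    negators = {
        head for feature, head, value in dependencies
        if feature == "adjunct_type" and value == "negative"
    }
    return sorted(
        head for feature, head, value in dependencies
        if feature == "adjunct" and value in negators
    )

negated = negated_predicates(deps)
```

Note that the negation attaches to can~0 rather than to survive~1, so an application interested in the negation of the embedded verb would have to follow the xcomp as well.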


mod

mod is used to construct noun-noun compounds, which are extremely common. The last noun in the compound is the head. All other nouns in the compound are mods. In the example below, debt~6 is the mod of burden~2. Note that if there are three or more nouns in a compound, the last one is still the head and the others all modify that head; details of scoping are not resolved in the dependency bank.
sentence(
  id(wsj_2397.21, parc_23.318)
  date(2002.6.12)
  validators(M. Dalrymple, T.H. King)
sentence_form(Debt burdens are heavier.)
structure(
   mood(be~0, indicative)
   stmt_type(be~0, declarative)
   subj(be~0, burden~2)
   tense(be~0, pres)
   vtype(be~0, copular)
   xcomp(be~0, heavy~1)
   adeg_dim(heavy~1, positive)
   adegree(heavy~1, comparative)
   adjunct(heavy~1, null~3)
   atype(heavy~1, predicative)
   subj(heavy~1, burden~2)
   mod(burden~2, debt~6)
   num(burden~2, pl)
   pers(burden~2, 3)
   adjunct_type(null~3, degree)
   num(debt~6, sg)
   pers(debt~6, 3))
)

In the dependency bank, names of companies and people are not treated as noun-noun compounds even though many of them fit this pattern. Instead, they are treated as single names. For example, when referring to the company, Merrill Lynch is treated as a single name, not as Merrill being a mod of Lynch.


topic_rel and pron_rel

topic_rel occurs with relative clauses and hence is quite common. In relative clauses, the topic_rel is the fronted constituent. The topic_rel always plays an additional role in the clause. The pron_rel is the relative pronoun itself. The pron_rel is either identical to the topic_rel (e.g., in the book which I read) or is an element within it (e.g., in the book whose cover is torn). In the example below, pro~29, which corresponds to that in the string, is the topic_rel and the pron_rel of accumulate~25. pro~29 is also the subj of accumulate~25.
sentence(
  id(wsj_2383.7, parc_23.241)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(It was previously thought ASKO held a 13.6% stake that was accumulated since July.)
structure(
   adjunct(think~0, previously~6)
   comp(think~0, hold~10)
   mood(think~0, indicative)
   passive(think~0, +)
   stmt_type(think~0, declarative)
   subj(think~0, pro~3)
   tense(think~0, past)
   vtype(think~0, main)
   case(pro~3, nom)
   gend_sem(pro~3, nonhuman)
   num(pro~3, sg)
   pers(pro~3, 3)
   pron_form(pro~3, it)
   pron_type(pro~3, pers)
   adegree(previously~6, positive)
   adv_type(previously~6, sadv)
   mood(hold~10, indicative)
   obj(hold~10, stake~14)
   stmt_type(hold~10, declarative)
   subj(hold~10, ASKO~13)
   subord_form(hold~10, null)
   tense(hold~10, past)
   vtype(hold~10, main)
   num(ASKO~13, sg)
   pers(ASKO~13, 3)
   proper(ASKO~13, misc)
   adjunct(stake~14, accumulate~25)
   adjunct(stake~14, percent~24)
   det_form(stake~14, a)
   det_type(stake~14, indef)
   num(stake~14, sg)
   pers(stake~14, 3)
   adv_type(since~17, vpadv)
   obj(since~17, July~18)
   ptype(since~17, semantic)
   num(July~18, sg)
   pers(July~18, 3)
   proper(July~18, date)
   adjunct_type(percent~24, nominal)
   num(percent~24, sg)
   number(percent~24, 13.6~28)
   pers(percent~24, 3)
   adjunct(accumulate~25, since~17)
   adjunct_type(accumulate~25, relative)
   mood(accumulate~25, indicative)
   passive(accumulate~25, +)
   pron_rel(accumulate~25, pro~29)
   stmt_type(accumulate~25, declarative)
   subj(accumulate~25, pro~29)
   tense(accumulate~25, past)
   topic_rel(accumulate~25, pro~29)
   vtype(accumulate~25, main)
   number_type(13.6~28, cardinal)
   case(pro~29, nom)
   num(pro~29, sg)
   pers(pro~29, 3)
   pron_form(pro~29, that)
   pron_type(pro~29, relative))
)


focus_int and pron_int

focus_int is the fronted constituent in a wh-question. pron_int is the interrogative pronoun itself. pron_int is either identical to focus_int (e.g., in Who saw him?) or is an element in it (e.g., in Which book did you see?). As with topic_rel in relative clauses, the focus_int will have an additional role in the clause. As there are relatively few questions in the corpus, there are relatively few focus_int. In the example below, pro~2 is the focus_int and the pron_int of the main predicate be~0; note that pro~2 is also the subject of be~0.
sentence(
  id(wsj_2369.1, parc_23.116)
  date(2002.6.12)
  validators(M. Dalrymple, T.H. King)
sentence_form(Of all the ethnic tensions in America\, which is the most troublesome right now?)
structure(
   adjunct(be~0, now~8)
   adjunct(be~0, of~5)
   focus_int(be~0, pro~2)
   mood(be~0, indicative)
   pron_int(be~0, pro~2)
   stmt_type(be~0, interrogative)
   subj(be~0, pro~2)
   tense(be~0, pres)
   vtype(be~0, copular)
   xcomp(be~0, troublesome~1)
   adeg_dim(troublesome~1, positive)
   adegree(troublesome~1, superlative)
   atype(troublesome~1, predicative)
   subj(troublesome~1, pro~2)
   case(pro~2, nom)
   num(pro~2, sg)
   pers(pro~2, 3)
   pron_form(pro~2, which)
   pron_type(pro~2, interrogative)
   obj(of~5, tension~9)
   ptype(of~5, semantic)
   adjunct_type(in~6, nominal)
   obj(in~6, America~15)
   ptype(in~6, semantic)
   adegree(right~7, positive)
   adv_type(right~7, advmod)
   adegree(now~8, positive)
   adjunct(now~8, right~7)
   adv_type(now~8, sadv)
   adjunct(tension~9, ethnic~14)
   adjunct(tension~9, in~6)
   det_form(tension~9, the)
   det_type(tension~9, def)
   num(tension~9, pl)
   pers(tension~9, 3)
   quant(tension~9, all~21)
   adegree(ethnic~14, positive)
   adjunct_type(ethnic~14, nominal)
   atype(ethnic~14, attributive)
   num(America~15, sg)
   pers(America~15, 3)
   proper(America~15, location))
)
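
The additional role played by a topic_rel or a focus_int can be looked up by finding the other features of the same clause that carry the same value. The Python sketch below does this for the relative clause and the question shown in this document; the function name additional_roles is hypothetical, and the sketch looks only inside the clause itself, not into embedded clauses.

```python
deps = [
    # relative clause: topic_rel and pron_rel
    ("pron_rel", "accumulate~25", "pro~29"),
    ("subj", "accumulate~25", "pro~29"),
    ("topic_rel", "accumulate~25", "pro~29"),
    # wh-question: focus_int and pron_int
    ("focus_int", "be~0", "pro~2"),
    ("pron_int", "be~0", "pro~2"),
    ("subj", "be~0", "pro~2"),
]

DISCOURSE_FEATURES = {"topic_rel", "pron_rel", "focus_int", "pron_int"}

def additional_roles(dependencies, clause, marker):
    """Return the grammatical functions that the fronted item also fills in `clause`."""
    fronted = {
        value for feature, head, value in dependencies
        if feature == marker and head == clause
    }
    return sorted(
        feature for feature, head, value in dependencies
        if head == clause and value in fronted and feature not in DISCOURSE_FEATURES
    )

relative_roles = additional_roles(deps, "accumulate~25", "topic_rel")
question_roles = additional_roles(deps, "be~0", "focus_int")
```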


poss

poss marks possessives in noun phrases. It may be a simple pronoun (e.g., in my book) or a complex noun phrase (e.g., in the boy in school's book). In the example below, the poss of philosophy~14 is Hooker~7.
sentence(
  id(wsj_2303.6, parc_23.6)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(Hooker's philosophy was to build and sell.)
structure(
   mood(be~0, indicative)
   stmt_type(be~0, declarative)
   subj(be~0, philosophy~14)
   tense(be~0, past)
   vtype(be~0, copular)
   xcomp(be~0, pro~16)
   inf_form(sell~4, to)
   subj(sell~4, pro~26)
   vtype(sell~4, main)
   num(Hooker~7, sg)
   pers(Hooker~7, 3)
   proper(Hooker~7, name)
   num(philosophy~14, sg)
   pers(philosophy~14, 3)
   poss(philosophy~14, Hooker~7)
   comp(pro~16, coord~19)
   subj(pro~16, philosophy~14)
   conj(coord~19, build~37)
   conj(coord~19, sell~4)
   coord_form(coord~19, and)
   coord_level(coord~19, VP)
   pron_type(pro~26, null)
   inf_form(build~37, to)
   subj(build~37, pro~26)
   vtype(build~37, main))
)


conj

conj is not a traditional grammatical function. However, the treatment of coordination within the dependency bank warrants detailed discussion. Coordinate constructions are always given a predicate of the form coord~#. coord~# fulfills the grammatical function in the clause that the coordinate structure had. In the example below, the subject is coordinated.

sentence: John and Mary left.
subj(leave~0, coord~1)

The conjuncts within the coordination are conj of the coord~# predicate. Our previous example would thus expand to:

sentence: John and Mary left.
subj(leave~0, coord~1)
conj(coord~1, John~2)
conj(coord~1, Mary~3)

coord~# always has a value for coord_form, which indicates the form of the conjunction, and for coord_level, which indicates what type of constituents are coordinated. Our previous example would thus further expand to:

sentence: John and Mary left.
subj(leave~0, coord~1)
conj(coord~1, John~2)
conj(coord~1, Mary~3)
coord_form(coord~1, and)
coord_level(coord~1, NP)

In the example below, there are two coordinations, one of the subject and one of the noun-noun compound mod.

sentence(
  id(wsj_2339.5, parc_23.449)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(Both Merieux and Connaught are biotechnology research and vaccine manufacturing concerns.)
structure(
   mood(be~0, indicative)
   stmt_type(be~0, declarative)
   subj(be~0, coord~2)
   tense(be~0, pres)
   vtype(be~0, copular)
   xcomp(be~0, concern~1)
   mod(concern~1, coord~23)
   num(concern~1, pl)
   pers(concern~1, 3)
   subj(concern~1, coord~2)
   conj(coord~2, Connaught~5)
   conj(coord~2, Merieux~4)
   coord_form(coord~2, and)
   coord_level(coord~2, NP)
   num(coord~2, pl)
   pers(coord~2, 3)
   precoord_form(coord~2, both)
   num(Merieux~4, sg)
   pers(Merieux~4, 3)
   proper(Merieux~4, misc)
   num(Connaught~5, sg)
   pers(Connaught~5, 3)
   proper(Connaught~5, misc)
   adjunct(manufacturing~12, vaccine~18)
   gerund(manufacturing~12, +)
   num(manufacturing~12, sg)
   pers(manufacturing~12, 3)
   adegree(vaccine~18, positive)
   adjunct_type(vaccine~18, nominal)
   atype(vaccine~18, attributive)
   num(biotechnology~22, sg)
   pers(biotechnology~22, 3)
   conj(coord~23, manufacturing~12)
   conj(coord~23, research~26)
   coord_form(coord~23, and)
   coord_level(coord~23, NN)
   num(coord~23, pl)
   pers(coord~23, 3)
   mod(research~26, biotechnology~22)
   num(research~26, sg)
   pers(research~26, 3))
)
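
Some applications want each conjunct to bear the grammatical function directly rather than through coord~#. The Python sketch below distributes a function over the conjuncts for the John and Mary left example above. The function name expand_coordination is an assumption of this sketch, and nested coordinations are not handled.

```python
deps = [
    ("subj", "leave~0", "coord~1"),
    ("conj", "coord~1", "John~2"),
    ("conj", "coord~1", "Mary~3"),
    ("coord_form", "coord~1", "and"),
    ("coord_level", "coord~1", "NP"),
]

def expand_coordination(dependencies):
    """Replace each dependency on a coord~# item by one dependency per conjunct."""
    conjuncts = {}
    for feature, head, value in dependencies:
        if feature == "conj":
            conjuncts.setdefault(head, []).append(value)
    expanded = []
    for feature, head, value in dependencies:
        if feature != "conj" and value in conjuncts:
            # Distribute the function over the members of the coordination.
            expanded.extend((feature, head, member) for member in conjuncts[value])
        else:
            expanded.append((feature, head, value))
    return expanded

expanded = expand_coordination(deps)
```

The conj, coord_form and coord_level dependencies are kept, so the original coordination can still be recovered from the expanded list.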


number

Number modifiers of noun phrases are given the function number. In the example below, two~12 is the number of year~8. Numbers have a number_type, which can be either cardinal or ordinal.

sentence(
  id(wsj_2376.19, parc_23.157)
  date(2002.6.12)
  validators(M. Dalrymple, T.H. King)
sentence_form(Now\, as in those two years\, her stock market indicators are positive.)
structure(
   adjunct(be~0, as~6)
   adjunct(be~0, now~5)
   mood(be~0, indicative)
   stmt_type(be~0, declarative)
   subj(be~0, indicator~2)
   tense(be~0, pres)
   vtype(be~0, copular)
   xcomp(be~0, positive~1)
   adegree(positive~1, positive)
   atype(positive~1, predicative)
   subj(positive~1, indicator~2)
   mod(indicator~2, market~19)
   num(indicator~2, pl)
   pers(indicator~2, 3)
   poss(indicator~2, pro~24)
   adegree(now~5, positive)
   adv_type(now~5, initadv)
   adv_type(as~6, sadv)
   obj(as~6, in~7)
   ptype(as~6, semantic)
   obj(in~7, year~8)
   ptype(in~7, semantic)
   deixis(year~8, distal)
   det_form(year~8, that)
   det_type(year~8, demon)
   num(year~8, pl)
   number(year~8, two~12)
   pers(year~8, 3)
   number_type(two~12, cardinal)
   num(stock~17, sg)
   pers(stock~17, 3)
   mod(market~19, stock~17)
   num(market~19, sg)
   pers(market~19, 3)
   gend_sem(pro~24, female)
   num(pro~24, sg)
   pers(pro~24, 3)
   pron_form(pro~24, she)
   pron_type(pro~24, poss))
)


quant

These are quantifiers that modify nouns. They can appear after the determiner, if there is one (e.g., in the many boxes). They can also appear before the determiner, if there is one, or in place of it (e.g., in all the boxes). In the example below, no~8 is the quant of buyer~1.

sentence(
  id(wsj_2300.75, parc_23.450)
  date(2002.6.12)
  validators(T.H. King, J.-P. Marcotte)
sentence_form(But there were no buyers.)
structure(
   adjunct(be~0, but~5)
   mood(be~0, indicative)
   stmt_type(be~0, declarative)
   subj(be~0, there~2)
   tense(be~0, past)
   vtype(be~0, copular)
   xcomp(be~0, buyer~1)
   num(buyer~1, pl)
   pers(buyer~1, 3)
   quant(buyer~1, no~8)
   subj(buyer~1, there~2)
   case(there~2, nom)
   gend_sem(there~2, nonhuman)
   num(there~2, pl)
   pers(there~2, 3)
   pron_type(there~2, expletive)
   adegree(but~5, positive)
   adv_type(but~5, initadv)
   polarity(no~8, -))
)

In the example below, many~22 is the quant of manager~1.

sentence(
  id(wsj_2306.29, parc_23.133)
  date(2002.6.12)
  validators(M. Dalrymple, T.H. King)
sentence_form(Many fund managers argue that now's the time to buy.)
structure(
   comp(argue~0, be~2)
   mood(argue~0, indicative)
   stmt_type(argue~0, declarative)
   subj(argue~0, manager~1)
   tense(argue~0, pres)
   vtype(argue~0, main)
   quant(manager~1, many~22)
   mod(manager~1, fund~19)
   num(manager~1, pl)
   pers(manager~1, 3)
   mood(be~2, indicative)
   stmt_type(be~2, declarative)
   subj(be~2, now~12)
   subord_form(be~2, that)
   tense(be~2, pres)
   vtype(be~2, copular)
   xcomp(be~2, time~7)
   inf_form(buy~5, to)
   subj(buy~5, pro~6)
   vtype(buy~5, main)
   pron_type(pro~6, null)
   det_form(time~7, the)
   det_type(time~7, def)
   num(time~7, sg)
   pers(time~7, 3)
   subj(time~7, now~12)
   xcomp(time~7, buy~5)
   num(now~12, sg)
   pers(now~12, 3)
   num(fund~19, sg)
   pers(fund~19, 3))
)


Other Features

This section describes features found in the dependency bank which are not grammatical functions. Their use is briefly described, along with their possible values. Note that not all values are necessarily found in the dependency bank; these values are the ones permitted by the grammar.

Examples in this section show only partial structures in order to focus on the feature in question. More detailed examples can be found by searching the dependency bank for the feature or value in question.
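
One way such a search might be carried out is sketched below. This is a hypothetical helper, not a tool shipped with the dependency bank: the names parse_line and find_dependencies are assumptions, and the input is assumed to be the lines found inside structure( ... ), where the last dependency also carries the parenthesis that closes the structure.

```python
import re

# Assumed shape of one dependency line: feature(head, value).
DEP_RE = re.compile(r"^(\w+)\((.+?), (.+)\)$")

def parse_line(line):
    """Parse one line from inside structure( ... ); return a triple or None."""
    line = line.strip()
    # The last dependency also carries the parenthesis that closes structure(.
    if line.endswith("))"):
        line = line[:-1]
    match = DEP_RE.match(line)
    return match.groups() if match else None

def find_dependencies(lines, feature=None, value=None):
    """Return the triples that match the given feature name and/or value."""
    hits = []
    for line in lines:
        triple = parse_line(line)
        if triple is None:
            continue
        if feature is not None and triple[0] != feature:
            continue
        if value is not None and triple[2] != value:
            continue
        hits.append(triple)
    return hits

# Lines taken from the "But there were no buyers." structure.
LINES = [
    "   num(buyer~1, pl)",
    "   pers(buyer~1, 3)",
    "   quant(buyer~1, no~8)",
    "   polarity(no~8, -))",
]

print(find_dependencies(LINES, feature="quant"))
print(find_dependencies(LINES, value="-"))
```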

Feature: adegree

Possible Values: comparative positive superlative

adegree provides the degree of adjectives: comparative for comparatives, superlative for superlatives, and positive for all other adjectives.

sentence: That is the reddest apple.
adegree(red~1, superlative)
sentence: That is a red apple.
adegree(red~1, positive)

Feature: adeg_dim

Possible Values: positive negative equative

adeg_dim (adegree dimension) indicates whether the adjective degree is positive (more red), negative (less red), or equative (as red). Regular adjectives with adegree positive (red) do not have adeg_dim.

sentence: A less costly solution was found.
adegree(costly~1, comparative)
adeg_dim(costly~1, negative)

Feature: adjunct_type

Possible Values: cleft conditional degree manner negative nominal parenthetical purpose quote-paren relative temporal

Many, but not all, adjuncts are marked with an adjunct_type. Simple adverbials are given an adv_type instead.

sentence: The dog, a poodle, appeared.
adjunct_type(poodle~1, parenthetical)
sentence: The red fox appeared.
adjunct_type(red~1, nominal)

Of particular interest in this corpus is adjunct_type quote-paren, which is used for verbs of saying in direct speech.

sentence: ``Mr. Jacobs resigned yesterday,'' said the spokesman.
adjunct_type(say~1, quote-paren)

Feature: adv_type

Possible Values: advmod affix amod amod-int delimiter focus initadv npadv nummod pmod sadv timeadv vpadv

adv_type is used for adverbs in a relatively narrow sense (compared to adjuncts in general). Often the type of adverb can be deduced from the word it modifies, in which case the adv_type feature may be viewed as redundant. However, the difference between sadv (sentential adverbs) and vpadv (VP adverbs) may be difficult to recover.

sentence: They ran quickly.
adv_type(quickly~1, vpadv)
sentence: A very small box arrived.
adv_type(very~1, amod)
sentence: Only boxes arrived.
adv_type(only~1, focus)

Feature: atype

Possible Values: attributive predicative

All adjectives have an atype, indicating whether they are attributive (e.g., modifying nouns) or predicative (e.g., second argument of copular be).

sentence: I found the red box.
atype(red~1, attributive)
sentence: The box is red.
atype(red~1, predicative)

Feature: case

Possible Values: acc gen nom

In the dependency bank structures, case is only used with pronouns. No case marking is indicated for nouns. In general, the case marking can be derived from the grammatical function of the (pro)noun. Genitive case marking (gen) is rare in the dependency bank and generally occurs with the relative pronoun whose.

sentence: I arrived.
case(pro~1, nom)
sentence: John saw me.
case(pro~1, acc)

Feature: coord_form

Possible Values: and as_well_as but nor or plus v. , ; :

Conjunctions provide a coord_form to indicate which conjunction was used. When punctuation is used as a conjunction, it provides a value as well. All coord should have a coord_form.

sentence: John and Mary left.
coord_form(coord~1, and)
sentence: John came; Mary left.
coord_form(coord~0, ;)

Feature: coord_level

Possible Values: declared without constraints

Coordinate structures provide a coord_level value to indicate what types of constituents were coordinated. This information is recorded because it is useful in various applications, such as semantics. All coord should have a coord_level.

sentence: John and Mary left.
coord_level(coord~1, NP)
sentence: John came; Mary left.
coord_level(coord~0, ROOT)
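
Since every coord should carry both a coord_form and a coord_level, a structure can be checked for coordinations that lack either one. The sketch below is illustrative only: the regular expression, the REQUIRED tuple, and the function name coords_missing_features are assumptions, and coord nodes are assumed to be recognizable by the prefix coord~.

```python
import re

# Assumed line shape: feature(head, value).
DEP_RE = re.compile(r"^\s*(\w+)\((.+?), (.+)\)\s*$")
REQUIRED = ("coord_form", "coord_level")

def coords_missing_features(lines):
    """Map each coord~# node to the required features it lacks."""
    features_seen = {}
    for line in lines:
        match = DEP_RE.match(line)
        if not match:
            continue
        feature, head, value = match.groups()
        # Register coord nodes whether they appear as head or as value.
        for node in (head, value):
            if node.startswith("coord~"):
                features_seen.setdefault(node, set())
        if head.startswith("coord~"):
            features_seen[head].add(feature)
    missing = {}
    for node, seen in features_seen.items():
        lacking = [name for name in REQUIRED if name not in seen]
        if lacking:
            missing[node] = lacking
    return missing

COMPLETE = [
    "subj(leave~0, coord~1)",
    "conj(coord~1, John~2)",
    "conj(coord~1, Mary~3)",
    "coord_form(coord~1, and)",
    "coord_level(coord~1, NP)",
]

print(coords_missing_features(COMPLETE))       # nothing missing
print(coords_missing_features(COMPLETE[:-1]))  # coord_level line dropped
```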

Feature: deixis

Possible Values: distal proximal

Demonstratives have a feature deixis to distinguish distal (far) from proximal (close) demonstratives. Note that since the demonstratives themselves do not have indexed predicates, the value will show up on the head noun.

sentence: This box is open.
det_type(box~1, demon)
deixis(box~1, proximal)

Feature: det_form

Possible Values: a another that the this

det_form is the form of the determiner, where determiners include demonstratives. Note that an is realized as a in the dependency bank as a result of stemming. All determiners have both a det_form and a det_type.

sentence: An apple was eaten.
det_form(apple~1, a)

Feature: det_type

Possible Values: def demon indef

det_type classifies the determiners into def(inite), indef(inite), and demon(strative). All determiners have both a det_form and a det_type.

sentence: An apple was eaten.
det_type(apple~1, indef)

Feature: emphasis

Possible Values: +

emphasis is a feature provided by emphatic do. There is no other reflex of emphatic do in the dependency bank representation.

sentence: They did leave.
emphasis(leave~0, +)

Feature: gend_sem

Possible Values: female male nonhuman

gend_sem records semantic gender. It is used primarily for pronouns. There is no marking of grammatical gender in English and hence none in the dependency bank.

sentence: She appeared.
gend_sem(pro~1, female)

Feature: gerund

Possible Values: +

Gerunds are marked with gerund +. gerund + may mark certain nouns which have been lexicalized and hence may less clearly be gerunds (e.g., marketing). These gerunds are not stemmed to the verb from which they are derived.

sentence: Publishing is not doing well.
gerund(publishing~1, +)

Feature: inf_form

Possible Values: to

inf_form marks infinitives with to; bare infinitives (We watched them work) are not marked in the dependency bank. Note that inf_form is the only indication that to appeared in the original structure; it bears no other features in the dependency bank.

sentence: I want to leave.
inf_form(leave~2, to)

Feature: mood

Possible Values: imperative indicative subjunctive

mood marks the mood of a clause. There are relatively few subjunctives in the dependency bank. All imperatives have mood imperative marking.

sentence: They push it.
mood(push~0, indicative)
sentence: Push it.
mood(push~0, imperative)

Feature: num

Possible Values: pl sg

num is the number of nouns. Singular nouns get sg and plural nouns get pl. Note that the non-head nouns (mod) in noun-noun compounds are always treated as being singular even if the morphology is plural; since these are being treated as singulars, they are not stemmed (if you find errors with this, please contact us).

sentence: Cats appeared.
num(cat~1, pl)
sentence: Securities fraud is rampant.
mod(fraud~1, securities~2)
num(securities~2, sg)

Feature: number_type

Possible Values: cardinal ordinal

Numbers are given a number_type of either cardinal (3) or ordinal (3rd); numbers are treated identically whether they are written as digits (3) or spelled out (three). All numbers should have a number_type if they modify a noun.

sentence: Six boys appeared.
number_type(six~1, cardinal)

Feature: partitive

Possible Values: +

Partitives are marked with partitive +. The marking appears on the quantifier or number that triggered the partitive construction.

sentence: Six of the boxes broke.
partitive(six~1, +)

Feature: passive

Possible Values: +

Passive verbs are marked by passive +. Active verbs are not marked, i.e., there is no passive - feature in the dependency bank. Passive participles that are not the main predicate of the clause are also marked with passive +.

sentence: It was broken.
passive(break~0, +)
sentence: Other items included in it are ...
passive(include~1, +)
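
Because active verbs carry no passive feature at all, a consumer of the dependency bank has to treat a missing feature as the active case. A minimal sketch, assuming the dependencies have already been read into (feature, head, value) triples; the function name is_passive is hypothetical.

```python
def is_passive(triples, verb):
    """True if the verb carries passive +; absence of the feature means active."""
    return ("passive", verb, "+") in triples

# "It was broken."
broken = [("passive", "break~0", "+")]
# "They left."
left = [("vtype", "leave~0", "main")]

print(is_passive(broken, "break~0"))
print(is_passive(left, "leave~0"))
```

The same pattern applies to the other features whose only value is +, such as perf and prog.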

Feature: pcase

Possible Values: declared without constraints

pcase is used to mark the presence of a nonsemantic preposition. Semantic prepositions show up as regular predicates in the dependency bank and take an obj. The main nonsemantic preposition in the dependency bank is by, which is used to mark the agents of passive verbs (obl_ag). Certain verbs also mark their object with nonsemantic prepositions, although these are relatively rare. If you find errors with this, please contact us.

sentence: It was broken by John.
pcase(John~1, by)
sentence: I rely on that book.
pcase(book~1, on)

Feature: perf

Possible Values: +

perf indicates the presence of the perfective auxiliary have. Note that no attempt has been made to determine the actual semantics of the clause; the feature is a reflex of the syntactic form. The absence of the auxiliary is not indicated by perf - in the dependency bank.

sentence: They have appeared.
perf(appear~0, +)

Feature: pers

Possible Values: 1 2 3

pers indicates the person of nouns and pronouns. Nouns are always pers 3. Pronouns receive the appropriate value.

sentence: We opened the box.
pers(pro~1, 1)
pers(box~2, 3)

Feature: polarity

Possible Values: -

polarity indicates certain negative polarity items that can trigger inversion. In particular, it marks no and never, even when they do not trigger inversion.

sentence: No girls appeared.
polarity(no~1, -)

Feature: precoord_form

Possible Values: both either neither

precoord_form marks the presence and form of precoordination material. precoord_form will always modify coord.

sentence: Either John or Mary appeared.
precoord_form(coord~1, either)

Feature: prog

Possible Values: +

prog indicates the presence of the progressive auxiliary be. Note that no attempt has been made to determine the actual semantics of the clause; the feature is a reflex of the syntactic form. The absence of the auxiliary is not indicated by prog - in the dependency bank.

sentence: They are appearing.
prog(appear~0, +)

Feature: pron_form

Possible Values: another anyone anybody anything anywhere each_other everybody everything everyone everywhere he here hers his how how_come how_many how_much however I it mine most my nobody no_one nothing nowhere null ours she somebody someone something sometime somewhere that theirs there these they this those we what what_if whatever whatsoever when whenever where wherever which whichever who whom whoever whose whosever whosoever why you yours

All pronouns have a predicate value of pro (plus the dependency bank index ~#). The form of the pronoun is recorded in the pron_form. Note that by examining all of the features associated with the pronoun, it should be possible to deduce the pronoun form; however, the pron_form value is provided for those who find it useful to have it stated directly.

sentence: They appeared.
pron_form(pro~1, they)

Feature: pron_type

Possible Values: demon expletive free interrogative locative null pers quant poss refl relative

All pronouns are assigned a pron_type to indicate what class of pronouns they belong to (demon = demonstrative; expletive = expletive; free = free; interrogative = interrogative; locative = locative; null = no overt realization; pers = personal; quant = quantifier; poss = possessive; refl = reflexive; relative = relative). The only unusual pron_type is null, which indicates that there was no surface realization of the pronoun; instead, the pronoun is provided to satisfy the subcategorization requirements of the verb.

sentence: She left.
pron_type(pro~1, pers)
sentence: Prices increased, following Black Monday.
subj(follow~1, pro~2)
pron_type(pro~2, null)

Feature: proper

Possible Values: date location name title misc

proper is assigned to proper nouns in order to classify them. Locations are assigned proper location. Personal names are assigned proper name. Titles (in personal names) are assigned proper title. Everything else is assigned proper misc. In the dependency bank, the bulk of proper misc nouns are company names.

sentence: Shidler Investment Corp. refused to comment.
proper(Shidler Investment Corp.~1, misc)

Feature: prt_form

Possible Values: around back down in off on out over up

Particle verbs provide a prt_form value to indicate the particle. There is no other reflex of the particle in the dependency bank structures. The position of the particle in the sentence (e.g., before or after the object) is not indicated in the dependency bank other than in the copy of the string.

sentence: He threw it out.
prt_form(throw~0, out)

Feature: ptype

Possible Values: nonsemantic semantic

Prepositions are classified as semantic, in which case they take an obj, and nonsemantic, in which case they provide a pcase value. Most prepositions are semantic. The most common nonsemantic preposition is by used to mark agents of passive verbs (obl_ag).

sentence: It is on the table.
ptype(on~1, semantic)
sentence: It was broken by John.
pcase(John~1, by)

Feature: quant_type

Possible Values: comparative

The only quantifiers that are marked with quant_type are quantifiers which mark comparatives. Other quantifiers do not receive a value for quant_type; they can be recognized by having, for example, a pron_type quant.

sentence: More investors left.
quant_type(more~1, comparative)

Feature: stmt_type

Possible Values: declarative header imperative interrogative purpose

Clauses are marked for stmt_type (statement type). The values should be self-explanatory. header is used for items which are not sentences, but instead are headers (usually NPs, but sometimes other categories).

sentence: Who came?
stmt_type(come~0, interrogative)
sentence: John came.
stmt_type(come~0, declarative)

Feature: subord_form

Possible Values: for if null that whether

subord_form records the form of subordinating complementizer used. This is the only indication of the choice of complementizer, although the choice of if/whether versus that/null can be detected by looking at the stmt_type (interrogative versus declarative). The value null always corresponds to a "dropped" that.

sentence: I know that he came.
subord_form(come~1, that)
sentence: I know he came.
subord_form(come~1, null)
sentence: It is necessary for him to come.
subord_form(come~1, for)

Feature: tense

Possible Values: fut past pres

The tense of a clause is marked by tense (fut=future; past=past; pres=present). Complex tenses are not indicated overtly; these can be derived by looking at the tense feature in combination with the perf and prog features.

sentence: I appeared.
tense(appear~0, past)
sentence: He will have appeared.
tense(appear~0, fut)
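
The derivation of complex tenses can be sketched as follows. This is an illustration under stated assumptions: the features of one verb are assumed to be collected into a dictionary, the function name describe_tense is hypothetical, and the English labels it produces are conventional names chosen here, not values found in the dependency bank. It also assumes that He will have appeared carries perf(appear~0, +), as the description of perf implies.

```python
# Conventional English labels; these are assumptions of this sketch,
# not values used in the dependency bank.
TENSE_NAMES = {"fut": "future", "past": "past", "pres": "present"}

def describe_tense(features):
    """Combine tense with perf and prog into a descriptive label.

    features is a dict of one verb's features, e.g. {"tense": "fut", "perf": "+"}.
    """
    parts = [TENSE_NAMES[features["tense"]]]
    if features.get("perf") == "+":
        parts.append("perfect")
    if features.get("prog") == "+":
        parts.append("progressive")
    return " ".join(parts)

print(describe_tense({"tense": "past"}))                # I appeared.
print(describe_tense({"tense": "fut", "perf": "+"}))    # He will have appeared.
print(describe_tense({"tense": "pres", "prog": "+"}))   # They are appearing.
```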

Feature: vconstr

Possible Values: cleft

vconstr is used to encode certain verbal constructions, namely clefts (It is the box that I want to open.). With clefts, the vconstr feature will always modify a be predicate. Clefts are relatively rare in the dependency bank.

sentence: It is Mary that I saw.
vconstr(be~0, cleft)

Feature: vtype

Possible Values: copular main modal

vtype is the type of the verb. main is an ordinary verb, e.g., say, make, call. copular indicates a copular (linking) verb, which is almost always be. modal is a modal verb, e.g., might, should. All verbs have a vtype, not just the matrix, tensed verb.

  
sentence: They left.
vtype(leave~0, main)
sentence: It is green.
vtype(be~0, copular)
sentence: They should leave.
vtype(should~0, modal)
vtype(leave~1, main)

Appendices

NOTE: The appendices are simply a listing of the grammatical functions and features+values that were discussed above. They contain no new information, but rather serve as a summary.

Function Names

Features with Values

Below is a list of features and their possible values. Note that _ is the escape character for space. This is needed because some values are multiword expressions and hence contain spaces. Note that the _ does not appear in the dependency bank; instead, a space will appear within the value.

It is likely that not all values are found in the dependency bank. They are listed here because they may appear in extensions of the dependency bank.
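
The escape convention for listed values can be undone with a one-line conversion. The helper below is a hypothetical sketch (the name listed_to_bank_value is not part of any dependency bank tool), and it is meant for values only: feature names such as coord_form contain a genuine underscore and should not be passed through it.

```python
def listed_to_bank_value(listed):
    """Convert a value as written in the list (with _) to its form in the bank (with spaces)."""
    return listed.replace("_", " ")

print(listed_to_bank_value("as_well_as"))
print(listed_to_bank_value("each_other"))
print(listed_to_bank_value("and"))
```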

Contact Information

Please send queries, comments, and suggestions to Tracy Holloway King (thking "at" parc.com).

Last modified:
Created by: Tracy Holloway King
Maintained by: Tracy Holloway King