Ling187: Grammar Engineering

Homework Assignment for Week 2

Due: Wednesday, April 19 (by midnight)
Submit assignments electronically to both professors (kaplan "at" parc.com and thking "at" parc.com)


Turn in:
  1. the final grammar you end up with (eng-week2.lfg); please name it lastname-eng-week2.lfg
  2. a paragraph discussing the monkey/sheep example
  3. your revised testsuite with parse statistics (eng-week2-test.lfg.new); please name it lastname-eng-week2-test.lfg.new
  4. a rough estimate of how long this took, so that we can adjust future assignments as needed

Exercises on:
  PART 1: rule annotation, subcategorization and constraining equations
  PART 2: templates
  PART 3: testsuites

Start from the grammar eng-week2.lfg

Do not use punctuation or capital letters; in later grammars we will add these in.

If you put a file called xlerc in the same directory as your grammar, and that file contains the line:

  create-parser eng-week2.lfg

then whenever you start xle in that directory, it will automatically load eng-week2.lfg. This will save a lot of time when making and testing changes.


PART 1: Rule annotation, Subcategorization and Constraining equations

EXERCISE 1 -- Missing Entries

If a sentence fails to parse because a word is not listed in the lexicon, XLE prints a message to this effect in the XLE window and also opens the morphology window:

    
% parse "a monkey in the garden devours a banana"
parsing {a monkey in the garden devours a banana}  

 Chart unconnected because of unknown words
 Word possibly causing problem: garden 
0 solutions, 0.01 CPU seconds, 0 subtrees
0
%
Add the missing entry.

Try the sentences:

    a monkey sees the banana
    a girl is devouring the orange

Add the missing entries with the relevant PRED and TENSE features.
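
As a rough illustration of the entry format, here is a sketch of two full-form lexical entries. The words tree and laughs are hypothetical (they are not part of the assignment), and the NUM feature, the TENSE value pres, and the exact layout are assumptions. Follow the conventions of the existing entries in eng-week2.lfg rather than these.

```
"hypothetical noun entry; feature names are assumptions"
tree    N * (^ PRED)='tree'
            (^ NUM)=sg.

"hypothetical verb entry; the value pres is an assumed name"
laughs  V * (^ PRED)='laugh<(^ SUBJ)>'
            (^ TENSE)=pres
            (^ SUBJ NUM)=c sg.
```

Lexicon changes, unlike rule changes, do not appear to require a restart according to the warning XLE prints for non-lexicon sections, but check this against your own setup.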

EXERCISE 2 - Extending c-structure rules to cover more adjuncts

Extend the lexicon and the S and VP-rules to cover the temporal adverbs yesterday and today:

       the girl saw the monkey yesterday
       the girl saw the monkey yesterday in the park
       yesterday the girl saw the monkey

You will need a new c-structure category to do this. In the f-structure, their contribution should end up in the same set as that of local PPs modifying verbs (e.g., "in the park" in "the girl saw the monkey in the park").
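
For orientation, modifiers are usually added to a set with a set-membership annotation. The sketch below assumes a VP rule of roughly this shape; the actual rule in eng-week2.lfg may differ, so treat the categories and annotations as assumptions and not as the rule you will find in the file.

```
"assumed shape of the VP rule, for illustration only"
VP --> V: ^=!;                   "head of the VP"
       (NP: (^ OBJ)=!)           "optional object"
       PP*: ! $ (^ ADJUNCT).     "each PP becomes a member of the ADJUNCT set"
```

Because ADJUNCT is set-valued, any number of modifiers can contribute to it without their features clashing.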

When you change a rule, you must restart XLE for the change to take effect. XLE should warn you if you forget to restart:

Warning: File /tilde/thking/eng-week2.lfg containing non-lexicon
section has been changed.  
Non-lexicon changes will not take effect until exit from XLE and restart.

EXERCISE 3 - New verbs with different subcategorization

3.1 Add a treatment of ditransitive verbs to the grammar and the lexicon. Classically in LFG the second object is analyzed as OBJ2.

          the girl gave the monkey a banana
          the girl gives the monkey a banana
          the girls give the monkey a banana
          the girl is giving the monkey a banana

Make sure that the grammatical functions you are using are defined in the CONFIG section under GOVERNABLERELATIONS.

Your analysis should avoid adding a spurious ambiguity to ordinary transitive sentences:

     the monkey devoured the banana

should still have only one analysis.

3.2 Add verbs taking a prepositional object. The standard LFG analysis assumes a function OBL for these (OBL stands for "oblique").

    the monkey thought about a banana

How do you have to modify the f-annotation of the PPs in the verb rule? Again, don't forget to add the new grammatical function in the CONFIG section.
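
Once both new functions are in place, the relevant CONFIG line might look like the sketch below. The list shown is an assumption: keep whatever functions your CONFIG section already lists and add the new ones to them.

```
GOVERNABLERELATIONS SUBJ OBJ OBJ2 OBL.
```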

EXERCISE 4 - Constraining equations and underspecification

In the current version of the grammar, subject-verb number agreement is checked by a constraining equation like:

   (^ SUBJ NUM) =c sg

or

   (^ SUBJ NUM) =c pl

Can you explain the difference in the grammar's behaviour for the following two sentences?

       the monkey sleeps
       the sheep sleeps

What are the options for fixing the problem? Implement your preferred solution so that both sentences get only one parse.

Turn in: When you turn in your final grammar, include in the email a paragraph discussing the difference.


PART 2: Templates

If you look at the lexicon in your current grammar, you will see that a lot of material is repeated. This can lead to mistakes and makes it difficult to maintain grammars because if you change an analysis you have to make the change in many places. To capture regularities, XLE/LFG has a formal device called a template.

EXERCISE 1 - Lexicon templates

Use the existing templates as models to redo the lexicon using templates (see the entry for orange and the templates it calls). Your resulting lexicon should have no ^ in it, although some lexical entries may call more than one template. You can use the following template names; feel free to add additional templates to group these or to have these templates call other more basic ones:

To see how a lexical entry expands, on the xle command line try:

   print-lex-entry orange

When you change a template, you must restart XLE for the change to take effect. Templates are like grammar rules in this respect. XLE should warn you if you forget to restart.
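
As a sketch of what parameterized lexicon templates and a lexical entry calling them could look like: the template names INTRANS and TENSE, the value pres, and the word laughs below are hypothetical, chosen only for illustration. Use the names given in the assignment and in the existing templates of your grammar.

```
"in the TEMPLATES section (hypothetical templates)"
INTRANS(_P) = "PRED for a verb taking only a subject"
      (^ PRED)='_P<(^ SUBJ)>'.

TENSE(_T) = "assign a tense value"
      (^ TENSE)=_T.

"in the LEXICON section: an entry that calls two templates and contains no ^"
laughs V * @(INTRANS laugh) @(TENSE pres).
```

Running print-lex-entry on such an entry should show the template calls expanded into ordinary equations, which is a quick way to check that a template does what you intended.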

EXERCISE 2 - Grammar templates

Templates can be called from the grammar rules. Look at the templates:

   UP-OBJ = "annotation to assign object function"
          @(UP-GF OBJ).

   UP-GF(_GF) = "generic annotation to assign a grammatical function"
          (^ _GF)=!.

In the VP rule, replace:

   (^ OBJ)=!

with:

   @UP-OBJ

Restart the grammar and parse:

   the monkey devoured a banana

To see how your new rule expands, on the xle command line try:

   print-rule VP

Create similar templates and calls for SUBJ, OBJ2, and OBL.

Also create templates and calls for CASE and for the ! $ (^ ADJUNCT) annotation.

Turn in: Submit this final version of your grammar.


PART 3: Testsuites

As the grammar expands, it is very easy to make changes that affect sentences in ways you did not expect. To help detect this problem, you can create a testsuite.

Look at the basic testsuite eng-week2-test.lfg (the emacs library works best if you name your testsuite with a .lfg suffix). Comments are introduced by #. Each item to be parsed is on its own line, surrounded by blank lines. The default parse category is defined by ROOTCAT in the grammar; here it is S. To parse an item as another category, put the category name and a colon before the item (e.g., NP: a monkey).
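
A testfile fragment in this format might look like the following sketch. The items are taken from the exercises above; the comment wording and grouping are only illustrative.

```
# transitive sentences

the monkey devoured the banana

a monkey sees the banana

# items parsed as a category other than ROOTCAT

NP: a monkey
```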

In xle, try:

   parse-testfile eng-week2-test.lfg

This will produce several files:

Add sentences to the testfile that will cover all the basic grammar rules. For example:

Add some NPs to test out the NP rules. For example:

Parse your new testfile. Make sure that all the items parse and get the correct number of parses (usually 1, but there may be some legitimate ambiguities which will result in 2 or more parses).

Turn in: Submit the .new version of your new testsuite.


If you have any questions, you can send us email (kaplan "at" parc.com and thking "at" parc.com), call us (Ron: 812-4348; Tracy: 812-4808), or talk to us during office hours.