Due: Tuesday, January 20 (by midnight)
Submit assignments electronically to both professors
(rmkaplan@stanford.edu and thking@stanford.edu)
Turn in:
1. the final grammar you end up with (eng-week2.lfg); please name it lastname-eng-week2.lfg
2. a paragraph discussing the monkey/sheep example
3. your revised testsuite with parse statistics (eng-week2-test.lfg.new); please name it lastname-eng-week2-test.lfg.new
4. a rough estimate of how long this took so that we can adjust future assignments as needed
Exercises on:
PART 1: rule annotation, subcategorization and constraining equations
PART 2: templates
PART 3: testsuites
Start from the grammar eng-week2.lfg
Do not use punctuation or capital letters; in later grammars we will add these in.
If you put a file called xlerc in the directory with your grammar and in xlerc you put:
create-parser eng-week2.lfg
then whenever you start xle in that directory, it will automatically load eng-week2.lfg. This will save a lot of time when making and testing changes.
If the problem is that a word is not listed in the lexicon, a message to this effect will appear in the XLE window, in addition to the morphology window appearing:
% parse "a monkey in the garden devours a banana"
parsing {a monkey in the garden devours a banana}
Chart unconnected because of unknown words
Word possibly causing problem: garden
0 solutions, 0.01 CPU seconds, 0 subtrees
0
%
Add the missing entry.
Try the sentences:
a monkey sees the banana
a girl is devouring the orange
Add the missing entries with the relevant PRED and TENSE features.
Extend the lexicon and the S and VP-rules to cover the temporal adverbs yesterday and today:
the girl saw the monkey yesterday
the girl saw the monkey yesterday in the park
yesterday the girl saw the monkey
You will need a new c-structure category to do this. In the f-structure, the adverbs' contribution should end up in the same set as that of local PPs modifying verbs (e.g., "in the park" in "the girl saw the monkey in the park").
When you make a change to a rule, you must restart XLE for it to take effect. XLE should warn you if you forgot to restart:
Warning: File /tilde/thking/eng-week2.lfg containing non-lexicon section has been changed. Non-lexicon changes will not take effect until exit from XLE and restart.
3.1 Add a treatment of ditransitive verbs to the grammar and the lexicon. Classically in LFG the second object is analyzed as OBJ2.
the girl gave the monkey a banana
the girl gives the monkey a banana
the girls give the monkey a banana
the girl is giving the monkey a banana
Make sure that the grammatical functions you are using are defined in the CONFIG section.
Your analysis should avoid adding a spurious ambiguity to ordinary transitive sentences:
the monkey devoured the banana
should still have only one analysis.
3.2 Add verbs taking a prepositional object. The standard LFG analysis assumes a function OBL for these (OBL stands for "oblique").
the monkey thought about a banana
How do you have to modify the f-annotation of the PPs in the verb rule? Again, don't forget to add the new grammatical function in the CONFIG section.
In the current version of the grammar, subject-verb number agreement is checked by a constraining equation like:
(^ SUBJ NUM) =c sg
or
(^ SUBJ NUM) =c pl
Can you explain the difference in the grammar's behaviour for the following two sentences?
the monkey sleeps
the sheep sleeps
What are the options for fixing the problem? Implement your preferred solution so that both sentences get only one parse.
Turn in: When you turn in your final grammar, include in the email a paragraph discussing the difference.
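As general background on the two kinds of equation (not a description of this grammar's lexicon): a defining equation (=) adds a value to the f-structure, whereas a constraining equation (=c) only checks that some defining equation has supplied that value. The Python sketch below models this distinction on a flat attribute-value dictionary. The function solve and its tuple encoding of equations are hypothetical illustrations; real f-structures are nested, and XLE solves them quite differently.

```python
def solve(defining, constraining):
    """Build a flat f-structure from defining equations, then check constraints.

    Each equation is an (attribute, value) pair. Returns the f-structure as a
    dict, or None if the description cannot be satisfied.
    """
    fstructure = {}
    # Defining equations contribute values; two different values clash.
    for attribute, value in defining:
        if fstructure.setdefault(attribute, value) != value:
            return None
    # Constraining equations contribute nothing; they only inspect the result.
    for attribute, value in constraining:
        if fstructure.get(attribute) != value:
            return None  # value is absent or different
    return fstructure


# A defining equation supplies NUM, so the constraint is satisfied.
print(solve([("NUM", "sg")], [("NUM", "sg")]))
# Nothing defines NUM, so the same constraint fails.
print(solve([], [("NUM", "sg")]))
```

In this toy model, a constraint can never rescue an attribute that no defining equation mentions, which is the property to keep in mind when comparing the two sentences.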
If you look at the lexicon in your current grammar, you will see that a lot of material is repeated. This can lead to mistakes and makes it difficult to maintain grammars because if you change an analysis you have to make the change in many places. To capture regularities, XLE/LFG has a formal device called a template.
Use the existing templates as models to redo the lexicon using templates (see the entry for orange and the templates it calls). Your resulting lexicon should have no ^ in it, although some lexical entries may call more than one template. You can use the following template names; feel free to add additional templates to group these or to have these templates call other more basic ones:
To see how a lexical entry expands, on the xle command line try:
print-lex-entry orange
When you make a change to the templates, you must restart XLE for it to take effect. Templates are like grammar rules in this respect. XLE should warn you if you forgot to restart.
Templates can be called from the grammar rules. Look at the templates:
UP-OBJ = "annotation to assign object function" @(UP-GF OBJ).
UP-GF(_GF) = "generic annotation to assign a grammatical function" (^ _GF)=!.
In the VP rule, replace:
(^ OBJ)=!
with:
@UP-OBJ
Restart the grammar and parse:
the monkey devoured a banana
To see how your new rule expands, on the xle command line try:
print-rule VP
Create similar templates and calls for SUBJ, OBJ2, and OBL.
Also create templates and calls for CASE and for the ! $ (^ ADJUNCT) annotation.
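The way a call such as @UP-OBJ unfolds into (^ OBJ)=! can be pictured as repeated textual substitution. The Python sketch below is a toy model of that idea, assuming only the two templates quoted above; the names TEMPLATES, CALL and expand are hypothetical, and this is not a description of how XLE implements templates internally.

```python
import re

# Toy template table: name -> (parameter names, body).
TEMPLATES = {
    "UP-GF": (["_GF"], "(^ _GF)=!"),
    "UP-OBJ": ([], "@(UP-GF OBJ)"),
}

# Matches either "@(NAME ARG ...)" or a bare "@NAME".
CALL = re.compile(r"@\(([^()\s]+)((?:\s+[^()\s]+)*)\)|@([^()\s]+)")


def expand(text):
    """Replace template calls by their bodies until no call remains."""

    def substitute(match):
        name = match.group(1) or match.group(3)
        args = (match.group(2) or "").split()
        params, body = TEMPLATES[name]
        if len(params) != len(args):
            raise ValueError(f"wrong number of arguments for {name}")
        # Naive textual replacement of each parameter by its argument.
        for param, arg in zip(params, args):
            body = body.replace(param, arg)
        return body

    previous = None
    while previous != text:
        previous = text
        text = CALL.sub(substitute, text)
    return text


print(expand("@UP-OBJ"))  # prints (^ OBJ)=!
```

Under this picture, a template for another grammatical function differs from UP-OBJ only in the argument it passes to UP-GF.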
Turn in: Submit this final version of your grammar.
As the grammar expands, it is very easy to make changes that affect sentences in ways you did not expect. To help detect this problem, you can create a testsuite.
Look at the basic testsuite eng-week2-test.lfg (the emacs library works best if you name your testsuite with a .lfg suffix). The # character introduces comments. Each item to be parsed is on its own line, surrounded by blank lines. The default parse category is defined by ROOTCAT in the grammar; here it is S. To parse an item as another category, put the category name and a colon before the item (e.g., NP: a monkey).
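To make the file layout concrete, here is a minimal Python sketch that reads text in the format just described and lists each item with its parse category. It assumes that comments occupy whole lines beginning with # and that a category prefix is a name followed by a colon and a space; the function read_testsuite is hypothetical and is not part of XLE.

```python
import re

# An item may start with a category prefix such as "NP: ".
ITEM_WITH_CATEGORY = re.compile(r"^([A-Za-z][\w-]*):\s+(.+)$")


def read_testsuite(text, default_category="S"):
    """Return a list of (category, item) pairs from testsuite text."""
    items = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank separator lines and comment lines.
        if not line or line.startswith("#"):
            continue
        match = ITEM_WITH_CATEGORY.match(line)
        if match:
            items.append((match.group(1), match.group(2)))
        else:
            items.append((default_category, line))
    return items


sample = """# basic sentences

a monkey sees the banana

NP: a monkey
"""

for category, item in read_testsuite(sample):
    print(category, "|", item)
```

Items without a prefix are assigned the default category, mirroring the role of ROOTCAT in the grammar.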
In xle, try:
parse-testfile eng-week2-test.lfg
This will produce several files:
Add sentences to the testfile that will cover all the basic grammar rules. For example:
Add some NPs to test out the NP rules. For example:
Parse your new testfile. Make sure that all the items parse and get the correct number of parses (usually 1, but there may be some legitimate ambiguities which will result in 2 or more parses).
Turn in: Submit the .new version of your new testsuite.
If you have any questions, you can send us email (rmkaplan@stanford.edu and thking@stanford.edu), call us (Ron: 812-4348; Tracy: 812-4808), or talk to us during office hours.