Refinements of syntax, treebanks and ambiguity

Heads in noun phrases

  • Example:
    All the morning flights from Denver to Tampa leaving before 10.
  • In this phrase, flights is the main word. The main word of a phrase is called its head; the rest of the phrase can be divided into material before the head and material after it.
  • If we study what can surround the head, and in which order, we find that noun phrases can start with determiners, which may be:
  1. simple lexical items: the, this, a, . . . (this car)
  2. simple possessives: John’s car
  3. complex versions of that: John’s sister’s husband’s son’s cars
  • Before determiners, we may find predeterminers (all the flights)

Nominals

  • A nominal contains the head of the NP and any pre- and postmodifiers of the head.
  • Premodifiers:
  1. quantifiers, cardinals, ordinals (many clouds, three bikes, third place)
  2. adjectives (yellow submarine)
  • Quantifiers, cardinals and ordinals precede adjectives:
    e.g. three large cars

  • Postmodifiers:

  1. prepositional phrases (flights from Seattle)
  2. non-finite (e.g. gerundive) postmodifiers (flights arriving before noon)
  3. restrictive relative clauses (flights that serve breakfast)
  • These postmodifiers can be handled by rules like this:
    Nominal → Nominal PP
    Nominal → Nominal GerundVP
    Nominal → Nominal RelClause
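
Because each of these rules has Nominal on both sides, postmodifiers can be stacked one after another. The sketch below illustrates this by applying one rule per postmodifier. The tuple-based tree encoding and the helper names build_nominal and words_of are assumptions made for illustration, not part of the original notes.

```python
def build_nominal(head, postmodifiers):
    """Apply one 'Nominal -> Nominal X' rule per postmodifier, left to right."""
    tree = ("Nominal", head)
    for category, words in postmodifiers:
        # The existing Nominal becomes the left child of a larger Nominal.
        tree = ("Nominal", tree, (category, words))
    return tree


def words_of(tree):
    """Read the words back off a tree, left to right."""
    _label, *children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(words_of(child))
    return words


tree = build_nominal(
    "flights",
    [("PP", "from Denver"), ("PP", "to Tampa"), ("GerundVP", "leaving before 10")],
)
phrase = " ".join(words_of(tree))
print(phrase)
```

Each postmodifier adds one more layer of Nominal around the previous one, which is exactly what the left-recursive rules describe.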

Agreement

  • Agreement means that word forms in different parts of a constituent must correspond (in number, person, case, gender, etc.)
  • Examples:
  1. this flight
  2. those flights
  • Our rules so far do not enforce agreement, so they also generate ungrammatical phrases (e.g. this flights, those flight); i.e. the CFG overgenerates.
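
To see the overgeneration concretely, here is a minimal sketch that enumerates everything a tiny grammar without agreement can produce. The grammar itself (one NP rule, two determiners, two nouns) is an assumed toy example built from the phrases above.

```python
from itertools import product

# Toy grammar with a single broad category for determiners and for nouns.
RULES = {
    "NP": [["Det", "Noun"]],
    "Det": [["this"], ["those"]],
    "Noun": [["flight"], ["flights"]],
}


def generate(symbol):
    """Yield every word sequence that can be derived from the symbol."""
    if symbol not in RULES:          # a terminal, i.e. a word
        yield [symbol]
        return
    for rhs in RULES[symbol]:
        expansions = [list(generate(part)) for part in rhs]
        for combination in product(*expansions):
            yield [word for part in combination for word in part]


phrases = sorted(" ".join(words) for words in generate("NP"))
print(phrases)
```

The output contains the grammatical "this flight" and "those flights", but also "this flights" and "those flight", which the grammar has no way to rule out.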

Arguments in verb phrases

  • English VPs consist of a head verb along with zero or more arguments.
  • Examples:
  1. VP → Verb : disappear
  2. VP → Verb NP : prefer a morning flight
  3. VP → Verb NP PP : put an apple on the table
  4. VP → Verb PP : live in town
  5. etc.
  • Which arguments are allowed depends on the verb. For example, there are transitive and intransitive verbs:
  1. He locked the door
  2. He appeared
  • Ditransitive verbs can take a direct object as well as an indirect object. Some verbs require a (direct or indirect) object as an argument:
  1. He lent him the book
  2. He gave her the apple
  • There are hundreds of different combinations of arguments that verbs can take; the set of combinations a particular verb allows is called the subcategorisation of the verb.
  • More examples:
  1. John sneezed
  2. Find [NP a flight to NY ]
  3. Give [NP me ] [NP a cheaper fare ]
  4. Can you help [NP me ] [PP with a flight ]
  5. I prefer [ToVP to leave earlier ]
  6. I was told [S KLM has a flight ]
  • But other combinations of verbs and arguments may be incorrect or at the very least dubious, e.g.:
  1. Give [NP me ] [PP with a flight]
  2. Find [ToVP to leave earlier ]
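
Subcategorisation can be pictured as a lexicon lookup: each verb lists the argument frames it licenses. The sketch below is a toy version; the SUBCAT table is an assumption read off the examples above and is far from complete, and licensed is a hypothetical helper name.

```python
# Each verb maps to the argument frames it allows (toy data, not a real lexicon).
SUBCAT = {
    "sneeze": [()],
    "find": [("NP",)],
    "give": [("NP", "NP")],
    "help": [("NP", "PP")],
    "prefer": [("NP",), ("ToVP",)],
    "put": [("NP", "PP")],
}


def licensed(verb, args):
    """Return True if the verb's entry contains exactly this argument frame."""
    return tuple(args) in SUBCAT.get(verb, [])


print(licensed("give", ["NP", "NP"]))   # Give [NP me] [NP a cheaper fare]
print(licensed("find", ["ToVP"]))       # Find [ToVP to leave earlier]
```

A fuller lexicon would need more detail than this. For instance, give also takes an NP followed by a PP headed by to, so a PP frame would have to record which preposition is required.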

Problems for CFGs

  • With the broad categories that we used, the simple grammar rules we gave overgenerate, because they do not enforce agreement and because we did not model subcategorisation of the verb. One quick fix is to introduce finer categories.

Agreement using finer categories

Example CFG for agreement on number (singular/plural):
[Figure: CFG with separate singular and plural nonterminals]

  • Disadvantages of the above approach:
  1. Explosion of the number of nonterminals and rules
  2. Looks ugly
  3. This becomes even worse for languages with agreement on case, gender, etc.
  • More elegant solutions will be discussed later (unification grammars).
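
The rule explosion can be sketched by copying a rule once per combination of feature values. In the snippet below, the helper specialise, the suffix naming scheme (NP_sg, Det_sg, ...) and the feature inventory with three genders and four cases are all assumptions chosen for illustration.

```python
from itertools import product


def specialise(lhs, rhs, features):
    """Copy one rule per combination of feature values, suffixing every symbol."""
    rules = []
    for combination in product(*features.values()):
        suffix = "_" + "_".join(combination)
        rules.append((lhs + suffix, [symbol + suffix for symbol in rhs]))
    return rules


# Agreement on number only: the single rule NP -> Det Noun becomes two rules.
number_only = specialise("NP", ["Det", "Noun"], {"number": ["sg", "pl"]})
for rule in number_only:
    print(rule)

# A hypothetical language that also agrees on gender and case.
rich = specialise(
    "NP",
    ["Det", "Noun"],
    {
        "number": ["sg", "pl"],
        "gender": ["masc", "fem", "neut"],
        "case": ["nom", "acc", "dat", "gen"],
    },
)
print(len(rich))
```

With number alone, every agreeing rule is doubled; with the assumed three features, the single NP rule turns into 2 × 3 × 4 = 24 rules.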

Treebanks

  • Corpora in which each sentence has been annotated with a parse.
  • Often done by hand, possibly computer-aided, on the basis of annotation guidelines on how to deal with contentious cases.
  • Essential to statistical parsers (to be discussed later).
  • Humans are often bad at creating correct rules of (formal) grammar.
  • They are even worse at estimating rule probabilities.
  • So starting from samples of real language use is better.
  • The Penn Treebank is the most widely used treebank for English. Its best-known part is the Wall Street Journal section: 1 million words from the 1987-1989 Wall Street Journal.

Extraction of grammar from treebank

  • Rules can be put together to make a parse tree. Conversely, we can extract rules from a tree. From a large treebank, one thereby obtains a fairly complete grammar.
  • Very often grammars are not written by hand, but extracted from treebanks.
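
As a rough sketch of rule extraction, the code below reads trees in a bracketed notation and counts one rule per tree node. The two-sentence TREEBANK is made-up toy data, and parse_bracketed and extract_rules are hypothetical helper names; real treebank annotation is considerably richer than this.

```python
from collections import Counter


def parse_bracketed(text):
    """Parse one bracketed tree into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    position = 0

    def read_node():
        nonlocal position
        label = tokens[position + 1]          # tokens[position] is "("
        position += 2
        children = []
        while tokens[position] != ")":
            if tokens[position] == "(":
                children.append(read_node())
            else:
                children.append(tokens[position])   # a word
                position += 1
        position += 1                          # skip the closing ")"
        return (label, children)

    return read_node()


def extract_rules(tree, counts):
    """Add one rule (parent -> labels of its children) per node to counts."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            extract_rules(child, counts)


TREEBANK = [
    "(S (NP (Det the) (Noun flight)) (VP (Verb left)))",
    "(S (NP (Det a) (Noun flight)) (VP (Verb arrived)))",
]

rule_counts = Counter()
for line in TREEBANK:
    extract_rules(parse_bracketed(line), rule_counts)

for (lhs, rhs), count in rule_counts.items():
    print(f"{lhs} -> {' '.join(rhs)}   (seen {count}x)")
```

The counts collected this way are also the raw material for estimating rule probabilities from data rather than by hand.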

Ambiguity

  • A sentence is structurally ambiguous if more than one parse tree exists for it (for a given grammar). In the worst case, the number of parses grows exponentially with the length of the sentence.
  • A CFG is ambiguous if there is at least one ambiguous sentence.
  • Syntactic disambiguation: determining which is the correct parse tree among several.
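
The worst-case growth can be made concrete with an assumed toy grammar (not one from these notes): with only the rules S → S S and S → a, a string of n copies of a has one parse per way of bracketing it into binary groups. A minimal sketch that counts these parses:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parses(n):
    """Number of parse trees for a string of n 'a's under S -> S S | a."""
    if n == 1:
        return 1                      # only S -> a applies
    # S -> S S: split the string after k symbols and parse both halves.
    return sum(count_parses(k) * count_parses(n - k) for k in range(1, n))


for length in range(1, 11):
    print(length, count_parses(length))
```

These counts are the Catalan numbers (1, 1, 2, 5, 14, 42, ...), which grow exponentially, so listing all parses quickly becomes impractical.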

Examples of ambiguity: PP-attachment

  • One morning I shot an elephant in my pajamas. How he got into my pajamas I don’t know. [Groucho Marx]
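
The two readings of the elephant sentence can be counted mechanically. The sketch below uses a CKY-style chart that stores, for every span and category, how many subtrees exist. The small grammar (BINARY, LEXICON) is an assumption written for this one sentence, with the pronoun I entered directly as an NP.

```python
from collections import defaultdict

# Binary rules as (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),    # PP modifies the verb phrase (the shooting)
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),    # PP modifies the noun phrase (the elephant)
    ("PP", "P", "NP"),
]
LEXICON = {
    "I": ["NP"], "shot": ["V"], "an": ["Det"], "elephant": ["N"],
    "in": ["P"], "my": ["Det"], "pajamas": ["N"],
}


def count_parses(words, start="S"):
    """Count parse trees with a CKY chart: (i, j, category) -> number of trees."""
    n = len(words)
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for category in LEXICON[word]:
            chart[i, i + 1, category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]


sentence = "I shot an elephant in my pajamas".split()
print(count_parses(sentence))
```

Under this grammar the sentence gets two parses: one where in my pajamas attaches to the VP, and one where it attaches to the NP an elephant.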

Examples of ambiguity: gerundive attachment

  • He saw the Eiffel Tower flying to Paris:
  1. Who is doing the flying?
  2. Is flying part of a gerundive construction with the Eiffel Tower as its subject?
  3. Not likely, but grammatically valid.
  • More examples:
  1. He saw the baker baking bread.
  2. He whistled a tune walking to his office.

Examples of ambiguity: coordination

  • old men and women [Tacitus, originally in Latin]:
  1. It so happens old refers only to men.
  2. In the text, the young men have been sent to war, leaving the women and the old men.

Examples of ambiguity: coordination (cont.)

  • Show me the meal on Flight UA 386 from San Francisco to Denver:
  1. There are three PPs, each of which can be attached in several ways to the verb or to preceding NPs.
  2. Different choices can be combined to make more than a dozen different parses.
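
One way to see where "more than a dozen" comes from is to enumerate the attachment choices directly. The sketch below makes simplifying assumptions: the sentence is reduced to a verb, one object NP and three PPs (the indirect object me is ignored), and a PP may only attach to a site that has not been closed off by an earlier, higher attachment. The function name attachments is hypothetical.

```python
def attachments(n_pps, frontier=("verb", "NP0"), index=1):
    """Yield every way to attach n_pps PPs; each result names one site per PP."""
    if index > n_pps:
        yield ()
        return
    for depth, site in enumerate(frontier):
        # Attaching to `site` closes off every site below it; the NP inside
        # the new PP becomes the lowest open attachment site.
        new_frontier = frontier[:depth + 1] + (f"NP{index}",)
        for rest in attachments(n_pps, new_frontier, index + 1):
            yield (site,) + rest


options = list(attachments(3))
for option in options:
    print(option)
print(len(options))
```

Under these assumptions there are 2 options for one PP, 5 for two and 14 for three, which matches the "more than a dozen" parses mentioned above.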