12.07.2015 Views

Large-Scale Semi-Supervised Learning for Natural Language ...




Set               BASE   [Golding and Roth, 1999]   TRIGRAM   SUMLM   SUPERLM
among/between     60.3   86.0                       80.8      90.5    92.8
amount/number     75.6   86.2                       83.9      93.2    93.7
cite/sight/site   87.1   85.3                       94.3      96.3    97.6
peace/piece       60.8   88.0                       92.3      97.7    98.0
raise/rise        51.0   89.7                       90.7      96.6    96.6
Average           66.9   87.0                       88.4      94.8    95.7

Table 3.2: Context-sensitive spelling correction accuracy (%) on different confusion sets

fillers near the beginning of the context pattern are more important, as the object of the preposition is crucial for distinguishing these two classes (“between the two” but “among the three”). SUPERLM can exploit the relative importance of the different positions and thereby achieve higher performance.

3.7 Non-referential Pronoun Detection

We now present an application of our approach to a difficult analysis problem: detecting non-referential pronouns. In fact, SUPERLM was originally devised for this task, and then subsequently evaluated as a general solution to all lexical disambiguation problems. More details on this particular application are available in our ACL 2008 paper [Bergsma et al., 2008b].

3.7.1 The Task of Non-referential Pronoun Detection

Coreference resolution determines which noun phrases in a document refer to the same real-world entity. As part of this task, coreference resolution systems must decide which pronouns refer to preceding noun phrases (called antecedents) and which do not. In particular, a long-standing challenge has been to correctly classify instances of the English pronoun it. Consider the sentences:

(1) You can make it in advance.
(2) You can make it in Hollywood.

In Example (1), it is an anaphoric pronoun referring to some previous noun phrase, like “the sauce” or “an appointment.” In Example (2), it is part of the idiomatic expression “make it” meaning “succeed.” A coreference resolution system should find an antecedent for the first it but not the second. Pronouns that do not refer to preceding noun phrases are called non-anaphoric or non-referential pronouns.

The word it is one of the most frequent words in the English language, accounting for about 1% of tokens in text and over a quarter of all third-person pronouns.⁵ Usually between a quarter and a half of it instances are non-referential. As with other pronouns, the preceding discourse can affect its interpretation. For example, Example (2) can be interpreted as referential if the preceding sentence is “You want to make a movie?” We show, however,

⁵ e.g. http://ucrel.lancs.ac.uk/bncfreq/flists.html
