System     P     R     F     Acc
SUPERLM   80.0  73.3  76.5  86.5
Human-1   92.7  63.3  75.2  87.5
Human-2   84.0  70.0  76.4  87.0
Human-3   72.2  86.7  78.8  86.0

Table 3.4: Human vs. computer non-referential it detection (%).

Subjects first performed a dry-run experiment on separate development data. They were shown their errors, and sources of confusion were clarified. They then made the judgments unassisted on a final set of 200 test examples. Three humans performed the experiment. Their results show a range of preferences for precision versus recall, with F-Score and Accuracy broadly similar to SUPERLM (Table 3.4). These results show that our distributional approach is already getting good leverage from the limited context information, achieving performance around that of our best human.

Error Analysis

It is instructive to inspect the twenty-five test instances that our system classified incorrectly, given human performance on this same set. Seventeen of the twenty-five system errors were also made by one or more human subjects, suggesting that system errors are also mostly due to limited context. For example, one of these errors was for the context: "it takes an astounding amount..." Here, the non-referential nature of the instance is not apparent without the infinitive clause that ends the sentence: "... of time to compare very long DNA sequences with each other."

Six of the eight errors unique to the system were cases where the system falsely said the pronoun was non-referential. Four of these could have referred to entire sentences or clauses rather than nouns. These confusing cases, for both humans and our system, result from our definition of a referential pronoun: pronouns with verbal or clause antecedents are considered non-referential. If an antecedent verb or clause is replaced by a nominalization (Smith researched... becomes Smith's research), then a neutral pronoun, in the same context, becomes referential. When we inspect the probabilities produced by the maximum entropy classifier, we see only a weak bias for the non-referential class on these examples, reflecting our classifier's uncertainty. It would likely be possible to improve accuracy on these cases by encoding the presence or absence of preceding nominalizations as a feature of our classifier (a possible encoding is sketched at the end of this section).

Another false non-referential decision is for the phrase "... machine he had installed it on." The it is actually referential, but the extracted patterns (e.g., "he had install * on") are nevertheless usually filled with it (this example also suggests using filler counts for the word "the" as a feature when it is the last word in the pattern). Again, it might be possible to fix such examples by leveraging the preceding discourse. Notably, the first noun phrase before the context is the word "software." There is strong compatibility between the pronoun-parent "install" and the candidate antecedent "software." In a full coreference resolution system, when the anaphora resolution module has a strong preference to link it to an antecedent (which it should when the pronoun is indeed referential), we can override a weak non-referential probability. Non-referential it detection should not be a pre-processing step, but rather part of a globally-optimal configuration, as was done for general noun phrase
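To make the override concrete, the decision rule below treats the non-referential label as competing directly with the resolver's best antecedent link. This is only a minimal sketch under assumed interfaces: the function name, the two probability inputs, and the margin threshold are all hypothetical, not part of the system described above.

```python
def classify_it(nonref_prob, best_link_prob, margin=0.15):
    """Decide whether an occurrence of 'it' is non-referential.

    nonref_prob:    probability from the non-referential classifier
                    (e.g., a maximum entropy model as above).
    best_link_prob: the anaphora resolution module's confidence in
                    its best candidate antecedent for this pronoun.
    margin:         how decisive the non-referential probability must
                    be before a strong antecedent link cannot override
                    it. All values here are illustrative.
    """
    if nonref_prob < 0.5:
        return "referential"
    # A weak non-referential decision (probability only slightly above
    # 0.5) yields to a strongly preferred antecedent, as in the
    # "installed it on" / "software" example above.
    if nonref_prob < 0.5 + margin and best_link_prob > nonref_prob:
        return "referential"
    return "non-referential"
```

A fuller treatment would score the non-referential label jointly with all candidate antecedents and select the globally best configuration, rather than applying the classifier as a hard pre-processing filter.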
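Similarly, the nominalization feature suggested earlier might be encoded along the following lines. This is a sketch only: the suffix list, window size, and function names are assumptions for illustration, and the crude suffix heuristic misses zero-derived nominalizations such as "research" (a lexicon or part-of-speech check would do better).

```python
# Derivational suffixes that often signal nominalizations
# (e.g., "installation", "performance"); illustrative, not exhaustive.
NOMINAL_SUFFIXES = ("tion", "sion", "ment", "ance", "ence", "ity", "ness")

def has_preceding_nominalization(tokens, pronoun_index, window=10):
    """Binary feature: does a likely nominalization appear within
    `window` tokens before the pronoun? Fed to the classifier as one
    additional feature alongside the distributional pattern fillers."""
    start = max(0, pronoun_index - window)
    return any(tok.lower().endswith(NOMINAL_SUFFIXES)
               for tok in tokens[start:pronoun_index])
```

The intent is to nudge contexts preceded by a nominalized antecedent toward the referential class, countering the weak non-referential bias observed on those examples.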