anaphoricity by [Denis and Baldridge, 2007]. The suitability of this kind of approach to correcting some of our system's errors is especially obvious when we inspect the probabilities of the maximum entropy model's output decisions on the test set. Where the maximum entropy classifier makes mistakes, it does so with less confidence than when it classifies correct examples: the average predicted probability of the incorrect classifications is 76.0%, while the average probability of the correct classifications is 90.3%. Many incorrect decisions are thus ready to switch sides; our next step will be to use features based on the preceding discourse and the candidate antecedents to give these incorrect classifications a helpful push.

3.8 Conclusion

We proposed a unified view of using web-scale N-gram models for lexical disambiguation. State-of-the-art results by our supervised and unsupervised systems demonstrate that it is important not only to use the largest corpus, but also to extract maximum information from that corpus. Using the Google 5-gram data not only provides better accuracy than using page counts from a search engine, but also facilitates the use of more context, of various sizes and positions. The TRIGRAM approach, popularized by Lapata and Keller [2005], clearly underperforms the unsupervised SUMLM system on all three applications.

In each of our tasks, the candidate set was pre-defined and training data was available to train the supervised system. While SUPERLM achieves the highest performance, the simpler SUMLM, which uses uniform weights, performs nearly as well and exceeds SUPERLM when less training data is available. Unlike SUPERLM, SUMLM could easily be used in cases where the candidate sets are generated dynamically; for example, to assess the contextual compatibility of preceding-noun candidates for anaphora resolution.
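To make the SUMLM/SUPERLM contrast concrete, the following is a minimal sketch of uniform-weight, SUMLM-style scoring over a dynamically generated candidate set. It is illustrative only: the `ngram_count` lookup over the Google 5-gram data, the helper names, and the exact N-gram orders and context positions that are summed are assumptions, not the thesis implementation.

```python
import math

def context_ngrams(left, candidate, right, max_order=5):
    """Enumerate N-grams (orders 2..max_order) that contain the candidate
    in the target slot plus surrounding context tokens (illustrative)."""
    for order in range(2, max_order + 1):
        for n_left in range(order):
            n_right = order - 1 - n_left
            if n_left <= len(left) and n_right <= len(right):
                lhs = left[len(left) - n_left:] if n_left else []
                yield tuple(lhs + [candidate] + right[:n_right])

def sumlm_score(left, candidate, right, ngram_count):
    """SUMLM-style score: sum the log-counts of all filled N-grams with
    uniform weights; no labeled training data is required."""
    score = 0.0
    for ngram in context_ngrams(left, candidate, right):
        count = ngram_count(ngram)   # e.g. a lookup into the Google 5-gram data
        if count > 0:
            score += math.log(count)
    return score

def best_candidate(left, right, candidates, ngram_count):
    """Pick the candidate whose contextual N-grams are most attested."""
    return max(candidates,
               key=lambda c: sumlm_score(left, c, right, ngram_count))

# Hypothetical usage, e.g. choosing a preposition for "he sat _ the chair":
# best_candidate(["he", "sat"], ["the", "chair"], ["on", "in", "at"], ngram_count)
```

A SUPERLM-style system would instead pass these per-N-gram log-counts as features to a supervised classifier that learns one weight per feature, which is why it needs a pre-defined candidate set and labeled training data, whereas the uniform-weight sum can be applied to any candidate produced at run time.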
