algorithm). Yarowsky used it for word-sense disambiguation. He essentially showed that a bootstrapping approach can achieve performance comparable to fully supervised learning. An example from word-sense disambiguation will help illustrate: To disambiguate whether the noun bass is used in the fish sense or in the music sense, we can rely on just a few key contexts to identify unambiguous instances of the noun in text. Suppose we know that caught a bass means the fish sense of bass. Now, whenever we see caught a bass, we label that noun for the fish sense. This is the context-based view of the problem. The other view is a document-based view. It has been shown experimentally that all instances of a unique word type in a single document tend to share the same sense [Gale et al., 1992]. Once we have one instance of bass labeled, we can extend this classification to the other instances of bass in the same document using this second view. We can then re-learn our context-based classifier from these new examples and repeat the process in new documents and new contexts, until all the instances are labeled.
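To make the alternation between the two views concrete, the sketch below implements a toy version of this loop in Python. Everything in it is illustrative: the documents, the bigram-context representation, and the music-sense seed rule are invented here (only the caught a bass seed comes from the example above), and Yarowsky's actual algorithm learns a decision list over many context features rather than matching exact bigrams.

```python
# Toy two-view bootstrapping for the noun "bass", in the style of Yarowsky.
# Documents are token lists; a "context" is the bigram preceding "bass".
docs = [
    ["yesterday", "we", "caught", "a", "bass", "near", "the", "dock",
     "then", "the", "bass", "got", "away"],
    ["she", "plays", "the", "bass", "in", "a", "band",
     "her", "bass", "has", "four", "strings"],
]

# View 1 (context-based): rules mapping a preceding bigram to a sense.
# The fish seed is from the text above; the music seed is an invented extra.
context_rules = {("caught", "a"): "fish", ("plays", "the"): "music"}

def occurrences(doc):
    """Indices of "bass" with at least two preceding tokens of context."""
    return [i for i, tok in enumerate(doc) if tok == "bass" and i >= 2]

def document_sense(doc):
    """View 2 (document-based): commit to a sense for the whole document
    only if the context rules label it consistently, following the
    one-sense-per-document observation of Gale et al. [1992]."""
    senses = {context_rules[(doc[i - 2], doc[i - 1])]
              for i in occurrences(doc)
              if (doc[i - 2], doc[i - 1]) in context_rules}
    return senses.pop() if len(senses) == 1 else None

# Alternate the views: context rules label instances, the document view
# extends those labels, and the extended labels yield new context rules.
while True:
    new_rules = {}
    for doc in docs:
        sense = document_sense(doc)
        if sense is not None:
            for i in occurrences(doc):
                new_rules[(doc[i - 2], doc[i - 1])] = sense
    if all(rule in context_rules for rule in new_rules):
        break  # no new contexts learned; the process has converged
    context_rules.update(new_rules)

# The loop has now also learned ("then", "the") -> "fish" and
# ("band", "her") -> "music" from the newly labeled instances.
print(context_rules)
```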
Multi-view bootstrapping is also used in information extraction [Etzioni et al., 2005]. Collins and Singer [1999] and Cucerzan and Yarowsky [1999] apply bootstrapping to the task of named-entity recognition. Klementiev and Roth [2006] used bootstrapping to extract interlingual named entities. Our research has also been influenced by co-training-style weakly supervised algorithms used in coreference resolution [Ge et al., 1998; Harabagiu et al., 2001; Müller et al., 2002; Ng and Cardie, 2003b; 2003a; Bean and Riloff, 2004] and grammatical gender determination [Cucerzan and Yarowsky, 2003].

Bootstrapping from Seeds

A distinct line of bootstrapping research has also evolved in NLP, which we call Bootstrapping from Seeds. These approaches all involve starting with a small number of examples, building predictors from these examples, labeling more examples with the new predictors, and then repeating the process to build a large collection of information. While this research generally does not explicitly cast the tasks as exploiting orthogonal views of the data, it is instructive to describe these techniques from the multi-view perspective.

An early example is described by Hearst [1992]. Suppose we wish to find hypernyms in text. A hypernym is a relation between two things such that one thing is a sub-class of the other. It is sometimes known as the is-a relation. For example, a wound is-a type of injury, Ottawa is-a city, a Cadillac is-a car, etc. Suppose we see the words “Cadillacs and other cars...” in text. There are two separate sources of information in this example:

1. The string pair itself: Cadillac, car

2. The context: Xs and other Ys

We can perform bootstrapping in this framework as follows: First, we obtain a list of seed pairs of words, e.g. Cadillac/car, Ottawa/city, wound/injury, etc. Now, we create a predictor that will label examples as being hypernyms based purely on whether they occur in this seed set. We are thus only using the first view of the problem: the actual string pairs. We use this predictor to label a number of examples in actual text, e.g. “Cadillacs and other cars, cars such as Cadillacs, cars including Cadillacs, etc.” We then train a predictor for the other view of the problem: From all the labeled examples, we extract predictive contexts: “Xs and other Ys, Ys such as Xs, Ys including Xs, etc.” The contexts extracted in this view can now be used to extract more seeds, and the seeds can then be used to extract more contexts, etc., in an iterative fashion. Hearst described an early form of this algorithm.
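The seed-then-context alternation can likewise be sketched in a few lines of Python. Again everything is illustrative: the corpus sentences are invented, we start from a single seed pair (rather than a full seed list) so the growth is visible, plain string matching stands in for real tokenization and lemmatization, and the XSLOT/YSLOT placeholders are simply an ad hoc device for turning a matched span into a reusable pattern.

```python
import re

# Invented toy corpus; each sentence contains one hypernym-bearing context.
corpus = [
    "Cadillacs and other cars lined the street",
    "cars such as Porsches are expensive",
    "Porsches and other cars raced by",
    "cities including Ottawa grew quickly",
    "Ottawa and other cities held festivals",
]

seeds = {("Cadillacs", "cars")}   # view 1: known (hyponym, hypernym) pairs
patterns = set()                  # view 2: contexts such as "X and other Y"

def harvest_patterns(sentence, pairs):
    """View 1 -> view 2: abstract an occurrence of a known pair into a
    context pattern by replacing the pair with placeholder slots."""
    found = set()
    for hypo, hyper in pairs:
        i, j = sentence.find(hypo), sentence.find(hyper)
        if i < 0 or j < 0:
            continue
        span = sentence[min(i, j):max(i + len(hypo), j + len(hyper))]
        found.add(span.replace(hypo, "XSLOT").replace(hyper, "YSLOT"))
    return found

def apply_pattern(pattern, sentence):
    """View 2 -> view 1: turn a pattern back into a regex and extract a
    candidate (hyponym, hypernym) pair from the sentence."""
    regex = (re.escape(pattern)
             .replace("XSLOT", r"(?P<x>\w+)")
             .replace("YSLOT", r"(?P<y>\w+)"))
    m = re.search(regex, sentence)
    return (m.group("x"), m.group("y")) if m else None

# Alternate the views: pairs yield patterns, patterns yield new pairs,
# until a full pass over the corpus adds nothing new.
while True:
    for sentence in corpus:
        patterns |= harvest_patterns(sentence, seeds)
    new_pairs = set()
    for pattern in patterns:
        for sentence in corpus:
            pair = apply_pattern(pattern, sentence)
            if pair:
                new_pairs.add(pair)
    if new_pairs <= seeds:
        break
    seeds |= new_pairs

print(sorted(seeds))     # adds ('Ottawa', 'cities') and ('Porsches', 'cars')
print(sorted(patterns))  # "X and other Y", "Y such as X", "Y including X"
```

On this toy corpus the loop converges after two passes, having induced exactly the three context types named in the text, and the seed set has grown from one pair to three.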
