Large-Scale Semi-Supervised Learning for Natural Language ...
[Pantel and Lin, 2002] Patrick Pantel and Dekang Lin. Discovering word senses from text. In KDD, 2002.
[Pantel and Pennacchiotti, 2006] Patrick Pantel and Marco Pennacchiotti. Espresso: leveraging generic patterns for automatically harvesting semantic relations. In ACL ’06: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL, 2006.
[Pantel et al., 2007] Patrick Pantel, Rahul Bhagat, Bonaventura Coppola, Timothy Chklovski, and Eduard Hovy. ISP: Learning inferential selectional preferences. In NAACL-HLT, 2007.
[Pantel, 2003] Patrick Pantel. Clustering by committee. PhD thesis, University of Alberta, 2003.
[Paşca et al., 2006] Marius Paşca, Dekang Lin, Jeffrey Bigham, Andrei Lifchits, and Alpa Jain. Names and similarities on the Web: Fact extraction in the fast lane. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, 2006.
[Phan, 2006] Xuan-Hieu Phan. CRFTagger: CRF English POS Tagger. crftagger.sourceforge.net, 2006.
[Pinchak and Bergsma, 2007] Christopher Pinchak and Shane Bergsma. Automatic answer typing for how-questions. In HLT-NAACL, 2007.
[Pitler et al., 2010] Emily Pitler, Shane Bergsma, Dekang Lin, and Kenneth Church. Using web-scale N-grams to improve base NP parsing performance. In COLING, 2010.
[Porter, 1980] Martin F. Porter. An algorithm for suffix stripping. Program, 14(3), 1980.
[Radev et al., 2001] Dragomir R. Radev, Hong Qi, Zhiping Zheng, Sasha Blair-Goldensohn, Zhu Zhang, Weiguo Fan, and John Prager. Mining the Web for Answers to Natural Language Questions. In CIKM, 2001.
[Raina et al., 2006] Rajat Raina, Andrew Y. Ng, and Daphne Koller. Constructing informative priors using transfer learning. In ICML, 2006.
[Rappoport and Levent-Levi, 2006] Ari Rappoport and Tsahi Levent-Levi. Induction of cross-language affix and letter sequence correspondence. In EACL Workshop on Cross-Language Knowledge Induction, 2006.
[Ravichandran and Hovy, 2002] Deepak Ravichandran and Eduard Hovy. Learning surface text patterns for a question answering system. In ACL ’02: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, 2002.
[Resnik, 1996] Philip Resnik. Selectional constraints: An information-theoretic model and its computational realization. Cognition, 61, 1996.
[Resnik, 1999] Philip Resnik. Mining the web for bilingual text. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, 1999.
[Rifkin and Klautau, 2004] Ryan Rifkin and Aldebaro Klautau. In defense of one-vs-all classification. JMLR, 5:101–141, 2004.
[Riloff and Jones, 1999] Ellen Riloff and Rosie Jones. Learning dictionaries for information extraction by multi-level bootstrapping. In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99), 1999.
[Rimell and Clark, 2008] Laura Rimell and Stephen Clark. Adapting a lexicalized-grammar parser to contrasting domains. In EMNLP, 2008.
[Ristad and Yianilos, 1998] Eric Sven Ristad and Peter N. Yianilos. Learning string-edit distance. IEEE Trans. Pattern Anal. Machine Intell., 20(5), 1998.
[Roberto et al., 2007] Basili Roberto, Diego De Cao, Paolo Marocco, and Marco Pennacchiotti. Learning selectional preferences for entailment or paraphrasing rules. In RANLP, 2007.
[Rooth et al., 1999] Mats Rooth, Stefan Riezler, Detlef Prescher, Glenn Carroll, and Franz Beil. Inducing a semantically annotated lexicon via EM-based clustering. In ACL, 1999.
[Roth and Yih, 2004] Dan Roth and Wen-Tau Yih. A linear programming formulation for global inference in natural language tasks. In CoNLL, 2004.
[Roth, 1998] Dan Roth. Learning to resolve natural language ambiguities: A unified approach. In AAAI/IAAI, 1998.
[Russell and Norvig, 2003] Stuart J. Russell and Peter Norvig. Artificial Intelligence: a modern approach, chapter 20: Statistical Learning Methods. Prentice Hall, Upper Saddle River, N.J., 2nd edition, 2003.
[Schafer and Yarowsky, 2002] Charles Schafer and David Yarowsky. Inducing translation lexicons via diverse similarity measures and bridge languages. In CoNLL, 2002.
[Sekine, 2008] Satoshi Sekine. A linguistic knowledge discovery tool: Very large ngram database search with arbitrary wildcards. In COLING: Companion volume: Demonstrations, 2008.
[Shannon, 1948] Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27(3), 1948.
[Shaw and Hatzivassiloglou, 1999] James Shaw and Vasileios Hatzivassiloglou. Ordering among premodifiers. In ACL, 1999.
[Simard et al., 1992] Michel Simard, George F. Foster, and Pierre Isabelle. Using cognates to align sentences in bilingual corpora. In Fourth International Conference on Theoretical and Methodological Issues in Machine Translation, 1992.
[Smith and Eisner, 2005] Noah A. Smith and Jason Eisner. Contrastive estimation: training log-linear models on unlabeled data. In ACL, 2005.
[Snow et al., 2005] Rion Snow, Daniel Jurafsky, and Andrew Y. Ng. Learning syntactic patterns for automatic hypernym discovery. In NIPS, 2005.
[Snow et al., 2008] Rion Snow, Brendan O’Connor, Daniel Jurafsky, and Andrew Y. Ng. Cheap and fast - but is it good? evaluating non-expert annotations for natural language tasks. In EMNLP, 2008.
[Steedman, 2008] Mark Steedman. On becoming a discipline. Comput. Linguist., 34(1):137–144, 2008.
[Strube et al., 2002] Michael Strube, Stefan Rapp, and Christoph Müller. The influence of minimum edit distance on reference resolution. In EMNLP, 2002.
[Suzuki and Isozaki, 2008] Jun Suzuki and Hideki Isozaki. Semi-supervised sequential labeling and segmentation using giga-word scale unlabeled data. In Proceedings of ACL-08: HLT, 2008.
[Taskar et al., 2005] Ben Taskar, Simon Lacoste-Julien, and Dan Klein. A discriminative matching approach to word alignment. In HLT-EMNLP, 2005.
[Tefas et al., 2001] Anastasios Tefas, Constantine Kotropoulos, and Ioannis Pitas. Using support vector machines to enhance the performance of elastic graph matching for frontal face authentication. IEEE Trans. Pattern Anal. Machine Intell., 23:735–746, 2001.