Large-Scale Semi-Supervised Learning for Natural Language ...
[Jiampojamarn et al., 2007] Sittichai Jiampojamarn, Grzegorz Kondrak, and Tarek Sherif. Applying many-to-many alignments and hidden Markov models to letter-to-phoneme conversion. In NAACL-HLT, 2007.
[Jiampojamarn et al., 2010] Sittichai Jiampojamarn, Ken Dwyer, Shane Bergsma, Aditya Bhargava, Qing Dou, Mi-Young Kim, and Grzegorz Kondrak. Transliteration generation and mining with limited training resources. In Named Entities Workshop (NEWS), 2010.
[Joachims et al., 2009] Thorsten Joachims, Thomas Finley, and Chun-Nam John Yu. Cutting-plane training of structural SVMs. Mach. Learn., 77(1):27–59, 2009.
[Joachims, 1999a] Thorsten Joachims. Making large-scale Support Vector Machine learning practical. In B. Schölkopf and C. Burges, editors, Advances in Kernel Methods: Support Vector Machines. MIT-Press, 1999.
[Joachims, 1999b] Thorsten Joachims. Transductive inference for text classification using support vector machines. In International Conference on Machine Learning (ICML), 1999.
[Joachims, 2002] Thorsten Joachims. Optimizing search engines using clickthrough data. In KDD, 2002.
[Joachims, 2006] Thorsten Joachims. Training linear SVMs in linear time. In KDD, 2006.
[Jones and Ghani, 2000] Rosie Jones and Rayid Ghani. Automatically building a corpus for a minority language from the web. In Proceedings of the Student Research Workshop at the 38th Annual Meeting of the Association for Computational Linguistics, 2000.
[Jurafsky and Martin, 2000] Daniel Jurafsky and James H. Martin. Speech and language processing. Prentice Hall, 2000.
[Kehler et al., 2004] Andrew Kehler, Douglas Appelt, Lara Taylor, and Aleksandr Simma. The (non)utility of predicate-argument frequencies for pronoun interpretation. In HLT-NAACL, 2004.
[Keller and Lapata, 2003] Frank Keller and Mirella Lapata. Using the web to obtain frequencies for unseen bigrams. Computational Linguistics, 29(3):459–484, 2003.
[Kilgarriff and Grefenstette, 2003] Adam Kilgarriff and Gregory Grefenstette. Introduction to the special issue on the Web as corpus. Computational Linguistics, 29(3):333–347, 2003.
[Kilgarriff, 2007] Adam Kilgarriff. Googleology is bad science. Computational Linguistics, 33(1), 2007.
[Klementiev and Roth, 2006] Alexandre Klementiev and Dan Roth. Named entity transliteration and discovery from multilingual comparable corpora. In HLT-NAACL, 2006.
[Knight et al., 1995] Kevin Knight, Ishwar Chander, Matthew Haines, Vasileios Hatzivassiloglou, Eduard Hovy, Masayo Iida, Steve K. Luk, Richard Whitney, and Kenji Yamada. Filling knowledge gaps in a broad coverage machine translation system. In IJCAI, 1995.
[Koehn and Knight, 2002] Philipp Koehn and Kevin Knight. Learning a translation lexicon from monolingual corpora. In ACL Workshop on Unsupervised Lexical Acquisition, 2002.
[Koehn and Monz, 2006] Philipp Koehn and Christof Monz. Manual and automatic evaluation of machine translation between European languages. In NAACL Workshop on Statistical Machine Translation, 2006.
[Koehn et al., 2003] Philipp Koehn, Franz Josef Och, and Daniel Marcu. Statistical phrase-based translation. In HLT-NAACL, 2003.
[Koehn, 2005] Philipp Koehn. Europarl: A parallel corpus for statistical machine translation. In MT Summit X, 2005.
[Kondrak and Sherif, 2006] Grzegorz Kondrak and Tarek Sherif. Evaluation of several phonetic similarity algorithms on the task of cognate identification. In COLING-ACL Workshop on Linguistic Distances, 2006.
[Kondrak et al., 2003] Grzegorz Kondrak, Daniel Marcu, and Kevin Knight. Cognates can improve statistical translation models. In HLT-NAACL, 2003.
[Kondrak, 2005] Grzegorz Kondrak. Cognates and word alignment in bitexts. In MT Summit X, 2005.
[Koo et al., 2008] Terry Koo, Xavier Carreras, and Michael Collins. Simple semi-supervised dependency parsing. In ACL-08: HLT, 2008.
[Kotsia et al., 2009] Irene Kotsia, Stefanos Zafeiriou, and Ioannis Pitas. Novel multiclass classifiers based on the minimization of the within-class variance. IEEE Trans. Neur. Networks, 20(1):14–34, 2009.
[Kulick et al., 2004] Seth Kulick, Ann Bies, Mark Liberman, Mark Mandel, Ryan McDonald, Martha Palmer, Andrew Schein, Lyle Ungar, Scott Winters, and Pete White. Integrated annotation for biomedical information extraction. In BioLINK 2004: Linking Biological Literature, Ontologies and Databases, 2004.
[Kummerfeld and Curran, 2008] Jonathan K. Kummerfeld and James R. Curran. Classification of verb particle constructions with the google web1t corpus. In Australasian Language Technology Association Workshop, 2008.
[Lafferty et al., 2001] John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. Conditional Random Fields: Probabilistic models for segmenting and labeling sequence data. In ICML, 2001.
[Lapata and Keller, 2005] Mirella Lapata and Frank Keller. Web-based models for natural language processing. ACM Trans. Speech and Language Processing, 2(1):1–31, 2005.
[Lappin and Leass, 1994] Shalom Lappin and Herbert J. Leass. An algorithm for pronominal anaphora resolution. Computational Linguistics, 20(4), 1994.
[Lauer, 1995a] Mark Lauer. Corpus statistics meet the noun compound: Some empirical results. In ACL, 1995.
[Lauer, 1995b] Mark Lauer. Designing Statistical Language Learners: Experiments on Compound Nouns. PhD thesis, Macquarie University, 1995.
[Levenshtein, 1966] Vladimir I. Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), 1966.
[Li and Abe, 1998] Hang Li and Naoki Abe. Generalizing case frames using a thesaurus and the MDL principle. Computational Linguistics, 24(2), 1998.
[Lin and Wu, 2009] Dekang Lin and Xiaoyun Wu. Phrase clustering for discriminative learning. In ACL-IJCNLP, 2009.
[Lin et al., 2010] Dekang Lin, Kenneth Church, Heng Ji, Satoshi Sekine, David Yarowsky, Shane Bergsma, Kailash Patil, Emily Pitler, Rachel Lathbury, Vikram Rao, Kapil Dalwani, and Sushant Narsale. New tools for web-scale N-grams. In LREC, 2010.
[Lin, 1998a] Dekang Lin. Automatic retrieval and clustering of similar words. In COLING-ACL, 1998.
[Lin, 1998b] Dekang Lin. Dependency-based evaluation of MINIPAR. In LREC Workshop on the Evaluation of Parsing Systems, 1998.