12.07.2015 Views

Large-Scale Semi-Supervised Learning for Natural Language ...


Table of Contents

1 Introduction
  1.1 What NLP Systems Do
  1.2 Writing Rules vs. Machine Learning
  1.3 Learning from Unlabeled Data
  1.4 A Perspective on Statistical vs. Linguistic Approaches
  1.5 Overview of the Dissertation
  1.6 Summary of Main Contributions

2 Supervised and Semi-Supervised Machine Learning in Natural Language Processing
  2.1 The Rise of Machine Learning in NLP
  2.2 The Linear Classifier
  2.3 Supervised Learning
    2.3.1 Experimental Set-up
    2.3.2 Evaluation Measures
    2.3.3 Supervised Learning Algorithms
    2.3.4 Support Vector Machines
    2.3.5 Software
  2.4 Unsupervised Learning
  2.5 Semi-Supervised Learning
    2.5.1 Transductive Learning
    2.5.2 Self-training
    2.5.3 Bootstrapping
    2.5.4 Learning with Heuristically-Labeled Examples
    2.5.5 Creating Features from Unlabeled Data

3 Learning with Web-Scale N-gram Models
  3.1 Introduction
  3.2 Related Work
    3.2.1 Lexical Disambiguation
    3.2.2 Web-Scale Statistics in NLP
  3.3 Disambiguation with N-gram Counts
    3.3.1 SUPERLM
    3.3.2 SUMLM
    3.3.3 TRIGRAM
    3.3.4 RATIOLM
  3.4 Evaluation Methodology
  3.5 Preposition Selection
    3.5.1 The Task of Preposition Selection
    3.5.2 Preposition Selection Results
  3.6 Context-Sensitive Spelling Correction
    3.6.1 The Task of Context-Sensitive Spelling Correction
    3.6.2 Context-Sensitive Spelling Correction Results
  3.7 Non-referential Pronoun Detection
    3.7.1 The Task of Non-referential Pronoun Detection
    3.7.2 Our Approach to Non-referential Pronoun Detection
