Large-Scale Semi-Supervised Learning for Natural Language ...


§     In-Domain (IN)                  Out-of-Domain #1 (O1)         Out-of-Domain #2 (O2)
5.3   BNC [Malouf, 2000]              Gutenberg (new)               Medline (new)
5.4   NYT [Bergsma et al., 2009b]     Gutenberg (new)               Medline (new)
5.5   WSJ [Vadas and Curran, 2007a]   Grolier [Lauer, 1995a]        Medline [Nakov, 2007]
5.6   WSJ [Marcus et al., 1993]       Brown [Marcus et al., 1993]   Medline [Kulick et al., 2004]

Table 5.1: Data, with references, for tasks in § 5.3: Prenominal Adjective Ordering, § 5.4: Context-Sensitive Spelling Correction, § 5.5: Noun Compound Bracketing, and § 5.6: Verb Part-of-Speech Disambiguation.

§     IN-Train   IN-Dev   IN-Test   O1     O2
5.3   237K       13K      13K       13K    9.1K
5.4   100K       50K      50K       7.8K   56K
5.5   2.0K       72       95        244    429
5.6   23K        1.1K     1.1K      21K    6.3K

Table 5.2: Number of labeled examples in in-domain training, development and test sets, and out-of-domain test sets, for tasks in Sections 5.3-5.6.

use one in-domain and two out-of-domain test sets for each task. Statistical significance is assessed with McNemar’s test, p
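As a rough illustration of how McNemar's test compares two systems evaluated on the same test set, the sketch below computes a two-sided exact (binomial) p-value from the examples on which the systems disagree. The function name `mcnemar_exact` and its arguments are hypothetical, and the exact binomial variant is an assumption; the text does not say which form of the test, or which significance threshold, was used.

```python
from math import comb


def mcnemar_exact(gold, pred_a, pred_b):
    """Two-sided exact McNemar p-value for two systems on paired examples.

    Hypothetical helper: only the discordant pairs (one system right,
    the other wrong) carry information about which system is better.
    """
    # Count examples where A is right and B is wrong, and vice versa.
    a_only = sum(1 for g, a, b in zip(gold, pred_a, pred_b) if a == g and b != g)
    b_only = sum(1 for g, a, b in zip(gold, pred_a, pred_b) if a != g and b == g)

    n = a_only + b_only
    if n == 0:
        # The systems never disagree on correctness: no evidence of a difference.
        return 1.0

    # Under the null hypothesis, each discordant pair favours either
    # system with probability 1/2, so the smaller count is Binomial(n, 0.5).
    k = min(a_only, b_only)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    gold = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    system_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # all correct
    system_b = [1, 0, 0, 1, 1, 1, 0, 1, 1, 0]   # four errors
    print(mcnemar_exact(gold, system_a, system_b))
```

Because the test depends only on the discordant pairs, it suits this setup, where every system is scored on the same in-domain and out-of-domain test sets.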
