
Pakistan Journal of Science (Vol. 63 No. 4 Dec, 2011)

SENTIMENT-ANNOTATED LEXICON CONSTRUCTION FOR AN URDU TEXT BASED SENTIMENT ANALYZER

Afraz Z. S., A. Muhammad and Martinez-Enriquez A. M *

Department of CS & E, U. E. T., Lahore, Pakistan

**

Department of CS, CINVESTAV-IPN, D.F. Mexico

Corresponding author’s email (afrazsyed@uet.edu.pk)

ABSTRACT: A lexicon based sentiment analyzer is composed of two parts: a classifier and a lexicon of sentiment-annotated words/phrases. In this paper, a model for such a lexicon is presented, in which polarity scores are annotated to all the subjective entries. The approach handles Urdu words, which are morphologically rich and result in a much higher level of lexicon intricacy than in other languages such as English. This is a pioneering effort, as no sentiment-annotated lexicon exists for the Urdu language. Moreover, lexicons already developed for other languages cannot be reused, because Urdu exhibits distinctive orthographical, morphological, and grammatical features. The lexicon is constructed as part of a lexicon based sentiment analyzer for opinionated Urdu text given in the form of reviews. After applying the developed lexicon to multiple reviews, it is observed that the results meet expectations.

Key words: Natural language processing, computational linguistics, sentiment analysis, opinion mining, shallow parsing, Urdu text processing, lexicon construction.

INTRODUCTION

The rapid proliferation of user generated text on the internet has given rise to a number of previously unknown aspects of natural language processing and understanding. It is evident that such a huge body of knowledge, generated by millions of minds around the world, cannot be left unexamined (Glaser et al., 2002). As a result, the field of sentiment analysis, also called opinion mining or subjectivity analysis, is emerging rapidly as an unexplored frontier. For the English language, this area has been under study for the last decade (Hatzivassiloglou and Wiebe, 2000; Turney, 2002; Yu and Hatzivassiloglou, 2003; Pang and Lee, 2008). These contributions present complete models of a sentiment analyzer based on different techniques and approaches, such as supervised or unsupervised machine learning or lexicon based methods.

In these works, the usual model of a sentiment analyzer incorporates two components: (a) the classifier, which analyzes and categorizes the given text, and (b) the lexicon or lexicons containing information about the orientation of the entries (words/phrases) as positive or negative. These lexicons are called sentiment-annotated lexicons (Pang and Lee, 2008), because the polarity marks indicating orientation are annotated directly to the lexicon entries. Such lexicons can be either manually compiled or automatically generated. A considerable body of research on sentiment-annotated lexicon construction has emerged within a few years (Annett and Kondrak, 2008; Higashinaka et al., 2007; Andreevskaia and Bergler, 2006; Hu and Liu, 2005; Yu and Hatzivassiloglou, 2003; Riloff et al., 2003; Turney, 2002; Hatzivassiloglou and Wiebe, 2000). These contributions propose a variety of approaches to lexicon development, lexicon structure and the relationships between entries.

These efforts are mainly for the English language and exploit pre-developed linguistic resources, such as corpora, for the development and extraction of the required lexicons. Consequently, for English this aspect of research is no longer an unsolved issue. Urdu, on the other hand, is a resource poor language (Mukund et al., 2010), and hence the task of domain specific sentiment-annotated lexicon construction for Urdu text poses many challenges. To our knowledge, no such lexicon exists. However, there are a few efforts that have constructed lexicons for other language processing applications of Urdu text (Ijaz and Hussain, 2007; Humayoun et al., 2007; Muaz et al., 2009; Mukund et al., 2010).

Therefore, this paper describes the structure, construction and evaluation of a manually tagged, sentiment-annotated lexicon of Urdu words as a component of a sentiment analysis model developed for Urdu text. The lexicon contains information about the subjectivity of an entry in addition to its orthographic, phonological, syntactic and morphological aspects. The approach recognizes the subjective entries in the lexicon through two attributes: orientation (either positive or negative) and intensity (the force of the orientation). After its development, the lexicon is integrated with the sentiment classifier. The classifier preprocesses the given text and then applies shallow parsing based chunking. It uses the lexicon to compare all the words/phrases present in the text; as a result, all the subjective terms in the given text become annotated. On the basis of the polarities of individual words, the polarity of each sentence and then of the whole review is calculated. The overall system performance is evaluated using a corpus of movie reviews in Urdu. The classification algorithm is applied to the review corpus, and each subjective word in a review is compared with the lexicon entries for the computation of the polarity scores.

MATERIAL AND METHODS

This section describes the construction, structure and integration of the sentiment-annotated lexicon of Urdu words developed for a sentiment classification model. The model is designed to distinguish between the objective and subjective terms in a given review. Objective terms carry neutral sentiment and have no effect on the final classification decision, whereas subjective terms are considered the carriers of sentiment, and their presence can alter the final classification. Keeping this distinction in view, the lexicon entries are also categorized as objective and subjective terms. Before going into details, some terms are defined below:

• Orientation. Orientation describes either the positivity or the negativity of a lexicon entry. For most entries, orientation is predefined during the lexicon construction phase. In a given text, however, it can be altered by a polarity shifter in the sentence, e.g., the word “اچھا” (acha, good) has positive orientation, but with the polarity shifter “نہیں” (naheen, not) it becomes a negative expression, i.e., “اچھا نہیں” (acha naheen, not good). Moreover, the orientation of some words (though few in number) is highly domain specific or depends upon the context in which they are used. These two issues are beyond the scope of this research.

• Intensity. This is the intensity of the orientation of a lexicon entry, i.e., the force of positivity or negativity of a term. Usually modifiers, e.g., “بہت” (bohat, very), describe the intensity of an expression. Like other languages, Urdu has three degrees of intensity: absolute (only positive or negative orientation), comparative (two distinct entities are compared with each other) and superlative (one of all entities has the highest orientation).

• Polarity. A polarity mark is annotated to each lexicon entry to show its orientation and intensity. This is done at the implementation level.

Lexicon Construction: A sentiment-annotated lexicon is more intricate than other Natural Language Processing (NLP) lexicons. There are two reasons for this intricacy:

• Each lexicon entry carries polarity information in addition to its orthographic, phonological, syntactic and morphological features. This polarity information is usually represented as positive, negative or neutral. For example, SentiWordNet (Andreevskaia and Bergler, 2006) uses triplets [positive, negative, objective], with a minimum value of 0.0 and a maximum of 1.0.

• Most words exhibit multiple orientations depending upon their use and domain. For example, in the sentence “This damage is everlasting”, everlasting is a positive word, but the overall orientation of the comment is negative. Likewise, unpredictable is a positive word when used about a movie’s plot, but becomes negative when used about the performance of a microwave oven.

Construction Steps: The lexicon construction task is divided into the following steps:

• Categorize the words as either subjective or objective. When the classification algorithm is applied, the classifier simply ignores objective terms, so its performance depends entirely upon the subjective words.

• Categorize these words according to morphological rules, which work at the word level. These rules can change the structure, meaning and part of speech of a word, for example, rules for marking an adjective with the noun it qualifies.

• Identify the grammatical rules, which describe the possible structures of a sentence and the position of the parts of speech with respect to each other, for example, the use of modifiers with adjectives or of auxiliaries with verbs. As Urdu is a free word order language, these rules are more difficult to define and implement.

• Discover relationships between different lexicon entries. These relationships can define synonyms, antonyms, cross references, etc.

• Decide and annotate polarities and then intensities for the entries. In this task, the entries are first categorized as positive or negative, and then their intensity scores are attached. Some entries have only orientation, some have only intensity (like modifiers) and some have both values.

Figure 1. Structure of the sentiment-annotated lexicon with respect to O and I

Lexicon Structure: It is assumed that the lexicon entries are either subjective or objective. Objective terms are saved without any polarity mark, whereas subjective terms are further categorized on the basis of orientation and intensity into three types:

• Terms with orientation only, T(O). These terms are either absolutely positive or absolutely negative; no degree of positivity or negativity is attached to them.

• Terms with intensity only, T(I). These terms have no orientation, but they can intensify the orientation of other words in the sentence.

• Terms with both orientation and intensity, T(O, I). If a term carries both orientation (either positive or negative) and intensity, it lies in this category and is marked with both values.

Some examples of lexicon entries from all three categories, i.e., T(O), T(I) and T(O, I), are given in Table 1. For example, the word “کامیاب” (kamyaab, successful) has positive orientation but no intensity. Similarly, “زیادہ” (zyada, more) and “بہت” (bohat, very) both have intensity and no orientation, whereas “بہتر” (behtar, better) and “بہترین” (behtareen, best) both have positive orientation, with intensities of comparative and superlative degree, respectively.
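The three categories can be pictured as a small record type. The following is only a minimal sketch and not the implementation used in this work: the class and field names (`LexiconEntry`, `Orientation`, `category`) are hypothetical, and the label "intensifier" for modifiers is an assumption, since the text does not state how the intensity of a modifier is stored. The five entries are the examples quoted above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Orientation(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"


@dataclass(frozen=True)
class LexiconEntry:
    word: str                                   # Urdu surface form
    transliteration: str
    gloss: str
    orientation: Optional[Orientation] = None   # None for T(I) and objective terms
    intensity: Optional[str] = None             # None for T(O) and objective terms

    @property
    def category(self) -> str:
        """Return the category label: T(O), T(I), T(O, I) or objective."""
        has_o = self.orientation is not None
        has_i = self.intensity is not None
        if has_o and has_i:
            return "T(O, I)"
        if has_o:
            return "T(O)"
        if has_i:
            return "T(I)"
        return "objective"


# Example entries from the text; "intensifier" is an assumed label for modifiers.
ENTRIES = [
    LexiconEntry("کامیاب", "kamyaab", "successful", Orientation.POSITIVE),
    LexiconEntry("زیادہ", "zyada", "more", intensity="intensifier"),
    LexiconEntry("بہت", "bohat", "very", intensity="intensifier"),
    LexiconEntry("بہتر", "behtar", "better", Orientation.POSITIVE, "comparative"),
    LexiconEntry("بہترین", "behtareen", "best", Orientation.POSITIVE, "superlative"),
]
LEXICON = {entry.transliteration: entry for entry in ENTRIES}

for entry in ENTRIES:
    print(entry.transliteration, entry.category)
```

An entry with neither attribute set falls into the objective class, which matches the rule that objective terms are saved without any polarity mark.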

Figure 2. Integration of the sentiment-annotated lexicon of Urdu words with the sentiment classifier

System Integration: The annotated lexicon of Urdu words is integrated with the sentiment classifier as shown in Figure 2. First, the given text, in the form of a review, is taken from the website. The sentiment classifier component of the system preprocesses this review and segments it into sentences and then words. These words are tagged with their respective parts of speech. The tagged words are then compared with the lexicon entries for sentiment orientations and intensities. This comparison results in polarity-annotated words and phrases. The classifier then calculates the sentiment orientation of the sentences from the term polarities.
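The flow from segmentation to sentence and review polarity can be sketched as below. This is an illustrative simplification under several assumptions, not the classifier described here: part-of-speech tagging and chunking are omitted, words are matched by transliteration, the numeric weights are placeholders, the scoring rule (an intensifier scales the next subjective word, and scores are summed) is assumed because the text does not give the formula, and the negative word "bura" (bad) is added only for illustration.

```python
# Toy lexicon keyed by transliteration. All weights are placeholder assumptions.
ORIENTATION = {"acha": 1.0, "kamyaab": 1.0, "behtar": 1.0, "behtareen": 1.0, "bura": -1.0}
INTENSIFIERS = {"bohat": 2.0, "zyada": 2.0}


def sentence_polarity(sentence: str) -> float:
    """Sum the orientations of subjective words, scaled by a preceding intensifier."""
    score = 0.0
    boost = 1.0
    for token in sentence.lower().split():
        if token in INTENSIFIERS:
            boost = INTENSIFIERS[token]
        elif token in ORIENTATION:
            score += boost * ORIENTATION[token]
            boost = 1.0
        else:
            # Objective term: ignored, and it cancels any pending intensifier.
            boost = 1.0
    return score


def review_polarity(review: str) -> str:
    """Segment a review into sentences and combine the sentence polarities."""
    sentences = [s.strip() for s in review.split(".") if s.strip()]
    total = sum(sentence_polarity(s) for s in sentences)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


print(review_polarity("film bohat acha hai. anjaam bura hai"))
```

In the example review, the first sentence scores 2.0 (the intensifier doubles "acha") and the second scores -1.0, so the review is labelled positive.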

RESULTS AND DISCUSSION

As already mentioned, corpora of reviews in Urdu are not available in electronic form. Although some other corpora of news and blogs are accessible, they are not appropriate for the experimentation and evaluation of our system, because they do not contain opinionated text such as reviews.

Therefore, two corpora were manually collected as test-beds from the domains of movies and electronic appliances. The reviews were taken from different people to avoid monotonous opinions. The movie review corpus MR comprises 450 reviews in total, 226 positive and 224 negative. The product review corpus PR contains 328 reviews of electronic appliances, 177 positive and 151 negative.

Accuracy is used as the system performance metric. It measures how close the document classification suggested by our system is to the actual sentiment present in the review. A series of experiments is performed on both corpora, one after another.
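As a reminder of what the metric computes, the sketch below implements accuracy in its standard sense, the fraction of reviews whose predicted label equals the actual label. The function name and the four toy labels are illustrative assumptions and are not data from the experiments.

```python
def accuracy(predicted, actual):
    """Fraction of items whose predicted label matches the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one item is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy labels for illustration only.
predicted = ["positive", "negative", "positive", "negative"]
actual = ["positive", "negative", "negative", "negative"]
print(accuracy(predicted, actual))
```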

Table 2 shows the results, with accuracy of 66-74% for MR and 77-79% for PR. It also gives the variation in the classification of positive and negative reviews separately.

Table 2. Results of experimentation on both corpora

Category    Corpus    Accuracy
Negative    MR        66%
Negative    PR        77%
Positive    MR        74%
Positive    PR        79%

Conclusions: This research work presents the structure, development and integration of a sentiment-annotated lexicon, developed as a component of an Urdu text based sentiment analysis system. Urdu is a morphologically rich language and hence poses many challenges for the development of such a lexicon. Moreover, the unavailability of electronic text and corpora of opinionated reviews made the task even more time consuming. After its development, the lexicon was integrated with the sentiment classifier and the complete system was implemented. Two corpora, of movie and product reviews, are used for testing. Despite the inherent complexities of the language, the experiments give encouraging results, with an accuracy of about 74%, ranging from 66% to 79% across categories and corpora (Table 2). Therefore, it is planned to extend this lexicon on the same structure but with a larger coverage of words.

REFERENCES

Andreevskaia, A. and S. Bergler: Mining WordNet for fuzzy sentiment: Sentiment tag extraction from WordNet glosses. In: EACL 2006, Trento, Italy, (2006).

Annett, M. and G. Kondrak: A comparison of sentiment analysis techniques: Polarizing movie blogs. In: Bergler, S. (ed.) Canadian AI 2008. LNCS (LNAI), vol. 5032, pp. 25–35. Springer, Heidelberg, (2008).

Glaser, J., J. Dixit and P. D. Green: Studying hate crime with the Internet: What makes racists advocate racial violence, Journal of Social Issues 58, 1, 177–193, (2002).

Hatzivassiloglou, V. and J. Wiebe: Effects of Adjective Orientation and Gradability on Sentence Subjectivity. In: 18th International Conference on Computational Linguistics, New Brunswick, NJ, (2000).

Higashinaka, R., M. Walker and R. Prasad: Learning to generate naturalistic utterances using reviews in spoken dialogue systems. ACM Transactions on Speech and Language Processing (TSLP), (2007).

Hu, M. and B. Liu: Mining and summarizing customer reviews. In: Conference on Human Language Technology and Empirical Methods in Natural Language Processing, (2005).

Humayoun, M., H. Hammarström and A. Ranta: Urdu morphology, orthography and lexicon extraction. In: A. Farghaly and K. Megerdoomian (Eds.), Proceedings of the 2nd Workshop on Computational Approaches to Arabic Script-based Languages, pp. 59–66. Stanford LSA, (2007).

Ijaz, M. and S. Hussain: Corpus based Urdu Lexicon Development. In: Conference on Language Technology (CLT 2007), University of Peshawar, Pakistan, (2007).

Muaz, A., A. Ali and S. Hussain: Analysis and Development of Urdu POS Tagged Corpora. In: Proceedings of the 7th Workshop on Asian Language Resources, IJCNLP, (2009).

Mukund, S., D. Ghosh and R. K. Srihari: Using Cross-Lingual Projections to Generate Semantic Role Labeled Corpus for Urdu - A Resource Poor Language. In: 23rd International Conference on Computational Linguistics (COLING), (2010).

Pang, B. and L. Lee: Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval 2(1-2), 1–135, (2008).

Riloff, E., J. Wiebe and T. Wilson: Learning subjective nouns using extraction pattern bootstrapping. In: Proceedings of the Conference on Natural Language Learning (CoNLL), pp. 25–32, (2003).

Turney, P.: Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the Association for Computational Linguistics (ACL), pp. 417–424, (2002).

Yu, H. and V. Hatzivassiloglou: Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), (2003).
