sentiment-annotated lexicon construction for an urdu ... - Paas.com.pk
Pakistan Journal of Science (Vol. 63 No. 4 Dec, 2011)
Table 2. Results of experimentation on both corpora

Category   Corpus   Accuracy
Negative   MR       66%
Negative   PR       77%
Positive   MR       74%
Positive   PR       79%
Conclusions: This research work presents the structure,
development and integration of a sentiment-annotated
lexicon, developed as a component of an Urdu text-based
sentiment analysis system. Urdu is a morphologically
rich language and hence poses many challenges for the
development of such a lexicon. Moreover, the
unavailability of electronic text and of corpora of
opinionated reviews makes the task even more time
consuming. The next step after the development of the
lexicon is its integration with the sentiment classifier and
the final implementation of the complete system. Two types
of corpora are used for testing, i.e., movie reviews (MR)
and product reviews (PR). Despite the inherent
complexities of the language, the experimentation gives
good results, with accuracies between 66% and 79% (Table 2)
and an overall accuracy of about 74%.
Therefore, it is planned to extend this lexicon on the same
structure but with larger coverage of words.
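
The relation between the overall figure of about 74% and the four entries of Table 2 can be checked with a short calculation. The sketch below is only an illustration: it assumes that the overall accuracy is the unweighted mean of the four reported values, which the paper does not state explicitly, and the names `TABLE_2` and `mean_accuracy` are hypothetical.

```python
# Accuracy values (in percent) copied from Table 2.
# Keys are (category, corpus); MR and PR are the corpus labels used in the table.
TABLE_2 = {
    ("Negative", "MR"): 66,
    ("Negative", "PR"): 77,
    ("Positive", "MR"): 74,
    ("Positive", "PR"): 79,
}


def mean_accuracy(results, category=None, corpus=None):
    """Unweighted mean of the entries matching the optional filters.

    Assumption: every entry carries equal weight, because the sizes of
    the test sets are not given in this section.
    """
    selected = [
        value
        for (cat, corp), value in results.items()
        if (category is None or cat == category)
        and (corpus is None or corp == corpus)
    ]
    return sum(selected) / len(selected)


overall = mean_accuracy(TABLE_2)                 # mean of all four entries
movie_only = mean_accuracy(TABLE_2, corpus="MR")
product_only = mean_accuracy(TABLE_2, corpus="PR")

print(f"overall: {overall:.1f}%")
print(f"MR: {movie_only:.1f}%, PR: {product_only:.1f}%")
```

Under this assumption the mean of the four entries is 74%, which agrees with the figure quoted in the conclusions. A mean weighted by the number of reviews per corpus could differ.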