Abdul-Mageed, M. & Korayem, M. (2010). Automatic Identification of Subjectivity in Morphologically Rich Languages: The Case of Arabic. In Proceedings of the 1st Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2010). Lisbon, Portugal.
Abstract. As more user-generated content becomes available online, the need to mine that content becomes increasingly critical. One related area that has witnessed a flurry of research is subjectivity and sentiment analysis. We report our efforts to annotate a corpus of 200 documents from the Penn Arabic Treebank, which is composed of news texts, for subjectivity, along with attempts to automatically classify that data at the sentence level. We investigate the performance of three different machine learning methods on the task with various features and vector settings. We achieve very high accuracy using a support vector machine classifier. Finally, we briefly discuss issues related to performing text classification on Arabic, a morphologically rich language, and suggest future directions.