Muhammad Abdul-Mageed's Blog
Brief notes on my academic interests, research, etc.
Thursday, September 20, 2012
Saturday, June 09, 2012
SAMAR: A System for Subjectivity and Sentiment Analysis of Arabic Social Media
Abdul-Mageed, M., Kuebler, S., & Diab, M. (2012). SAMAR: A System for Subjectivity and Sentiment Analysis of Social Media Arabic. Proceedings of the 3rd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA), held in conjunction with the 50th Annual Meeting of the Association for Computational Linguistics. ICC Jeju, Republic of Korea, July 12, 2012.
Tuesday, April 10, 2012
New Software: A system for Arabic subjectivity and sentiment analysis
New Software: A system for Arabic segmentation and morphosyntactic disambiguation
Friday, March 02, 2012
Linguistic features, language variety, and sentiment in online Arabic

[Photo Credit]
Abdul-Mageed, M., Abu Mostafa, H. (Forthcoming, 2012, April 19-21). Linguistic features, language variety, and sentiment in online Arabic. Pragmatics Festival. Indiana University, Bloomington, USA.
Friday, February 10, 2012
BANADOURA: The Shakespeare Language Syrian Dictator Twitter Insulter



This is a tiny script that uses Shakespearean language* to insult the bloodthirsty Syrian dictator. BANADOURA is geared toward tweet generation, with the two hashtags "#Bashar" and "#Syria". It is written in Python and is easy to customize (for example, to generate tweets on other topics or in other languages or language varieties). Hopefully someone can take it and develop it further into an executable. (Sorry, I did not have time to work on it any further.) If you do develop it further, I'd appreciate it if you let me know so that I can point people to your baby. The simple code is below, with a sample run. I typically generate about 20 tweets per run. Why did I call it BANADOURA? Well, this is classified info... I will tell you later, maybe.
[For updates, follow me on Twitter https://twitter.com/#!/mageed]
*[The Shakespeare words are taken from here]
========================= CODE ==============================
#!/usr/bin/python
# -*- coding: utf-8 -*-
######################
__author__="mam"
__date__ ="02/10/2012"
######################
from random import choice
######################
columnA=["artless", "bawdy", "beslubbering", "bootless",\
"churlish", "cockered", "clouted", "craven", \
"currish", "dankish", "dissembling", "droning", "errant",\
"fawning", "fobbing ", "froward", "frothy", "gleeking", "goatish",\
"gorbellied", "impertinent", "infectious", "jarring", "loggerheaded",
"lumpish", "mammering", "mangled", "mewling ", "paunchy", "pribbling",\
"puking", "puny", "qualling", "rank", "reeky", "roguish", "ruttish", "saucy",\
"spleeny", "spongy", "surly", "tottering", "unmuzzled", "vain", "venomed", \
"villainous", "warped", "wayward", "weedy", "yeasty"]
#######################
columnB= ["base-court", "bat-fowling", "beef-witted", "beetle-headed", "boil-brained",\
"clapper-clawed", "clay-brained", "common-kissing", "crook-pated",\
"dismal-dreaming", "dizzy-eyed", "doghearted",\
"dread-bolted", "earth-vexing", "elf-skinned", "fat-kidneyed",\
"fen-sucked", "flap-mouthed", "fly-bitten", "folly-fallen",\
"fool-born", "full-gorged", "guts-griping", "half-faced",\
"hasty-witted", "hedge-born", "hell-hated", "idle-headed",\
"ill-breeding", "ill-nurtured", "knotty-pated", "milk-livered",\
"motley-minded", "onion-eyed", "plume-plucked", "pottle-deep",\
"pox-marked", "reeling-ripe", "rough-hewn", "rude-growing",\
"rump-fed", "shard-borne", "sheep-biting",\
"spur-galled", "swag-bellied", "tardy-gaited", "tickle-brained",\
"toad-spotted", "urchin-snouted",\
"weather-bitten"]
#######################
columnC= ["apple-john", "baggage", "barnacle", "bladder", "boar-pig", "bugbear", "bum-bailey",\
"canker-blossom", "clack-dish", "clotpole", "coxcomb", "codpiece", "death-token",\
"dewberry", "flap-dragon", "flax-wench", "flirt-gill", "foot-licker", "fustilarian",\
"giglet", "gudgeon", "haggard", "harpy", "hedge-pig", "horn-beast", "hugger-mugger",\
"jolthead", "lewdster", "lout", "maggot-pie", "malt-worm", "mammet", "measle",\
"minnow", "miscreant", "moldwarp", "mumble-news", "nut-hook", "pigeon-egg", "pignut",\
"puttock", "pumpion", "ratsbane", "scut", "skainsmate", "strumpet", "varlet",\
"vassal", "whey-face", "wagtail"]
#######################
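if __name__ == "__main__":
    # Single-string print calls run unchanged on Python 2 and Python 3.
    print("Welcome to BANADOURA, The Syrian Dictator Tweet Insulter...\n" + "*"*59 + "\n")
    # range(1, 20) yields 19 tweets, as in the sample run; use range(20) for 20.
    for i in range(1, 20):
        print("#Bashar, Mr. bloodthirsty, thou art " + choice(columnA) + ", " + choice(columnB) + ", " +
              choice(columnC) + "!! #Syria")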
========================= SAMPLE RUN ========================
Welcome to BANADOURA, The Syrian Dictator Tweet Insulter...
***********************************************************
#Bashar, Mr. bloodthirsty, thou art spleeny, fen-sucked, gudgeon!! #Syria
#Bashar, Mr. bloodthirsty, thou art roguish, pox-marked, bladder!! #Syria
#Bashar, Mr. bloodthirsty, thou art froward, reeling-ripe, skainsmate!! #Syria
#Bashar, Mr. bloodthirsty, thou art frothy, ill-breeding, bladder!! #Syria
#Bashar, Mr. bloodthirsty, thou art spongy, toad-spotted, haggard!! #Syria
#Bashar, Mr. bloodthirsty, thou art infectious, beef-witted, giglet!! #Syria
#Bashar, Mr. bloodthirsty, thou art loggerheaded, common-kissing, measle!! #Syria
#Bashar, Mr. bloodthirsty, thou art currish, beetle-headed, whey-face!! #Syria
#Bashar, Mr. bloodthirsty, thou art warped, half-faced, harpy!! #Syria
#Bashar, Mr. bloodthirsty, thou art impertinent, pottle-deep, lout!! #Syria
#Bashar, Mr. bloodthirsty, thou art jarring, base-court, puttock!! #Syria
#Bashar, Mr. bloodthirsty, thou art mewling , fat-kidneyed, puttock!! #Syria
#Bashar, Mr. bloodthirsty, thou art droning, base-court, clack-dish!! #Syria
#Bashar, Mr. bloodthirsty, thou art saucy, doghearted, pigeon-egg!! #Syria
#Bashar, Mr. bloodthirsty, thou art roguish, doghearted, barnacle!! #Syria
#Bashar, Mr. bloodthirsty, thou art cockered, ill-breeding, canker-blossom!! #Syria
#Bashar, Mr. bloodthirsty, thou art wayward, hedge-born, barnacle!! #Syria
#Bashar, Mr. bloodthirsty, thou art craven, boil-brained, measle!! #Syria
#Bashar, Mr. bloodthirsty, thou art dankish, base-court, baggage!! #Syria
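The customization mentioned in the post (other topics, other hashtags) could be sketched roughly as below. This is only an illustration in Python 3, not part of the original script: the function name `make_insults`, its parameters, and the shortened word lists are assumptions made for the example.

```python
import random

# Shortened excerpts of the three word columns from the script above.
COLUMN_A = ["artless", "bawdy", "churlish", "craven"]
COLUMN_B = ["beef-witted", "clay-brained", "fen-sucked", "toad-spotted"]
COLUMN_C = ["barnacle", "coxcomb", "lout", "puttock"]


def make_insults(target, opening_tag, closing_tag, count=20, seed=None):
    """Return `count` insult tweets for `target` (hypothetical helper)."""
    rng = random.Random(seed)  # passing a seed makes a run repeatable
    tweets = []
    for _ in range(count):
        tweets.append("%s, %s, thou art %s, %s, %s!! %s" % (
            opening_tag, target,
            rng.choice(COLUMN_A), rng.choice(COLUMN_B), rng.choice(COLUMN_C),
            closing_tag))
    return tweets


if __name__ == "__main__":
    for tweet in make_insults("Mr. bloodthirsty", "#Bashar", "#Syria", count=3):
        print(tweet)
```

Swapping in other word lists or hashtags only changes the arguments, so the same function could serve other topics or languages.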
Wednesday, February 01, 2012
AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis


Abdul-Mageed, M. & Diab, M. (Forthcoming, 2012). AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis. The 8th International Conference on Language Resources and Evaluation (LREC2012). Istanbul, Turkey.
[photo credits: Google pics]
Saturday, December 10, 2011
Egyptians' Latest Trend of 'Cyberactivism?': #occupyFacebook!

[Photo credit]
Egyptian activists are flooding the Facebook walls of celebrities like Vin Diesel and Shakira, as well as political figures like Obama, with satirical posts about the Egyptian revolution. Egyptian Twitter users are reporting the activity under the hashtag #occupyfacebook. Most of the FB wall comments are funny, with something like:
English: Hey uncle Diesel, come over to Egypt and solve the problem with the police and thugs. You're such a good guy and will be up to it. I promise you a couple of these Libya-smuggled marijuana cigarettes?
Another comment on Shakira's wall mimics an Egyptian song that is usually dedicated to mothers on Mother's Day. It runs as follows:
ست الحبايب يا شاكيرا .. يا أغلى من روبي ونانسي!
English: Oh Shakira, most loved, you're dearer than Rouby and Nancy.
(Rouby and Nancy are two popular Arab singers).
Comments on Obama's FB page are fast growing, with ~100 comments per minute. These include serious comments in English like "Stop Exporting tear Gas to Egypt" and:
English: "Tell the Marshal to hand over power, we're really naughty and won't release the page."
Many comments are on the funnier side, including ones describing how to cook many types of Egyptian food:
From Obama's page: Chicken Kibdaky (English: a spoof of "KFC"):
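1كيلو دبابيس فراخ +1كيس بقسماط+2بيضة+شوية دقيق درة+[الطريقة نسلق الفراخ بعد كدة نحطها في الفريزر مدة سعتين وبعدين نطلعهم من الفريزر نحطهم في البيض والبقسماط وبعدين في الدقيق وبعد كدة ارجعي حطيهم في الفريزر تاني لمدة ساعتين واللة بيعملوها كدة في كنتاكي ممكن تعملي كتير وتبقي تطلعيها في اي وقت تحمري في الزيت جميلة وتنشفيها علي ورق مناديل وبالف هنا ويارب تعجبكم :D
English: 1 kilo of chicken drumsticks + 1 bag of breadcrumbs + 2 eggs + a little corn flour + [Method: boil the chicken, then put it in the freezer for two hours, then take it out of the freezer and dip it in the egg and breadcrumbs, then in the flour, and after that put it back in the freezer for another two hours. I swear that's how they make it at Kentucky. You can make a lot and take some out any time to fry nicely in oil and dry on paper tissues. Bon appetit, and I hope you like it :D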
From Diesel's page: Good-smelling taqliyah (English: "sauce"):
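- ضعي ملعقة السمن في طاسة صغيرة وضعي باقي خلطة الثوم عليها
- عندما يتحمر الثوم ويصبح لونه ذهبي .... أوعي تحرقيه ... خليكي شاطرة .. تضيفيه على الملوخية ... وياسلااام على الريحة والطشة ....
- وبعدين تقلبي كله على بعضه وبالهنااااااو الشفاااااا .....
English:
- Put the spoonful of ghee in a small pan and add the rest of the garlic mixture to it.
- When the garlic browns and turns golden .... don't you dare burn it ... be a clever girl .. add it to the molokhia ... and oh, what a smell and sizzle ....
- Then stir it all together, and bon appetit .....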
Yet other comments draw on Egyptian folklore, proverbs, and popular songs, such as:
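ادي الواد لأبوه
ياعيني الواد بيعيط شيل الواد من الارض
English: "Give the boy to his father. Poor thing, the boy is crying; pick the boy up off the ground."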
Other comments are even funnier, imagining Obama as a serving boy in a cafe:
عشان جعان فين هم ياد يا اوباما
English: "I ordered tea and Nescafe. Where are they, Obama?"
Performances include imagining the Wall as a physical space. In the following comment it is likened to a user's house:
English: "Make us a couple of cups of tea... grab a chair and come on in, it's your house."
The wall is also likened to a playground:
English: "We wanna play soccer, let's play on this wall..."
and a community gathering:
English: "Does anyone have a jam sandwich?"
The wall is also likened to a bedroom:
English: "Stop this noise; I need to sleep!"
Activists also use Glitchrs, like the one below:
.... j̡̆ͣͯ̆҉̸͈͖̙͙͍̰̺̥̖̯̠̼̺̳̞ͅj̋
ᅠᅠ
ᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠ
▁▂▃▄▅▆▇█▓▒░␥ ... EGYPT ... ␥░▒▓█▇▆▅▄▃▂▁
ᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠᅠ
Monday, October 24, 2011
Toward Building a Large-Scale Arabic Sentiment Lexicon

Abdul-Mageed, M. & Diab, M. (Forthcoming, 2012). Toward Building a Large-Scale Arabic Sentiment Lexicon. In Proceedings of the 6th International Global WordNet Conference, January 9-13, Matsue, Japan. [pdf][bib]
[photo 1 credit] [photo 2 credit]
Saturday, September 17, 2011
Linguistically-Motivated Subjectivity and Sentiment Annotation and Tagging of Modern Standard Arabic
Statistical Parsing, Computational Pragmatics, Computational Lexical Semantics, and Semitic Morphology & Syntax

*[photo credit: http://verbs.colorado.edu/LSA2011/manyfaces/nlp.html]
During part of summer 2011, I attended the Linguistic Society of America's Summer Institute (LSA 2011). It was held at the University of Colorado at Boulder. I took four courses, as follows:
- Statistical Parsing: (with Rebecca Hwa, Department of Computer Science, University of Pittsburgh). We covered various parsing algorithms. We also built a parser for a dummy language using some code provided by Rebecca. After we wrote a grammar for the language, I developed a simple algorithm that optimized the parser and improved on the performance of the non-optimized parser by about 2%. Rebecca, Sandra (Kuebler, my wonderful IU advisor), and I had a great time dining in Boulder and eating ice cream. Oh, we also had an eventful trip to the Rocky Mountains!
- Computational Pragmatics: (with Chris Potts, from the Department of Linguistics, Stanford University). We looked into various ways of computing pragmatic phenomena using corpora. Chris had lots of data and the class was pretty interactive. We turned in quick proof-of-concept exercises (and of course I did it all in Python ;)). I turned in a proof-of-concept system for gender detection as a final project. Hopefully I will have time to improve and publish it!
- Computational Lexical Semantics: (with Martha Palmer, from Colorado Linguistics, & Christiane Fellbaum, from the Princeton Computer Science Dept.). We covered a lot of computational semantics, including (multi-lingual) semantic role labeling. Martha and Christiane introduced several resources, and we turned in exercises where we used such resources in meaningful ways. I turned in a Web-mining project that I am excited about!! Christiane was generous with her time and we met once for coffee (my treat :), I managed to convince Christiane) and another time for lunch.
- Semitic Morphology & Syntax: (with Abbas Benmamoun, from Univ. of Illinois Linguistics, and Adam Ussishkin, from Univ. of Arizona Linguistics). We covered what the course title suggests, and I reviewed two articles about the automatic processing of Hebrew and Egyptian Arabic as a final project.
Saturday, September 03, 2011
Tweeting in Arabic: What, How and Whither (New!)
Abdul-Mageed, M., Albogmi, H., Gerrio, A., Hamed, E., & Aldibasi, O. (2011, October 10-13). Tweeting in Arabic: What, How and Whither. A paper accepted for presentation at the 12th annual conference of the Association of Internet Researchers (Internet Research 12.0 – Performance and Participation). Seattle, USA.
Reception of the Obama Healthcare Reform Plan in Professional and User-Generated Web Content
YoussefAgha, A., Abdul-Mageed, M., Loherman, D., Lieberman, T. (2011, October 10-13). Reception of the Obama Healthcare Reform Plan in Professional and User-Generated Web Content. A paper accepted for presentation at the 12th annual conference of the Association of Internet Researchers (Internet Research 12.0 – Performance and Participation). Seattle, USA.
Taghreed?: What Arabs say on Twitter and how they say it.
Abdul-Mageed, M., Albogmi, H. (2011). Taghreed?: What Arabs say on Twitter and how they say it. Georgetown University Round Table on Languages and Linguistics (GURT2011). Language and New Media: Discourse 2.0. (Poster).
Thursday, June 16, 2011
"Yes we can?": Subjectivity Annotation and Tagging for the Health Domain


Abdul-Mageed, M., Korayem, M. & YoussefAgha, A. (2011). "Yes we can?": Subjectivity Annotation and Tagging for the Health Domain. The International Conference on Recent Advances in Natural Language Processing (RANLP2011), 12-14 September, Hissar, Bulgaria. [pdf] [bib]
Tuesday, May 10, 2011
Subjectivity and Sentiment Annotation of Modern Standard Arabic Newswire


Abdul-Mageed, M. & Diab, M. (To appear, 2011). Subjectivity and Sentiment Annotation of Modern Standard Arabic Newswire. Proceedings of the Fourth Linguistic Annotation Workshop. Portland, Oregon, USA, June 23-24, 2011. [pdf] [bib]
Sunday, May 08, 2011
SUBJECTIVITY AND SENTIMENT ANALYSIS OF MODERN STANDARD ARABIC



Abdul-Mageed, M., Diab, M. & Korayem, M. (To appear, 2011). SUBJECTIVITY AND SENTIMENT ANALYSIS OF MODERN STANDARD ARABIC. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. Portland, Oregon, USA, June 19-24, 2011. [pdf] [bib]
Thursday, March 17, 2011
Automatic Detection of Arabic Non-Anaphoric Pronouns for Improving Anaphora Resolution

NEW PUBLICATION
Abdul-Mageed, M. (2011). Automatic Detection of Arabic Non-Anaphoric Pronouns for Improving Anaphora Resolution. ACM Transactions on Asian Language Information Processing (TALIP), 10(1), 5. (15% acceptance rate as of 2009)
doi: 10.1145/1929908.1929913
Full text: PDF
Anaphora resolution is one of the most difficult tasks in NLP. The ability to identify non-referential pronouns before attempting an anaphora resolution task would be significant, since the system would not have to attempt resolving such pronouns and hence end up with fewer errors. In addition, the number of non-referential pronouns has been found to be non-trivial in many domains. The task of detecting non-referential pronouns could also be incorporated into a part-of-speech tagger or a parser, or treated as an initial step in semantic interpretation. In this article, I describe a machine learning method for identifying non-referential pronouns in an annotated subsegment of the Penn Arabic Treebank using three different feature settings. I achieve an accuracy of 97.22% with 52 different features extracted from a small window size of -5/+5 tokens surrounding each potentially non-referential pronoun.
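The -5/+5 token window mentioned in the abstract can be pictured with a short sketch. This is only an assumed illustration in Python, not the article's actual feature extractor: the function name `context_window`, the padding symbol, and the toy English sentence are all hypothetical.

```python
def context_window(tokens, index, size=5, pad="<PAD>"):
    """Return the `size` tokens before and after position `index`.

    Positions that fall outside the sentence are filled with `pad`,
    so each side of the window always has exactly `size` entries.
    """
    left = [tokens[i] if i >= 0 else pad
            for i in range(index - size, index)]
    right = [tokens[i] if i < len(tokens) else pad
             for i in range(index + 1, index + size + 1)]
    return left, right


if __name__ == "__main__":
    # Toy sentence (assumption): the pronoun of interest is "it".
    sentence = "it seems that the committee will meet tomorrow".split()
    left, right = context_window(sentence, sentence.index("it"))
    print(left)   # context before the pronoun
    print(right)  # context after the pronoun
```

Features for a classifier could then be read off the tokens in each window; which features the article actually uses is not specified here.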
Friday, December 24, 2010
Linguistic features, language variety, and sentiment in Arabic comments on Aljazeera and Alarabiya YouTube Videos


Abdul-Mageed, M., AlAhmed, A. & Korayem, M. (2011). Linguistic features, language variety, and sentiment in Arabic comments on Aljazeera and Alarabiya YouTube Videos. Georgetown University Round Table on Languages and Linguistics (GURT2011). Language and New Media: Discourse 2.0.

PDF