TY - GEN
T1 - If you are happy and you know it... tweet
AU - Asiaee T., Amir
AU - Tepper, Mariano
AU - Banerjee, Arindam
AU - Sapiro, Guillermo
PY - 2012
Y1 - 2012
AB - Extracting sentiment from Twitter data is one of the fundamental problems in social media analytics. Twitter's length constraint renders determining the positive/negative sentiment of a tweet difficult, even for a human judge. In this work we present a general framework for per-tweet (in contrast with batches of tweets) sentiment analysis which consists of: (1) extracting tweets about a desired target subject, (2) separating tweets with sentiment, and (3) setting apart positive from negative tweets. For each step, we study the performance of a number of classical and new machine learning algorithms. We also show that the intrinsic sparsity of tweets allows performing classification in a low-dimensional space, via random projections, without losing accuracy. In addition, we present weighted variants of all employed algorithms, exploiting the available labeling uncertainty, which further improve classification accuracy. Finally, we show that spatially aggregating our per-tweet classification results produces a very satisfactory outcome, making our approach a good candidate for batch tweet sentiment analysis.
KW - bayes classification
KW - compressed learning
KW - sparse modeling
KW - supervised learning
KW - svm
KW - twitter sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=84871071465&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84871071465&partnerID=8YFLogxK
U2 - 10.1145/2396761.2398481
DO - 10.1145/2396761.2398481
M3 - Conference contribution
AN - SCOPUS:84871071465
SN - 9781450311564
T3 - ACM International Conference Proceeding Series
SP - 1602
EP - 1606
BT - CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management
T2 - 21st ACM International Conference on Information and Knowledge Management, CIKM 2012
Y2 - 29 October 2012 through 2 November 2012
ER -