Outlier Detection Techniques for Biased Opinion Discovery

Jongheum Yeon, Junho Shim, Sang-goo Lee

Abstract


Social media users post many kinds of opinions, such as product reviews and movie reviews, and customers commonly rely on these opinions when making decisions. However, as the use of opinions grows, distorted feedback has also increased. For example, exaggerated positive opinions are posted to promote target products, and so are negative opinions that are far from common evaluations. Finding these biased opinions has become important for keeping social media reliable. Opinion mining (or sentiment analysis) techniques have been developed to determine the sentiment polarity of opinionated documents, and they can be used to find biased opinions. However, previous techniques have drawbacks: they categorize text only as positive or negative, and they need a large amount of training data to build a classifier. In this paper, we propose methods for discovering biased opinions, that is, opinions skewed from the overall common opinion. The methods are based on angle-based outlier detection and personalized PageRank, and can be applied without training data. We analyze the performance of the proposed techniques by presenting experimental results on a movie review dataset.
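
To illustrate the idea behind angle-based outlier detection, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes each review has already been mapped to a numeric feature vector; the two-dimensional `reviews` array and the function name `abof_scores` are hypothetical. It also uses a simplified, unweighted form of the angle-based outlier factor: for each point, the variance of the distance-scaled dot products between the difference vectors to every pair of other points. A point far from the bulk of the data sees all other points in nearly the same direction, so its variance is small.

```python
import itertools

import numpy as np


def abof_scores(points):
    """Return a simplified (unweighted) angle-based outlier factor per row.

    Lower scores suggest outliers. This is an illustrative variant, not
    necessarily the exact formulation used by the authors.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    scores = np.empty(n)
    for i in range(n):
        # Difference vectors from point i to every other point.
        diffs = np.delete(points, i, axis=0) - points[i]
        values = []
        for a, b in itertools.combinations(diffs, 2):
            na, nb = a @ a, b @ b
            if na == 0.0 or nb == 0.0:
                continue  # skip exact duplicates of point i
            # Dot product scaled by the squared lengths of both vectors.
            values.append((a @ b) / (na * nb))
        scores[i] = np.var(values) if values else 0.0
    return scores


# Hypothetical review features: five similar reviews and one skewed review.
reviews = np.array([
    [0.60, 0.50],
    [0.50, 0.60],
    [0.55, 0.45],
    [0.45, 0.55],
    [0.50, 0.50],
    [5.00, -4.00],
])

scores = abof_scores(reviews)
most_biased = int(np.argmin(scores))  # index of the lowest-variance point
print(most_biased, scores)
```

Because the score depends only on the geometry of the feature vectors, no labeled training data is required, which matches the motivation stated above. The pairwise loop is cubic in the number of reviews, so this sketch is only practical for small samples.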



