Survey on Vector Similarity Measures: Focusing on Algebraic Characteristics

Dongjoo Lee, Junho Shim

Abstract


Objects such as products, product reviews, and user profiles are important in the e-commerce domain. The vector is one of the most widely used object representation schemes. Information about e-commerce objects may be modeled by vectors in which feature values are assigned to the various dimensions. E-commerce objects are generally large in number, while some of them are similar or even identical in reality. Measuring the similarity between objects therefore plays an important role. In this paper, we survey the state-of-the-art vector similarity measures. We analyze the measures to identify their algebraic characteristics and the relationships among them, and we classify the measures accordingly. We then present the features that a standard vector similarity measure should have.
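As a minimal illustration of the setting described above, the following Python sketch models two objects as sparse feature vectors and compares them with two widely used measures, cosine similarity and the extended Jaccard (Tanimoto) coefficient. The choice of these two measures, the dictionary representation, the function names, and the sample feature names are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def dot(u, v):
    """Inner product of two sparse vectors stored as {dimension: value} dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(value * v.get(dim, 0.0) for dim, value in u.items())


def cosine(u, v):
    """Cosine similarity: u.v / (|u| * |v|); returns 0.0 for a zero vector."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def extended_jaccard(u, v):
    """Extended Jaccard (Tanimoto): u.v / (|u|^2 + |v|^2 - u.v)."""
    uv = dot(u, v)
    denom = dot(u, u) + dot(v, v) - uv
    return uv / denom if denom else 0.0


# Hypothetical product vectors: each dimension is a feature, each value a weight.
a = {"battery": 1.0, "screen": 1.0}
b = {"screen": 1.0, "price": 1.0}

print(cosine(a, b))
print(extended_jaccard(a, b))
```

Both functions are built from the same three inner products (u.v, u.u, v.v) and differ only in how they combine them, which is one simple sense in which such measures can be related algebraically. For binary vectors like the ones above, the extended Jaccard value coincides with the set-based Jaccard coefficient: one shared feature out of three distinct features.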



