A Study on Method for User Gender Prediction Using Multi-Modal Smart Device Log Data

Yoonjung Kim, Yerim Choi, Solee Kim, Kyuyon Park, Jonghun Park

Abstract


Gender information about a smart device user is essential for providing personalized services, and multi-modal data obtained from the device is useful for predicting the user's gender. However, how each modality should be used for gender prediction depends on the characteristics of the data. This study therefore proposes an ensemble method that predicts the gender of a smart device user by combining three classifiers that take text, application, and acceleration data as inputs, respectively. To alleviate the privacy issues that arise when text data generated on a smart device is sent off the device, the text-based classifier scans the text data only on the device and classifies the user's gender by matching the text against predefined word sets. The application-based classifier assigns gender labels to executed applications and predicts the user's gender by comparing the ratio of the labels. The acceleration-based classifier applies a Support Vector Machine to acceleration data. The proposed method was evaluated on actual smart device log data collected from an Android application, and the experimental results showed that it outperformed the compared methods.
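The word-set matching and label-ratio steps described in the abstract can be illustrated with a minimal Python sketch. The word sets, application labels, tie handling, and the majority-vote combination rule below are all assumptions made for illustration; the abstract does not specify them, and the acceleration-based SVM classifier is represented only by its output label.

```python
from collections import Counter

# Hypothetical word sets and app labels (not taken from the paper).
MALE_WORDS = {"football", "bro"}
FEMALE_WORDS = {"makeup", "sis"}
APP_LABELS = {"com.example.sports": "M", "com.example.beauty": "F"}


def classify_text(tokens):
    """Count matches against each predefined word set; None on a tie."""
    male = sum(t in MALE_WORDS for t in tokens)
    female = sum(t in FEMALE_WORDS for t in tokens)
    if male == female:
        return None
    return "M" if male > female else "F"


def classify_apps(executed_apps):
    """Compare how many executed apps carry each gender label; None on a tie."""
    counts = Counter(APP_LABELS[a] for a in executed_apps if a in APP_LABELS)
    if counts["M"] == counts["F"]:
        return None
    return "M" if counts["M"] > counts["F"] else "F"


def majority_vote(votes):
    """Assumed combination rule: plain majority over non-abstaining votes."""
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    (top, n), *rest = counts.most_common()
    if rest and rest[0][1] == n:
        return None  # tie between labels
    return top
```

Here `majority_vote([classify_text(tokens), classify_apps(apps), accel_label])` would combine the three outputs, with `accel_label` standing in for the prediction of a separately trained SVM. Because the text step only needs the token list and the word sets, it can run entirely on the device, which is the privacy property the abstract emphasizes.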



