A Study on the Performance Evaluation of Machine Learning for Predicting the Number of Movie Audiences

Chan-Mi Jeong, Daiki Min


Accurate prediction of box office performance at an early stage is crucial for the film industry to make better managerial decisions. With the aim of improving prediction performance, this paper evaluates the use of machine learning methods. We tested both classification-based and regression-based methods, including k-NN, SVM, and Random Forest. We first evaluate the input variables; the results show that reputation-related information generated during the first two weeks after release is significant. Prediction tests show that regression-based methods yield lower prediction error, and that Random Forest in particular outperforms the other machine learning methods. Regression-based methods have better predictive power for films with small box office earnings, whereas classification-based methods work better for predicting large box office earnings.





