A New Ensemble Machine Learning Technique with Multiple Stacking

Su-eun Lee, Han-joon Kim

Abstract


Machine learning refers to a technique for generating models that solve specific problems by generalizing from given data. To build a high-performance model, both high-quality training data and a suitable learning algorithm for the generalization process must be prepared. One way to improve the performance of the learned model is the ensemble approach, which generates multiple models rather than a single one; representative ensemble techniques include bagging, boosting, and stacking. This paper proposes a new ensemble technique with multiple stacking that outperforms the conventional stacking technique. The learning structure of the multiple stacking ensemble resembles that of deep learning: each layer is composed of a combination of stacking models, and layers are added so as to minimize the misclassification rate at each layer. Through experiments on four types of datasets, we show that the proposed method outperforms existing ones.
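
As a rough illustration of the layered stacking idea described above (a sketch under stated assumptions, not the authors' exact algorithm), the following Python fragment builds each layer as a scikit-learn StackingClassifier, appends each layer's out-of-fold class probabilities to the feature set of the next layer, and keeps adding layers only while the validation misclassification rate decreases. The dataset, the choice of base learners and meta-learner, and the stopping rule are illustrative assumptions.

```python
# Sketch of a multi-layer stacking ensemble (assumptions: scikit-learn,
# breast-cancer data, DT/NB/kNN base learners, logistic meta-learner).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

def make_layer():
    """One stacking model: heterogeneous base learners plus a logistic meta-learner."""
    base = [("dt", DecisionTreeClassifier(max_depth=5)),
            ("nb", GaussianNB()),
            ("knn", KNeighborsClassifier(n_neighbors=5))]
    return StackingClassifier(estimators=base,
                              final_estimator=LogisticRegression(max_iter=1000),
                              cv=5)

best_err, layers = 1.0, []
F_tr, F_va = X_tr, X_va
for depth in range(1, 6):                       # cap the number of layers
    layer = make_layer().fit(F_tr, y_tr)
    err = 1.0 - layer.score(F_va, y_va)         # misclassification rate of this layer
    if err >= best_err:                         # stop when another layer no longer helps
        break
    best_err, layers = err, layers + [layer]
    # Out-of-fold probabilities become extra input features for the next layer.
    oof = cross_val_predict(make_layer(), F_tr, y_tr, cv=5, method="predict_proba")
    F_tr = np.hstack([F_tr, oof])
    F_va = np.hstack([F_va, layers[-1].predict_proba(F_va)])

print(f"layers kept: {len(layers)}, validation error: {best_err:.3f}")
```

In this sketch the layer count is chosen greedily from validation error; the paper's actual layer-growth and combination rules may differ.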



