This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Automatic Selection of Classification Algorithms for Non-Experts Using Meta-Features
Munehiro Nakamura, Atsushi Otsuka, Haruhiko Kimura
DOI:10.17265/1537-1514/2014.03.006
Munehiro Nakamura, Ph.D., Department of Information Science, Kanazawa Institute of Technology.
Atsushi Otsuka, B.D., Department of Natural Science and Technology, Kanazawa University.
Haruhiko Kimura, Ph.D., Department of Natural Science and Technology, Kanazawa University.
With the arrival of the big-data society, methods for classifying real-world problems have attracted much attention from researchers and developers in various fields. In recent years, much effort has been devoted to improving the performance of classification algorithms by adding functions or remedying their weaknesses. However, because such a large variety of classification algorithms is available, it is difficult for non-experts to find one that achieves good results on a given data set. A system that automatically selects the best classification algorithm for a given data set would therefore offer non-experts various benefits, such as saving time and effort. This paper presents a system that predicts the best possible classification algorithm for a given data set with respect to accuracy. To the best of our knowledge, this is the first approach focused on predicting the single best algorithm. The main target users of the proposed system are non-experts who have no knowledge of or experience in data mining. The proposed system uses meta-features selected from existing meta-features to improve the performance of the prediction; this feature selection is conducted by a wrapper approach with a genetic search algorithm. The K-nearest neighbor algorithm then learns the selected meta-features and builds a classification model for predicting future data sets. Experiments on 58 real-world data sets show that, with 60.34% accuracy, the proposed system predicted the best classification algorithm from among the top five of 30 classification algorithms.
Keywords: feature selection, wrapper method, meta-feature, classifier, k-nearest neighbor
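As a rough illustration of the prediction step described in the abstract (not the paper's implementation: the three meta-features and the toy training table below are invented for the example, and the paper's wrapper selection with genetic search is omitted), a k-nearest-neighbor lookup over data-set meta-features can be sketched in Python:

```python
import math

# Hypothetical meta-features: (n_instances, n_attributes, n_classes).
# Each entry maps a previously evaluated data set to the classifier
# that performed best on it. Real meta-learning systems use many more
# meta-features and typically normalize them, since raw instance
# counts would otherwise dominate the distance.
TRAINING_META = [
    ((1000, 20, 2), "RandomForest"),
    ((800, 25, 2), "RandomForest"),
    ((300, 60, 2), "SVM"),
    ((150, 4, 3), "k-NN"),
    ((80, 5, 2), "NaiveBayes"),
]

def _distance(a, b):
    """Euclidean distance between two meta-feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_best_classifier(meta_features, k=3):
    """Return the majority best-classifier label among the k data sets
    whose meta-features are nearest to the given vector."""
    neighbours = sorted(
        TRAINING_META, key=lambda item: _distance(item[0], meta_features)
    )[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

# A new data set with 900 instances, 18 attributes, and 2 classes:
print(predict_best_classifier((900, 18, 2)))  # -> RandomForest
```

The wrapper feature selection mentioned in the abstract would sit in front of this step, searching (here, with a genetic algorithm) for the subset of meta-features that maximizes the accuracy of exactly this kind of k-NN prediction.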