Vol. 12 No. 1 (2016)

Published: June 30, 2016

Pages: 96-102

Original Article

Classification Algorithms for Determining Handwritten Digit

Abstract

Data-intensive science is a critical scientific paradigm that intersects with all other sciences. Data mining (DM) is a powerful and useful technology with a wide range of potential users; it focuses on important, meaningful patterns and discovers new knowledge from a collected dataset. Any predictive task in DM uses a set of attributes to classify an instance whose class is unknown. Classification algorithms are a prominent class of mathematical techniques in DM, and constructing a model is their core aspect. However, their performance depends heavily on how each algorithm behaves when manipulating the data. Focusing on binarization as a preprocessing approach, this paper analyzes and evaluates different classification algorithms on the basis of the accuracy of the models they construct for the classification task. The Modified National Institute of Standards and Technology (MNIST) handwritten digits dataset provided by Yann LeCun was used in the evaluation. The paper focuses on machine learning approaches to handwritten digit detection. Machine learning provides classification methods such as K-Nearest Neighbor (KNN), Decision Tree (DT), and Neural Networks (NN). The results showed that the knowledge-based method, i.e., the NN algorithm, is more accurate in determining the digits, as it reduces the error rate. This evaluation provides essential insights for computer scientists and practitioners choosing a suitable DM technique that fits their data.
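To make the pipeline summarized in the abstract concrete, the following is a minimal sketch of binarization followed by K-Nearest Neighbor classification and an error-rate measurement. It is not the authors' implementation: the threshold of 128, the Hamming distance, the value k = 3, and the toy 3x3 "images" standing in for MNIST digits are all assumptions chosen for illustration.

```python
from collections import Counter


def binarize(image, threshold=128):
    """Map each grayscale pixel (0-255) to 1 if it is >= threshold, else 0.

    The threshold value is an assumption; the abstract does not state one.
    """
    return [1 if pixel >= threshold else 0 for pixel in image]


def hamming(a, b):
    """Count the positions where two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def error_rate(train, test, k=3):
    """Fraction of test samples whose predicted label differs from the true label."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in test)
    return wrong / len(test)


# Hypothetical toy data: flattened 3x3 grayscale images.
# A vertical bar stands in for the digit 1, a ring for the digit 0.
raw_train = [
    ([0, 255, 0, 0, 250, 0, 0, 240, 0], 1),
    ([10, 200, 5, 0, 220, 0, 20, 210, 0], 1),
    ([30, 180, 0, 0, 190, 10, 0, 230, 0], 1),
    ([255, 255, 255, 255, 0, 255, 255, 255, 255], 0),
    ([200, 210, 220, 230, 10, 240, 250, 200, 210], 0),
    ([190, 250, 180, 220, 30, 200, 210, 240, 230], 0),
]
raw_test = [
    ([5, 240, 0, 0, 200, 0, 0, 215, 20], 1),
    ([220, 230, 210, 240, 0, 250, 200, 220, 230], 0),
]

# Binarize first, then classify on the binary vectors.
train = [(binarize(img), label) for img, label in raw_train]
test = [(binarize(img), label) for img, label in raw_test]

print("error rate:", error_rate(train, test, k=3))
```

On real MNIST data the same steps would apply to 784-pixel vectors, and the Decision Tree and Neural Network classifiers named in the abstract could be compared using the same error-rate function.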
