
Principal component pyramids using image blurring for nonlinearity reduction in hand shape recognition

by Mohamed Farouk Kheir Eldin




Institution: Dublin City University
Department: School of Computing
Year: 2015
Keywords: Image processing; Computer vision; Shape recognition; Principal component analysis; Multi-stage hierarchy; Hand-shapes
Record ID: 1180048
Full text PDF: http://doras.dcu.ie/20432/


Abstract

The thesis presents four algorithms that use a multistage hierarchical strategy for hand shape recognition. The proposed hierarchy analyzes new patterns by projecting them into the different levels of a data pyramid, which consists of different principal component spaces. Image blurring is used to reduce the nonlinearity in the manifolds generated by a set of example images. Flattening the space helps classify different hand shapes more accurately. The four algorithms use different pattern recognition techniques. The first measures the perpendicular distance between a new pattern and the nearest manifold. The second uses supervised multidimensional grids. The third uses unsupervised multidimensional grids to cluster the space into cells of similar objects. The fourth trains a set of multi-layer neural networks with simple architectures at the different levels of the pyramid to map new patterns to the closest class. The proposed algorithms are example-based approaches in which a large set of computer-generated images is used to densely sample the space. Experimental results examine the accuracy and performance of the proposed algorithms and the effect of image blurring on reducing the nonlinearity in manifolds, and they are compared with an exhaustive search. The results show that the proposed algorithms are suitable for real-time applications with high accuracy. They can achieve frame rates of more than 10 frames per second and accuracies of up to 98% on test data.
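
As a rough illustration of the pyramid idea described in the abstract, the following Python sketch builds one principal component space per blur level and searches the levels from most blurred to least blurred. It is not the thesis's implementation. The box blur, the blur radii, the nearest-neighbour candidate pruning (which stands in for the four proposed algorithms), and all function and parameter names (`box_blur`, `fit_pca`, `build_pyramid`, `classify`, `radii`, `keep`) are assumptions made for the example.

```python
import numpy as np


def box_blur(img, radius):
    """Blur a 2-D image with a separable box filter (edges are replicated)."""
    img = np.asarray(img, dtype=float)
    if radius == 0:
        return img
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    # Filter along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def fit_pca(X, n_components):
    """Return the mean and the top principal directions of the rows of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def build_pyramid(images, radii, n_components):
    """One PCA space per blur radius (assumed order: most blurred first)."""
    levels = []
    for r in radii:
        X = np.stack([box_blur(im, r).ravel() for im in images])
        mean, basis = fit_pca(X, n_components)
        levels.append({
            "radius": r,
            "mean": mean,
            "basis": basis,
            "coords": (X - mean) @ basis.T,  # training images in this space
        })
    return levels


def classify(image, levels, labels, keep=(20, 5, 1)):
    """Coarse-to-fine search: prune candidate examples level by level.

    Assumption: plain nearest-neighbour pruning replaces the thesis's
    perpendicular-distance, grid, and neural-network classifiers.
    """
    candidates = np.arange(len(labels))
    for level, k in zip(levels, keep):
        blurred = box_blur(image, level["radius"]).ravel()
        x = (blurred - level["mean"]) @ level["basis"].T
        dists = np.linalg.norm(level["coords"][candidates] - x, axis=1)
        candidates = candidates[np.argsort(dists)[:k]]
    return labels[candidates[0]]
```

In this sketch, `labels` is expected to be a NumPy array aligned with `images`, and `keep` gives the number of candidate examples retained after each level. Pruning at the blurred levels first is one way to avoid comparing a new pattern against every example, which is the exhaustive search the abstract uses as its baseline.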