Abstract
In this paper, we present a new approach for human action recognition based on key-pose selection and representation. Poses in video frames are described by the proposed extensive pyramidal features (EPFs), which include the Gabor, Gaussian, and wavelet pyramids. These features are able to encode the orientation, intensity, and contour information and therefore provide an informative representation of human poses. Due to the fact that not all poses in a sequence are discriminative and representative, we further utilize the AdaBoost algorithm to learn a subset of discriminative poses. Given the boosted poses for each video sequence, a new classifier named weighted local naive Bayes nearest neighbor is proposed for the final action classification, which is demonstrated to be more accurate and robust than other classifiers, e.g., support vector machine (SVM) and naive Bayes nearest neighbor. The proposed method is systematically evaluated on the KTH data set, the Weizmann data set, the multiview IXMAS data set, and the challenging HMDB51 data set. Experimental results manifest that our method outperforms the state-of-the-art techniques in terms of recognition rate.
Original language | English |
---|---|
Pages (from-to) | 1860-1870 |
Number of pages | 11 |
Journal | IEEE Transactions on Cybernetics |
Volume | 43 |
Issue number | 6 |
Early online date | 11 Jan 2013 |
DOIs | |
Publication status | Published - 1 Dec 2013 |