Publication Date



Sparse representation and compressive sensing have attracted substantial interest in computer vision. In this paper, we extend the sparse representation classification (SRC) method, originally designed for individual images, to classify a video that contains a group of local spatio-temporal features, by introducing two new classification criteria. A dictionary is constructed by concatenating all class-specific dictionaries, each of which is learned from one motion class. A test video is assigned the class label that minimizes either the reconstruction errors of its individual local features or the overall reconstruction error. Moreover, we compare the effectiveness of traditional Principal Component Analysis (PCA) with two compressive sensing based dimensionality reduction methods, Random Matrix projection and Hash Matrix projection, in the framework of sparse representation for motion recognition. Experimental results on four public datasets, covering hand gesture, human facial, human action, and mouse behavior data, demonstrate that the proposed method achieves recognition accuracies comparable to or higher than those of other state-of-the-art methods in the literature. Although traditional PCA requires more computation to obtain the transformation matrix, it performs better than the Random Matrix and Hash Matrix projections when gradient features are used. However, when raw features (i.e., pixel values) are used, the performance of the Random Matrix and Hash Matrix projections improves significantly.
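
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the class-specific dictionaries are random unit-norm atoms rather than learned ones, sparse codes come from a small hand-written orthogonal matching pursuit (the abstract does not name a solver), and the two criteria are read as a majority vote over per-feature minimum errors and a minimum of summed errors. All function names (`omp`, `class_errors`, `classify_video`) and parameters are hypothetical.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: code x with at most k atoms of D."""
    residual, support = x.copy(), []
    coef, sol = np.zeros(D.shape[1]), np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:  # no new atom improves the fit
            break
        support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef

def class_errors(D, labels, classes, x, k):
    """Reconstruction error of x using only the atoms of each class."""
    coef = omp(D, x, k)
    return [np.linalg.norm(x - D[:, labels == c] @ coef[labels == c])
            for c in classes]

def classify_video(D, labels, features, k=4):
    """Return (label by per-feature vote, label by overall error)."""
    classes = np.unique(labels)
    errors = np.array([class_errors(D, labels, classes, x, k)
                       for x in features.T])
    votes = np.bincount(errors.argmin(axis=1), minlength=len(classes))
    return classes[votes.argmax()], classes[errors.sum(axis=0).argmin()]

# Toy data (assumed): 3 classes, 10 random unit-norm atoms each, 64-dim features.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 30))
D /= np.linalg.norm(D, axis=0)
labels = np.repeat(np.arange(3), 10)  # concatenated class-specific dictionaries

# A synthetic "video": 20 local features, each a sparse mix of class-1 atoms.
features = np.empty((64, 20))
for i in range(20):
    idx = rng.choice(np.arange(10, 20), size=2, replace=False)
    features[:, i] = (D[:, idx] @ rng.uniform(0.5, 1.5, size=2)
                      + 0.01 * rng.standard_normal(64))

vote_label, overall_label = classify_video(D, labels, features)
print(vote_label, overall_label)
```

In this sketch each local feature is coded once against the full concatenated dictionary, and the per-class error is computed from the coefficients that belong to that class, which mirrors the usual SRC residual rule. A dimensionality reduction step such as PCA or a random projection would be applied to both `D` and `features` before coding.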


Institute for Learning Sciences and Teacher Education

Document Type

Journal Article

Access Rights

ERA Access

Access may be restricted.