The growing volume of video data in databases, on the Internet and elsewhere calls for new methods of managing these data, which has led to the development of content-based video retrieval systems. We explore several recently developed action representation and information retrieval techniques in a human action retrieval system. These techniques include various means of local feature extraction; soft-assignment clustering; Bag-of-Words, vocabulary-guided and spatio-temporal pyramid matches for action representation; and SVMs and ABRS-SVMs for relevance feedback. Successful application of relevance feedback in particular would make such systems far more practical. We evaluate the performance of several combinations of the above techniques on three realistic action datasets: UCF Sports, UCF YouTube and HOHA2.
- Content-based video retrieval
- Relevance feedback
- Human action recognition