Very high-dimensional feature representations, e.g., the Fisher Vector, have recently achieved excellent performance in visual recognition and retrieval. However, these lengthy representations incur heavy computational and storage costs and can become infeasible in some large-scale applications. A few existing techniques can convert very high-dimensional data into binary codes, but they still require relatively long codes to maintain acceptable accuracy. To strike a better balance between computational efficiency and accuracy, we propose a novel embedding method called Binary Projection Bank (BPB), which effectively reduces very high-dimensional representations to medium-dimensional binary codes without sacrificing accuracy. Instead of using a conventional single linear or bilinear projection, the proposed method learns a bank of small projections under a max-margin constraint to preserve the intrinsic data similarity. We systematically evaluate the proposed method on three datasets, Flickr 1M, ILSVRC2010 and UCF101, and show competitive retrieval and recognition accuracy compared with state-of-the-art approaches, with a significantly smaller memory footprint and lower coding complexity.
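As a rough illustration of why a bank of small projections can be cheaper than a single linear projection, the sketch below encodes a vector with one small projection per block of dimensions and compares the parameter counts. It is not the BPB algorithm: the contiguous partition of dimensions, the random Gaussian weights, and the names `make_bank`, `encode` and `hamming` are all assumptions made for illustration, and the max-margin learning step described in the abstract is not shown.

```python
import random

def make_bank(dim, n_bits, seed=0):
    """Build a hypothetical bank: one small random projection per
    contiguous block of dimensions (assumes dim >= n_bits)."""
    rng = random.Random(seed)
    # Block boundaries that split `dim` dimensions into `n_bits` blocks.
    bounds = [round(i * dim / n_bits) for i in range(n_bits + 1)]
    return [
        (bounds[i], [rng.gauss(0.0, 1.0) for _ in range(bounds[i + 1] - bounds[i])])
        for i in range(n_bits)
    ]

def encode(x, bank):
    """Map a real-valued vector to a binary code, one bit per small projection."""
    code = []
    for start, w in bank:
        # Project only the block of x that this small projection covers.
        s = sum(wi * x[start + j] for j, wi in enumerate(w))
        code.append(1 if s >= 0 else 0)
    return code

def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(u != v for u, v in zip(a, b))

# Illustrative sizes only; real Fisher Vectors are far larger.
dim, n_bits = 1024, 64
bank = make_bank(dim, n_bits)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
code = encode(x, bank)

bank_params = sum(len(w) for _, w in bank)  # one weight per input dimension
full_params = dim * n_bits                  # a single dense dim x n_bits projection
print(len(code), bank_params, full_params)
```

Under these assumptions the bank stores `dim` weights in total, whereas a single dense linear projection to the same code length stores `dim * n_bits`, and codes can then be compared with the Hamming distance.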
|Title of host publication||2015 IEEE International Conference on Computer Vision (ICCV)|
|Publication status||Published - 18 Feb 2016|
|Event||2015 IEEE International Conference on Computer Vision (ICCV) - Santiago, Chile|
Duration: 7 Dec 2015 → 13 Dec 2015