Sufficient training examples are the fundamental requirement for most of the learning tasks. However, collecting welllabelled training examples is costly. Inspired by Zero-shot Learning (ZSL) that can make use of visual attributes or natural language semantics as an intermediate level clue to associate low-level features with high-level classes, in a novel extension of this idea, we aim to synthesise training data for novel classes using only semantic attributes. Despite the simplicity of this idea, there are several challenges. Firstly, how to prevent the synthesised data from over-fitting to training classes Secondly, how to guarantee the synthesised data is discriminative for ZSL tasks Thirdly, we observe that only a few dimensions of the learnt features gain high variances whereas most of the remaining dimensions are not informative. Thus, the question is how to make the concentrated information diffuse to most of the dimensions of synthesised data. To address the above issues, we propose a novel embedding algorithm named Unseen Visual Data Synthesis (UVDS) that projects semantic features to the high-dimensional visual feature space. Two main techniques are introduced in our proposed algorithm.
|Journal||IEEE Transactions on Pattern Analysis and Machine Intelligence|
|Early online date||12 Oct 2017|
|Publication status||Published - 1 Oct 2018|