Abstract
Accurately clustering Internet-scale users into multiple communities according to their aesthetic styles is a useful technique in image modeling and data mining. In this work, we present a novel partially-supervised model that seeks a sparse representation to capture photo aesthetics. It optimally fuses multi-channel features, i.e., human gaze behavior, quality scores, and semantic tags, each of which could be absent. Afterward, by leveraging the KL-divergence to distinguish the aesthetic distributions between photo sets, a large-scale graph is constructed to describe the aesthetic correlations between users. Finally, a dense subgraph mining algorithm that intrinsically supports outliers (i.e., unique users not belonging to any community) is adopted to detect aesthetic communities. Comprehensive experimental results on a million-scale image set crawled from Flickr demonstrate the superiority of our method. As a byproduct, the discovered aesthetic communities can substantially enhance photo retargeting and video summarization.
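The graph-construction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each user's photo set has already been summarized as a discrete aesthetic distribution (a histogram), and it turns pairwise KL-divergence into edge weights. The symmetrized divergence, the exponential kernel, the `sigma` parameter, and the function names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Symmetrized KL divergence between two discrete distributions.

    The symmetrization and the eps smoothing are assumptions made for
    this sketch; the paper's exact formulation may differ.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def aesthetic_graph(user_hists, sigma=1.0):
    """Affinity matrix with W[i, j] = exp(-D_ij / sigma) and a zero diagonal."""
    n = len(user_hists)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = symmetric_kl(user_hists[i], user_hists[j])
            W[i, j] = W[j, i] = np.exp(-d / sigma)
    return W

# Hypothetical per-user aesthetic histograms (toy data, not from the paper).
hists = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
]
W = aesthetic_graph(hists)
```

In this toy example the first two users have similar histograms, so their edge weight is larger than either user's weight to the third. A dense subgraph mining step would then operate on a matrix like `W`. The all-pairs loop is quadratic in the number of users and is only meant to show the idea at small scale.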
Original language | English |
---|---|
Pages (from-to) | 3462-3476 |
Number of pages | 15 |
Journal | IEEE Transactions on Image Processing |
Volume | 28 |
Issue number | 7 |
Early online date | 6 Feb 2019 |
Publication status | Published - Jul 2019 |
Keywords
- Aesthetic community
- Clustering algorithms
- Computational modeling
- Feature extraction
- Flickr
- Gaze behavior
- Graph mining
- Machine learning
- Multimodal
- Partially-supervised
- Semantics
- Training
- Visualization