TY - JOUR
T1 - Community-aware photo quality evaluation by deeply encoding human perception
AU - Wang, Zepeng
AU - Li, Ping
AU - Zhang, Luming
AU - Shao, Ling
PY - 2019/1/1
Y1 - 2019/1/1
N2 - Computational photo quality evaluation is a useful technique in many tasks of computer vision and graphics, e.g., photo retargeting, 3D rendering, and fashion recommendation. Conventional photo quality models are designed by characterizing pictures from all communities (e.g., "architecture" and "colorful") indiscriminately, wherein community-specific features are not encoded explicitly. In this work, we develop a new community-aware photo quality evaluation framework. It uncovers the latent community-specific topics by a regularized latent topic model (LTM), and captures human visual quality perception by exploring multiple attributes. More specifically, given massive-scale online photos from multiple communities, a novel ranking algorithm is proposed to measure the visual/semantic attractiveness of regions inside each photo. Meanwhile, three attributes (photo quality scores, weak semantic tags, and inter-region correlations) are seamlessly and collaboratively incorporated during ranking. Subsequently, we construct a gaze shifting path (GSP) for each photo by sequentially linking its top-ranking regions, and an aggregation-based deep CNN calculates the deep representation for each GSP. Based on this, an LTM is proposed to model the GSP distribution from multiple communities in the latent space. To mitigate the overfitting problem caused by communities with very few photos, a regularizer is added to our LTM. Finally, given a test photo, we obtain its deep GSP representation, and its quality score is determined by the posterior probability of the regularized LTM. Comprehensive comparative studies on four image sets have shown the competitiveness of our method. In addition, eye-tracking experiments demonstrated that our ranking-based GSPs are highly consistent with real human gaze movements.
AB - Computational photo quality evaluation is a useful technique in many tasks of computer vision and graphics, e.g., photo retargeting, 3D rendering, and fashion recommendation. Conventional photo quality models are designed by characterizing pictures from all communities (e.g., "architecture" and "colorful") indiscriminately, wherein community-specific features are not encoded explicitly. In this work, we develop a new community-aware photo quality evaluation framework. It uncovers the latent community-specific topics by a regularized latent topic model (LTM), and captures human visual quality perception by exploring multiple attributes. More specifically, given massive-scale online photos from multiple communities, a novel ranking algorithm is proposed to measure the visual/semantic attractiveness of regions inside each photo. Meanwhile, three attributes (photo quality scores, weak semantic tags, and inter-region correlations) are seamlessly and collaboratively incorporated during ranking. Subsequently, we construct a gaze shifting path (GSP) for each photo by sequentially linking its top-ranking regions, and an aggregation-based deep CNN calculates the deep representation for each GSP. Based on this, an LTM is proposed to model the GSP distribution from multiple communities in the latent space. To mitigate the overfitting problem caused by communities with very few photos, a regularizer is added to our LTM. Finally, given a test photo, we obtain its deep GSP representation, and its quality score is determined by the posterior probability of the regularized LTM. Comprehensive comparative studies on four image sets have shown the competitiveness of our method. In addition, eye-tracking experiments demonstrated that our ranking-based GSPs are highly consistent with real human gaze movements.
KW - Community
KW - Deep feature
KW - Gaze behavior
KW - Machine learning
KW - Quality model
KW - Topic model
UR - http://www.scopus.com/inward/record.url?scp=85082027086&partnerID=8YFLogxK
U2 - 10.1109/TCYB.2019.2937319
DO - 10.1109/TCYB.2019.2937319
M3 - Article
AN - SCOPUS:85082027086
SP - 1
EP - 11
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
SN - 1520-9210
ER -