Online unsupervised video object segmentation via contrastive motion clustering

Lin Xi, Weihai Chen, Xingming Wu, Zhong Liu, Zhengguo Li

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)
6 Downloads (Pure)

Abstract

Online unsupervised video object segmentation (UVOS) uses the previous frames as its input to automatically separate the primary object(s) from a streaming video without using any further manual annotation. A major challenge is that the model has no access to the future and must rely solely on the history, i.e., the segmentation mask is predicted from the current frame as soon as it is captured. In this work, a novel contrastive motion clustering algorithm with optical flow as its input is proposed for online UVOS by exploiting the common fate principle that visual elements tend to be perceived as a group if they possess the same motion pattern. We build a simple and effective auto-encoder to iteratively summarize non-learnable prototypical bases for the motion pattern, while the bases in turn help learn the representation of the embedding network. Further, a contrastive learning strategy based on a boundary prior is developed to improve foreground and background feature discrimination in the representation learning stage. The proposed algorithm can be optimized on data of arbitrary scale (i.e., a frame, a clip, or a dataset) and run in an online fashion. Experiments on the DAVIS 16, FBMS, and SegTrackV2 datasets show that the accuracy of our method surpasses the previous state-of-the-art (SoTA) online UVOS method by a margin of 0.8%, 2.9%, and 1.1%, respectively. Furthermore, by using online deep subspace clustering to tackle the motion grouping, our method achieves higher accuracy with 3× faster inference than the SoTA online UVOS method, striking a good trade-off between effectiveness and efficiency. Our code is available at https://github.com/xilin1991/CluterNet.
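The common fate idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper learns embeddings with an auto-encoder and a contrastive loss, whereas the sketch below clusters raw optical-flow vectors directly with a soft k-means style update. The function name `cluster_motion`, its parameters, and the use of frame-border pixels to pick the background cluster are all assumptions made for illustration.

```python
import numpy as np


def cluster_motion(flow, num_iters=10, temperature=0.1):
    """Toy motion grouping (hypothetical helper, not from the paper).

    flow: (H, W, 2) array of per-pixel optical-flow vectors.
    Returns an (H, W) boolean mask that is True for foreground pixels.
    """
    h, w, _ = flow.shape
    feats = flow.reshape(-1, 2).astype(np.float64)

    # Assumed boundary prior: pixels on the frame border are mostly background.
    border = np.zeros((h, w), dtype=bool)
    border[0, :] = border[-1, :] = True
    border[:, 0] = border[:, -1] = True
    border = border.ravel()

    # Two prototypical bases: mean border motion, and the motion farthest from it.
    bg = feats[border].mean(axis=0)
    fg = feats[np.argmax(np.linalg.norm(feats - bg, axis=1))]
    bases = np.stack([bg, fg])

    for _ in range(num_iters):
        # Soft-assign every pixel to the bases by squared distance.
        d2 = ((feats[:, None, :] - bases[None, :, :]) ** 2).sum(axis=2)
        logits = -d2 / temperature
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)
        # Re-summarize each base as the responsibility-weighted mean motion.
        bases = (resp.T @ feats) / (resp.sum(axis=0)[:, None] + 1e-8)

    # Hard assignment; the cluster that dominates the border is called background.
    d2 = ((feats[:, None, :] - bases[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    bg_label = np.bincount(labels[border], minlength=2).argmax()
    return (labels != bg_label).reshape(h, w)
```

Because the sketch needs only the flow of the current frame, it can be applied frame by frame as a stream arrives, which loosely mirrors the online setting described above.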

Original language: English
Pages (from-to): 995-1006
Number of pages: 12
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 34
Issue number: 2
Early online date: 23 Jun 2023
Publication status: Published - 6 Feb 2024

Keywords

  • clustering methods
  • image motion analysis
  • object segmentation
  • optical flow
  • self-supervised learning
  • unsupervised learning
