Abstract
Automatic vehicle detection and annotation for streaming video data with complex scenes is an interesting but challenging task for intelligent transportation systems. In this paper, we present a fast algorithm, detection and annotation for vehicles (DAVE), which combines vehicle detection and attributes annotation in a unified framework. DAVE consists of two convolutional neural networks: a shallow, fully convolutional fast vehicle proposal network (FVPN) that extracts the positions of all vehicles, and a deep attributes learning network (ALN) that verifies each detection candidate and simultaneously infers each vehicle's pose, color, and type. The two networks are jointly optimized so that the abundant latent knowledge learned by the deep empirical ALN can be exploited to guide the training of the much simpler FVPN. Once the system is trained, DAVE performs efficient vehicle detection and attributes annotation on real-world traffic surveillance data, and the FVPN can also be adopted independently as a real-time, high-performance vehicle detector. We evaluate DAVE on a new self-collected urban traffic surveillance data set and on the public PASCAL VOC2007 car and LISA 2010 data sets, and observe consistent improvements over existing algorithms.
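The two-stage inference flow described in the abstract (propose candidate boxes, then verify and annotate each one) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the box layout, the `threshold` parameter, the attribute labels, and the toy stand-ins for the two networks are all assumptions made for illustration, and the joint training of the two networks is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Box layout (x, y, width, height) in pixels is an assumption, not from the paper.
Box = Tuple[int, int, int, int]


@dataclass
class Annotation:
    box: Box
    score: float        # verification confidence from the second stage
    pose: str
    color: str
    vehicle_type: str


def detect_and_annotate(
    frame,
    propose: Callable[[object], List[Box]],
    verify_and_describe: Callable[[object, Box], Tuple[float, str, str, str]],
    threshold: float = 0.5,   # hypothetical cut-off for accepting a candidate
) -> List[Annotation]:
    """Pass every proposed box through verification and keep the accepted ones."""
    annotations = []
    for box in propose(frame):
        score, pose, color, vehicle_type = verify_and_describe(frame, box)
        if score >= threshold:
            annotations.append(Annotation(box, score, pose, color, vehicle_type))
    return annotations


# Toy stand-ins for the two trained networks (hypothetical, illustration only).
def toy_fvpn(frame) -> List[Box]:
    # A proposal stage would return candidate vehicle positions for the frame.
    return [(10, 20, 64, 48), (200, 40, 30, 30)]


def toy_aln(frame, box: Box) -> Tuple[float, str, str, str]:
    # A verification stage would score the candidate and predict its attributes.
    if box[2] >= 60:
        return 0.92, "front", "white", "sedan"
    return 0.12, "side", "black", "van"


results = detect_and_annotate(frame=None, propose=toy_fvpn, verify_and_describe=toy_aln)
for item in results:
    print(item)
```

With these toy stand-ins, only the first candidate passes verification. Passing only `toy_fvpn` output downstream, without the second stage, corresponds to using the proposal network as a standalone detector.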
Original language | English |
---|---|
Pages (from-to) | 1973-1984 |
Number of pages | 12 |
Journal | IEEE Transactions on Intelligent Transportation Systems |
Volume | 19 |
Issue number | 6 |
Early online date | 24 Oct 2017 |
Publication status | Published - Jun 2018 |
Keywords
- Vehicle detection
- attributes annotation
- latent knowledge guidance
- joint learning
- deep networks