Abstract
We present a new method for self-supervised monocular depth estimation. Contemporary monocular depth estimation methods use a triplet of consecutive video frames to estimate the depth image of the central frame. We assume that the ego-centric view progresses linearly through the scene, based on the kinematic and physical properties of the camera. During training, we exploit this assumption to produce a depth estimate for each image in the triplet. We then apply a new geometry constraint that supports novel synthetic views, thus providing a strong supervisory signal. Our contribution is simple to implement, requires no additional trainable parameters, and produces results competitive with other state-of-the-art methods on the popular KITTI corpus.
Original language | English |
---|---|
Title of host publication | CVMP '20: European Conference on Visual Media Production |
Editors | Stephen N. Spencer |
Publisher | Association for Computing Machinery (ACM) |
Pages | 1-8 |
Number of pages | 8 |
ISBN (Electronic) | 9781450381987 |
Publication status | Published - 7 Dec 2020 |
Keywords
- Deep Learning
- Monocular Depth Estimation
- Self-supervised Learning