TY - GEN
T1 - Self-distillation and uncertainty boosting self-supervised monocular depth estimation
AU - Zhou, Hang
AU - Greenwood, David
AU - Taylor, Sarah
AU - Mackiewicz, Michal
N1 - © 2022. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic form.
PY - 2022/11
Y1 - 2022/11
N2 - For self-supervised monocular depth estimation (SDE), recent works have introduced additional learning objectives, for example semantic segmentation, into the training pipeline and have demonstrated improved performance. However, such multi-task learning frameworks require extra ground truth labels, neutralising the biggest advantage of self-supervision. In this paper, we propose SUB-Depth to overcome these limitations. Our main contribution is that we design an auxiliary self-distillation scheme and incorporate it into the standard SDE framework, to take advantage of multi-task learning without labelling cost. Then, instead of using a simple weighted sum of the multiple objectives, we employ generative task-dependent uncertainty to weight each task in our proposed training framework. We present extensive evaluations on KITTI to demonstrate the improvements achieved by training a range of existing networks using the proposed framework, and we achieve state-of-the-art performance on this task.
AB - For self-supervised monocular depth estimation (SDE), recent works have introduced additional learning objectives, for example semantic segmentation, into the training pipeline and have demonstrated improved performance. However, such multi-task learning frameworks require extra ground truth labels, neutralising the biggest advantage of self-supervision. In this paper, we propose SUB-Depth to overcome these limitations. Our main contribution is that we design an auxiliary self-distillation scheme and incorporate it into the standard SDE framework, to take advantage of multi-task learning without labelling cost. Then, instead of using a simple weighted sum of the multiple objectives, we employ generative task-dependent uncertainty to weight each task in our proposed training framework. We present extensive evaluations on KITTI to demonstrate the improvements achieved by training a range of existing networks using the proposed framework, and we achieve state-of-the-art performance on this task.
M3 - Conference contribution
BT - The 33rd British Machine Vision Conference Proceedings
ER -