Perceiving Systems – Max Planck Institute for Intelligent Systems

Perceiving Systems Conference Paper Semantic Video CNNs through Representation Warping Gadde, R., Jampani, V., Gehler, P. V. In Proceedings IEEE International Conference on Computer Vision (ICCV), 4463-4472, IEEE, Piscataway, NJ, USA, IEEE International Conference on Computer Vision (ICCV), October 2017 (Accepted)

Abstract

In this work, we propose a technique to convert CNN models for semantic segmentation of static images into CNNs for video data. We describe a warping method that can be used to augment existing architectures with very little extra computational cost. This module is called NetWarp and we demonstrate its use for a range of network architectures. The main design principle is to use optical flow of adjacent frames for warping internal network representations across time. A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training. Experiments validate that the proposed approach incurs only little extra computational cost, while improving performance, when video streams are available. We achieve new state-of-the-art results on the standard CamVid and Cityscapes benchmark datasets and show reliable improvements over different baseline networks. Our code and models are available at http://segmentation.is.tue.mpg.de
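The design principle in the abstract, warping an internal representation from the previous frame along optical flow and merging it with the current frame's representation, can be illustrated with a minimal NumPy sketch. This is not the authors' NetWarp code. The function names `warp_features` and `combine`, the flow convention (each current-frame pixel points to its match in the previous frame), the bilinear sampling, and the fixed scalar mixing weights are all assumptions made for illustration.

```python
import numpy as np


def warp_features(prev_feat, flow):
    """Bilinearly warp prev_feat (H, W, C) into the current frame.

    Assumption: flow[y, x] = (dx, dy) points from a current-frame pixel
    to its matching location in the previous frame.
    """
    h, w, _ = prev_feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Sampling positions in the previous frame, clamped to the image.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    # Fractional offsets, broadcast over the channel axis.
    wx = (sx - x0)[..., None]
    wy = (sy - y0)[..., None]
    top = (1 - wx) * prev_feat[y0, x0] + wx * prev_feat[y0, x1]
    bottom = (1 - wx) * prev_feat[y1, x0] + wx * prev_feat[y1, x1]
    return (1 - wy) * top + wy * bottom


def combine(curr_feat, warped_prev, w_curr=0.5, w_prev=0.5):
    """Hypothetical fixed-weight merge of current and warped features."""
    return w_curr * curr_feat + w_prev * warped_prev


# Usage: a 3x4 single-channel "feature map" shifted by one pixel in x.
prev_feat = np.arange(12, dtype=float).reshape(3, 4, 1)
flow = np.zeros((3, 4, 2))
flow[..., 0] = 1.0
warped = warp_features(prev_feat, flow)
merged = combine(prev_feat, warped)
```

In a trainable network the mixing weights would presumably be learned rather than fixed; the constants here only keep the sketch self-contained.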


Perceiving Systems Conference Paper Video Propagation Networks Jampani, V., Gadde, R., Gehler, P. V. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, 3154-3164, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017

Perceiving Systems Conference Paper Video segmentation via object flow Tsai, Y., Yang, M., Black, M. J. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 1426-1434, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016

Abstract

Video object segmentation is challenging due to fast moving objects, deforming shapes, and cluttered backgrounds. Optical flow can be used to propagate an object segmentation over time but, unfortunately, flow is often inaccurate, particularly around object boundaries. Such boundaries are precisely where we want our segmentation to be accurate. To obtain accurate segmentation across time, we propose an efficient algorithm that considers video segmentation and optical flow estimation simultaneously. For video segmentation, we formulate a principled, multiscale, spatio-temporal objective function that uses optical flow to propagate information between frames. For optical flow estimation, particularly at object boundaries, we compute the flow independently in the segmented regions and recompose the results. We call the process object flow and demonstrate the effectiveness of jointly optimizing optical flow and video segmentation using an iterative scheme. Experiments on the SegTrack v2 and Youtube-Objects datasets show that the proposed algorithm performs favorably against the other state-of-the-art methods.
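Two of the steps named in the abstract, recomposing flow that was computed independently per segmented region and using flow to propagate a segmentation to the next frame, can be sketched as follows. This only illustrates the data flow, not the paper's joint, iterative optimization. The function names `recompose_flow` and `propagate_mask`, the two-region (foreground/background) setup, and the nearest-pixel forward propagation are assumptions.

```python
import numpy as np


def recompose_flow(mask, flow_fg, flow_bg):
    """Use the foreground flow inside the mask and the background flow elsewhere.

    mask: (H, W) boolean; flow_fg, flow_bg: (H, W, 2) arrays of (dx, dy).
    """
    return np.where(mask[..., None], flow_fg, flow_bg)


def propagate_mask(mask, flow):
    """Move each foreground pixel along its flow vector (nearest pixel).

    Pixels that land outside the image are dropped.
    """
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    nx = np.rint(xs + flow[ys, xs, 0]).astype(int)
    ny = np.rint(ys + flow[ys, xs, 1]).astype(int)
    keep = (nx >= 0) & (nx < w) & (ny >= 0) & (ny < h)
    out = np.zeros_like(mask)
    out[ny[keep], nx[keep]] = True
    return out


# Usage: one foreground pixel moving one step to the right.
mask = np.zeros((4, 4), dtype=bool)
mask[1, 1] = True
flow_fg = np.zeros((4, 4, 2))
flow_fg[..., 0] = 1.0
flow_bg = np.zeros((4, 4, 2))
flow = recompose_flow(mask, flow_fg, flow_bg)
next_mask = propagate_mask(mask, flow)
```

An iterative scheme in the spirit of the abstract would alternate between updating the mask from the flow and recomputing the flow from the mask; that loop is omitted here.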
