
Publications

Perceiving Systems Conference Paper Joint Graph Decomposition and Node Labeling by Local Search Levinkov, E., Uhrig, J., Tang, S., Omran, M., Insafutdinov, E., Kirillov, A., Rother, C., Brox, T., Schiele, B., Andres, B. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1904-1912, IEEE, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), July 2017
PDF Supplementary DOI BibTeX

Autonomous Vision Perceiving Systems Conference Paper OctNet: Learning Deep 3D Representations at High Resolutions Riegler, G., Ulusoy, O., Geiger, A. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, 6620-6629, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017
We present OctNet, a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks that are both deep and high-resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees in which each leaf node stores a pooled feature representation. This allows us to focus memory allocation and computation on the relevant dense regions and enables deeper networks without compromising resolution. We demonstrate the utility of our OctNet representation by analyzing the impact of resolution on several 3D tasks, including 3D object classification, orientation estimation and point cloud labeling.
PDF Supplementary Project Page Video BibTeX
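The core idea, spending resolution only where the 3D input is occupied, can be illustrated with a toy unbalanced octree. The sketch below is a minimal NumPy illustration with hypothetical names, not the paper's implementation: mixed regions are subdivided recursively, while homogeneous regions collapse into a single leaf that stores a pooled feature.

```python
import numpy as np

def build_octree(grid, depth, max_depth):
    """Recursively partition a cubic feature volume.

    Mixed (partially occupied) regions are subdivided further; empty,
    full, or maximally deep regions collapse into one leaf storing a
    pooled (mean) feature. Illustrative sketch of the unbalanced-octree
    idea, not OctNet's actual code.
    """
    # grid: (N, N, N) occupancy volume with N a power of two
    if depth == max_depth or not grid.any() or grid.all():
        # Leaf: one pooled value instead of N^3 individual cells.
        return {"leaf": True, "feature": float(grid.mean())}
    n = grid.shape[0] // 2
    children = []
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                sub = grid[dx * n:(dx + 1) * n,
                           dy * n:(dy + 1) * n,
                           dz * n:(dz + 1) * n]
                children.append(build_octree(sub, depth + 1, max_depth))
    return {"leaf": False, "children": children}

def count_leaves(node):
    """Number of leaves, i.e. stored feature cells, in the tree."""
    if node["leaf"]:
        return 1
    return sum(count_leaves(c) for c in node["children"])

# A mostly empty 8x8x8 volume with one occupied corner: the tree
# spends resolution only near the occupied region.
vol = np.zeros((8, 8, 8))
vol[:2, :2, :2] = 1.0
tree = build_octree(vol, 0, 3)
```

Here the 512-cell volume is represented by only a handful of leaves, which is what lets memory and computation concentrate on the dense regions.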

Probabilistic Numerics Perceiving Systems Article Early Stopping Without a Validation Set Mahsereci, M., Balles, L., Lassner, C., Hennig, P. arXiv preprint arXiv:1703.09580, 2017
Early stopping is a widely used technique to prevent poor generalization performance when training an over-expressive model by means of gradient-based optimization. To find a good point to halt the optimizer, a common practice is to split the dataset into a training and a smaller validation set to obtain an ongoing estimate of the generalization performance. In this paper we propose a novel early stopping criterion which is based on fast-to-compute, local statistics of the computed gradients and entirely removes the need for a held-out validation set. Our experiments show that this is a viable approach in the setting of least-squares and logistic regression as well as neural networks.
URL BibTeX
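As a rough illustration of a validation-free stopping signal, one can compare the squared mean of per-sample gradients against their empirical variance and flag stopping once noise dominates signal. This is a simplified sketch inspired by the idea of local gradient statistics; the statistic and the threshold are illustrative assumptions, not the paper's actual criterion.

```python
import numpy as np

def gradient_snr_stop(batch_grads, threshold=1.0):
    """Toy validation-free stopping signal (illustrative assumption,
    not the paper's derivation).

    batch_grads: (batch, dims) array of per-sample gradients.
    Returns True once the gradient estimate's signal-to-noise ratio
    (squared mean vs. variance across the batch) drops below the
    threshold, i.e. the mini-batch gradient is mostly noise.
    """
    g = np.asarray(batch_grads, dtype=float)
    mean_sq = np.mean(g, axis=0) ** 2      # squared mean gradient per dim
    var = np.var(g, axis=0, ddof=1)        # per-dim variance across batch
    snr = np.sum(mean_sq) / (np.sum(var) + 1e-12)
    return bool(snr < threshold)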

Perceiving Systems Conference Paper Learning to Filter Object Detections Prokudin, S., Kappler, D., Nowozin, S., Gehler, P. In Pattern Recognition: 39th German Conference, GCPR 2017, Basel, Switzerland, September 12–15, 2017, Proceedings, 52-62, Springer International Publishing, Cham, 2017
Most object detection systems consist of three stages. First, a set of individual hypotheses for object locations is generated using a proposal generating algorithm. Second, a classifier scores every generated hypothesis independently to obtain a multi-class prediction. Finally, all scored hypotheses are filtered via a non-differentiable and decoupled non-maximum suppression (NMS) post-processing step. In this paper, we propose a filtering network (FNet), a method which replaces NMS with a differentiable neural network that allows joint reasoning and re-scoring of the generated set of hypotheses per image. This formulation enables end-to-end training of the full object detection pipeline. First, we demonstrate that FNet, a feed-forward network architecture, is able to mimic NMS decisions, despite the sequential nature of NMS. We further analyze NMS failures and propose a loss formulation that is better aligned with the mean average precision (mAP) evaluation metric. We evaluate FNet on several standard detection datasets. Results surpass standard NMS on highly occluded settings of a synthetic overlapping MNIST dataset and show competitive behavior on PascalVOC2007 and KITTI detection benchmarks.
Paper DOI URL BibTeX
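For context, the classic greedy NMS step that FNet replaces can be sketched in a few lines. This is the textbook sequential algorithm, not the FNet architecture itself.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: repeatedly keep the
    highest-scoring remaining box and drop all boxes overlapping it
    beyond iou_thresh. Sequential and non-differentiable, which is
    exactly why an end-to-end trainable replacement is attractive."""
    order = np.argsort(scores)[::-1]   # indices, best score first
    keep = []
    while len(order) > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        mask = np.array([iou(boxes[best], boxes[i]) < iou_thresh
                         for i in rest], dtype=bool)
        order = rest[mask]
    return keep
```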

Perceiving Systems Autonomous Motion Conference Paper Barrista - Caffe Well-Served Lassner, C., Kappler, D., Kiefel, M., Gehler, P. In ACM Multimedia Open Source Software Competition, ACM OSSC16, October 2016 (Published)
The caffe framework is one of the leading deep learning toolboxes in the machine learning and computer vision community. While it offers efficiency and configurability, it falls short of a full interface to Python. With increasingly involved procedures for training deep networks that reach depths of hundreds of layers, creating configuration files and keeping them consistent becomes an error-prone process. We introduce the barrista framework, offering full, pythonic control over caffe. It separates responsibilities and offers code to solve frequently occurring tasks in pre-processing, training and model inspection. It is compatible with all caffe versions since mid-2015 and can import and export .prototxt files. Examples are included, e.g., a deep residual network implemented in only 172 lines (for arbitrary depths), compared to 2320 lines in the official implementation of the equivalent model.
PDF DOI URL BibTeX

Perceiving Systems Conference Paper Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks Jampani, V., Kiefel, M., Gehler, P. V. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 4452-4461, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016
Bilateral filters are in widespread use due to their edge-preserving properties. The common use case is to manually choose a parametric filter type, usually a Gaussian filter. In this paper, we generalize the parametrization and in particular derive a gradient descent algorithm so that the filter parameters can be learned from data. This derivation allows us to learn high-dimensional linear filters that operate in sparsely populated feature spaces. We build on the permutohedral lattice construction for efficient filtering. The ability to learn more general forms of high-dimensional filters can be used in several diverse applications. First, we demonstrate its use in applications where single filter applications are desired for runtime reasons. Further, we show how this algorithm can be used to learn the pairwise potentials in densely connected conditional random fields and apply these to different image segmentation tasks. Finally, we introduce layers of bilateral filters in CNNs and propose bilateral neural networks for processing high-dimensional sparse data. This view provides new ways to encode model structure into network architectures. A diverse set of experiments empirically validates the usage of general forms of filters.
Code CVF Open-Access PDF Supplementary Poster BibTeX
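The fixed Gaussian bilateral filter that is the paper's starting point can be sketched in one dimension; the learned, permutohedral-lattice version generalizes exactly this hand-chosen parametrization. The sketch below is illustrative only, and the sigma values are arbitrary assumptions.

```python
import numpy as np

def bilateral_filter_1d(signal, sigma_s=2.0, sigma_r=0.2, radius=4):
    """Naive Gaussian bilateral filter on a 1-D signal.

    Each output sample is a weighted average of its neighborhood,
    where weights fall off with both spatial distance (sigma_s) and
    intensity difference (sigma_r), so smoothing stops at edges.
    Illustrative sketch of the classic fixed filter, not the paper's
    learned method.
    """
    signal = np.asarray(signal, dtype=float)
    out = np.empty_like(signal)
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        idx = np.arange(lo, hi)
        w_space = np.exp(-((idx - i) ** 2) / (2 * sigma_s ** 2))
        w_range = np.exp(-((signal[idx] - signal[i]) ** 2) / (2 * sigma_r ** 2))
        w = w_space * w_range
        out[i] = np.sum(w * signal[idx]) / np.sum(w)
    return out
```

Applied to a step edge, the filter smooths each flat region while leaving the discontinuity intact, which is the edge-preserving behavior the abstract refers to.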