

2019


Resisting Adversarial Attacks using Gaussian Mixture Variational Autoencoders

Ghosh, P., Losalka, A., Black, M. J.

In Proc. AAAI, 2019 (inproceedings)

Abstract
Susceptibility of deep neural networks to adversarial attacks poses a major theoretical and practical challenge. All efforts to harden classifiers against such attacks have seen limited success until now. Two distinct categories of samples against which deep neural networks are vulnerable, "adversarial samples" and "fooling samples", have been tackled separately so far due to the difficulty posed when considered together. In this work, we show how one can defend against them both under a unified framework. Our model has the form of a variational autoencoder with a Gaussian mixture prior on the latent variable, such that each mixture component corresponds to a single class. We show how selective classification can be performed using this model, thereby causing the adversarial objective to entail a conflict. The proposed method leads to the rejection of adversarial samples instead of misclassification, while maintaining high precision and recall on test data. It also inherently provides a way of learning a selective classifier in a semi-supervised scenario, which can similarly resist adversarial attacks. We further show how one can reclassify the detected adversarial samples by iterative optimization.

ps

link (url) [BibTex]

2018


Enhancing the Accuracy and Fairness of Human Decision Making

Valera, I., Singla, A., Gomez Rodriguez, M.

32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

arXiv [BibTex]

Boosting Black Box Variational Inference

Locatello*, F., Dresdner*, G., R., K., Valera, I., Rätsch, G.

32nd Annual Conference on Neural Information Processing Systems, December 2018, *equal contribution (conference) Accepted

ei

arXiv [BibTex]

When do random forests fail?

Tang, C., Garreau, D., von Luxburg, U.

In Proceedings of Neural Information Processing Systems (NIPS 2018), December 2018 (inproceedings)

slt

Project Page [BibTex]

Consolidating the Meta-Learning Zoo: A Unifying Perspective as Posterior Predictive Inference

Gordon*, J., Bronskill*, J., Bauer*, M., Nowozin, S., Turner, R. E.

Workshop on Meta-Learning (MetaLearn 2018) at the 32nd Conference on Neural Information Processing Systems, December 2018, *equal contribution (conference) Accepted

ei

[BibTex]

Versa: Versatile and Efficient Few-shot Learning

Gordon*, J., Bronskill*, J., Bauer*, M., Nowozin, S., Turner, R. E.

Third Workshop on Bayesian Deep Learning at the 32nd Conference on Neural Information Processing Systems, December 2018, *equal contribution (conference) Accepted

ei

[BibTex]

Deep Reinforcement Learning for Event-Triggered Control

Baumann, D., Zhu, J., Martius, G., Trimpe, S.

In Proceedings of the 57th IEEE International Conference on Decision and Control (CDC), Miami, FL, USA, December 2018 (inproceedings) Accepted

al ics

arXiv PDF Project Page Project Page [BibTex]

Learning Invariances using the Marginal Likelihood

van der Wilk, M., Bauer, M., John, S. T., Hensman, J.

32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

[BibTex]

Deep Nonlinear Non-Gaussian Filtering for Dynamical Systems

Mehrjou, A., Schölkopf, B.

Workshop: Infer to Control: Probabilistic Reinforcement Learning and Structured Control at the 32nd Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

PDF [BibTex]

Resampled Priors for Variational Autoencoders

Bauer, M., Mnih, A.

Third Workshop on Bayesian Deep Learning at the 32nd Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

[BibTex]

Generalisation in humans and deep neural networks

Geirhos, R., Temme, C. R. M., Rauber, J., Schütt, H., Bethge, M., Wichmann, F. A.

32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

[BibTex]

Assessing Generative Models via Precision and Recall

Sajjadi, M. S. M., Bachem, O., Lucic, M., Bousquet, O., Gelly, S.

32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

arXiv [BibTex]

Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models

Neitz, A., Parascandolo, G., Bauer, S., Schölkopf, B.

32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

arXiv [BibTex]

Learning Gaussian Processes by Minimizing PAC-Bayesian Generalization Bounds

Reeb, D., Doerr, A., Gerwinn, S., Rakitsch, B.

In Proceedings of Neural Information Processing Systems (NIPS), December 2018 (inproceedings)

Abstract
Gaussian Processes (GPs) are a generic modelling tool for supervised learning. While they have been successfully applied on large datasets, their use in safety critical applications is hindered by the lack of good performance guarantees. To this end, we propose a method to learn GPs and their sparse approximations by directly optimizing a PAC-Bayesian bound on their generalization performance, instead of maximizing the marginal likelihood. Besides its theoretical appeal, we find in our evaluation that our learning method is robust and yields significantly better generalization guarantees than other common GP approaches on several regression benchmark datasets.

ics

[BibTex]

A Computational Camera with Programmable Optics for Snapshot High Resolution Multispectral Imaging

Chen, J., Hirsch, M., Eberhardt, B., Lensch, H. P. A.

Computer Vision - ACCV 2018 - 14th Asian Conference on Computer Vision, December 2018 (conference) Accepted

ei

[BibTex]

Efficient Encoding of Dynamical Systems through Local Approximations

Solowjow, F., Mehrjou, A., Schölkopf, B., Trimpe, S.

In Proceedings of the 57th IEEE International Conference on Decision and Control (CDC), Miami, FL, USA, December 2018 (inproceedings) Accepted

ei ics

arXiv PDF Project Page [BibTex]

Informative Features for Model Comparison

Jitkrittum, W., Kanagawa, H., Sangkloy, P., Hays, J., Schölkopf, B., Gretton, A.

32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

[BibTex]

Flex-Convolution (Million-Scale Point-Cloud Learning Beyond Grid-Worlds)

Groh*, F., Wieschollek*, P., Lensch, H. P. A.

Computer Vision - 14th Asian Conference on Computer Vision (ACCV), December 2018, *equal contribution (conference) Accepted

ei

[BibTex]

Bayesian Nonparametric Hawkes Processes

Kapoor, J., Vergari, A., Gomez Rodriguez, M., Valera, I.

Bayesian Nonparametrics workshop at the 32nd Conference on Neural Information Processing Systems, December 2018 (conference) Accepted

ei

PDF [BibTex]

Customized Multi-Person Tracker

Ma, L., Tang, S., Black, M. J., Gool, L. V.

In Computer Vision – ACCV 2018, Springer International Publishing, Asian Conference on Computer Vision, December 2018 (inproceedings)

ps

PDF [BibTex]

Depth Control of Underwater Robots using Sliding Modes and Gaussian Process Regression

Lima, G. S., Bessa, W. M., Trimpe, S.

In Proceedings of the 15th Latin American Robotics Symposium, João Pessoa, Brazil, November 2018 (inproceedings) Accepted

Abstract
The development of accurate control systems for underwater robotic vehicles relies on the adequate compensation for hydrodynamic effects. In this work, a new robust control scheme is presented for remotely operated underwater vehicles. In order to meet both robustness and tracking requirements, sliding mode control is combined with Gaussian process regression. The convergence properties of the closed-loop signals are analytically proven. Numerical results confirm the improved performance of the proposed control scheme.

ics

[BibTex]

On the Integration of Optical Flow and Action Recognition

Sevilla-Lara, L., Liao, Y., Guney, F., Jampani, V., Geiger, A., Black, M. J.

In German Conference on Pattern Recognition (GCPR), October 2018 (inproceedings)

Abstract
Most of the top-performing action recognition methods use optical flow as a "black box" input. Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better. In particular, we investigate the impact of different flow algorithms and input transformations to better understand how these affect a state-of-the-art action recognition method. Furthermore, we fine-tune two neural-network flow methods end-to-end on the most widely used action recognition dataset (UCF101). Based on these experiments, we make the following five observations: 1) optical flow is useful for action recognition because it is invariant to appearance, 2) optical flow methods are optimized to minimize end-point-error (EPE), but the EPE of current methods is not well correlated with action recognition performance, 3) for the flow methods tested, accuracy at boundaries and at small displacements is most correlated with action recognition performance, 4) training optical flow to minimize classification error instead of minimizing EPE improves recognition performance, and 5) optical flow learned for the task of action recognition differs from traditional optical flow especially inside the human body and at the boundary of the body. These observations may encourage optical flow researchers to look beyond EPE as a goal and guide action recognition researchers to seek better motion cues, leading to a tighter integration of the optical flow and action recognition communities.

am ps

arXiv [BibTex]

Regularizing Reinforcement Learning with State Abstraction

Akrour, R., Veiga, F., Peters, J., Neuman, G.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2018 (conference) Accepted

ei

link (url) [BibTex]

Towards Robust Visual Odometry with a Multi-Camera System

Liu, P., Geppert, M., Heng, L., Sattler, T., Geiger, A., Pollefeys, M.

In International Conference on Intelligent Robots and Systems (IROS), October 2018 (inproceedings)

Abstract
We present a visual odometry (VO) algorithm for a multi-camera system and robust operation in challenging environments. Our algorithm consists of a pose tracker and a local mapper. The tracker estimates the current pose by minimizing photometric errors between the most recent keyframe and the current frame. The mapper initializes the depths of all sampled feature points using plane-sweeping stereo. To reduce pose drift, a sliding window optimizer is used to refine poses and structure jointly. Our formulation is flexible enough to support an arbitrary number of stereo cameras. We evaluate our algorithm thoroughly on five datasets. The datasets were captured in different conditions: daytime, night-time with near-infrared (NIR) illumination and night-time without NIR illumination. Experimental results show that a multi-camera setup makes the VO more robust to challenging environments, especially night-time conditions, in which a single stereo configuration fails easily due to the lack of features.

avg

pdf [BibTex]

Learning to Categorize Bug Reports with LSTM Networks

Gondaliya, K., Peters, J., Rueckert, E.

Proceedings of the 10th International Conference on Advances in System Testing and Validation Lifecycle (VALID), October 2018 (conference) Accepted

ei

link (url) [BibTex]

Domain Randomization for Simulation-Based Policy Optimization with Transferability Assessment

Muratore, F., Treede, F., Gienger, M., Peters, J.

Conference on Robot Learning (CoRL), October 2018 (conference) Accepted

ei

[BibTex]

Reinforcement Learning of Phase Oscillators for Fast Adaptation to Moving Targets

Maeda, G., Koc, O., Morimoto, J.

Proceedings of The 2nd Conference on Robot Learning (CoRL), 87, pages: 630-640, (Editors: Aude Billard, Anca Dragan, Jan Peters, Jun Morimoto), PMLR, October 2018 (conference)

ei

link (url) [BibTex]

Temporal Interpolation as an Unsupervised Pretraining Task for Optical Flow Estimation

Wulff, J., Black, M. J.

In German Conference on Pattern Recognition (GCPR), October 2018 (inproceedings)

Abstract
The difficulty of annotating training data is a major obstacle to using CNNs for low-level tasks in video. Synthetic data often does not generalize to real videos, while unsupervised methods require heuristic losses. Proxy tasks can overcome these issues, and start by training a network for a task for which annotation is easier or which can be trained unsupervised. The trained network is then fine-tuned for the original task using small amounts of ground truth data. Here, we investigate frame interpolation as a proxy task for optical flow. Using real movies, we train a CNN unsupervised for temporal interpolation. Such a network implicitly estimates motion, but cannot handle untextured regions. By fine-tuning on small amounts of ground truth flow, the network can learn to fill in homogeneous regions and compute full optical flow fields. Using this unsupervised pre-training, our network outperforms similar architectures that were trained supervised using synthetic optical flow.

ps

pdf arXiv [BibTex]

Constraint-Space Projection Direct Policy Search

Akrour, R., Peters, J., Neuman, G.

14th European Workshop on Reinforcement Learning (EWRL), October 2018 (conference)

ei

link (url) [BibTex]

Human Motion Parsing by Hierarchical Dynamic Clustering

Zhang, Y., Tang, S., Sun, H., Neumann, H.

In Proceedings of the British Machine Vision Conference (BMVC), pages: 269, BMVA Press, 29th British Machine Vision Conference, September 2018 (inproceedings)

Abstract
Parsing continuous human motion into meaningful segments plays an essential role in various applications. In this work, we propose a hierarchical dynamic clustering framework to derive action clusters from a sequence of local features in an unsupervised bottom-up manner. We systematically investigate the modules in this framework and particularly propose diverse temporal pooling schemes, in order to realize accurate temporal action localization. We demonstrate our method on two motion parsing tasks: temporal action segmentation and abnormal behavior detection. The experimental results indicate that the proposed framework is significantly more effective than the other related state-of-the-art methods on several datasets.

ps

pdf [BibTex]

Spatio-temporal Transformer Network for Video Restoration

Kim, T. H., Sajjadi, M. S. M., Hirsch, M., Schölkopf, B.

15th European Conference on Computer Vision (ECCV), Part III, 11207, pages: 111-127, Lecture Notes in Computer Science, (Editors: Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu and Yair Weiss), Springer, September 2018 (conference)

ei

DOI [BibTex]

Generating 3D Faces using Convolutional Mesh Autoencoders

Ranjan, A., Bolkart, T., Sanyal, S., Black, M. J.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11207, pages: 725-741, Springer, Cham, September 2018 (inproceedings)

Abstract
Learned 3D representations of human faces are useful for computer vision problems such as 3D face tracking and reconstruction from images, as well as graphics applications such as character generation and animation. Traditional models learn a latent representation of a face using linear subspaces or higher-order tensor generalizations. Due to this linearity, they cannot capture extreme deformations and non-linear expressions. To address this, we introduce a versatile model that learns a non-linear representation of a face using spectral convolutions on a mesh surface. We introduce mesh sampling operations that enable a hierarchical mesh representation that captures non-linear variations in shape and expression at multiple scales within the model. In a variational setting, our model samples diverse realistic 3D faces from a multivariate Gaussian distribution. Our training data consists of 20,466 meshes of extreme expressions captured over 12 different subjects. Despite limited training data, our trained model outperforms state-of-the-art face models with 50% lower reconstruction error, while using 75% fewer parameters. We also show that replacing the expression space of an existing state-of-the-art face model with our autoencoder achieves a lower reconstruction error. Our data, model and code are available at http://coma.is.tue.mpg.de/.

ps

code paper supplementary link (url) DOI [BibTex]

Learning Priors for Semantic 3D Reconstruction

Cherabier, I., Schönberger, J., Oswald, M., Pollefeys, M., Geiger, A.

In Computer Vision – ECCV 2018, Springer International Publishing, Cham, September 2018 (inproceedings)

Abstract
We present a novel semantic 3D reconstruction framework which embeds variational regularization into a neural network. Our network performs a fixed number of unrolled multi-scale optimization iterations with shared interaction weights. In contrast to existing variational methods for semantic 3D reconstruction, our model is end-to-end trainable and captures more complex dependencies between the semantic labels and the 3D geometry. Compared to previous learning-based approaches to 3D reconstruction, we integrate powerful long-range dependencies using variational coarse-to-fine optimization. As a result, our network architecture requires only a moderate number of parameters while keeping a high level of expressiveness which enables learning from very little data. Experiments on real and synthetic datasets demonstrate that our network achieves higher accuracy compared to a purely variational approach while at the same time requiring two orders of magnitude fewer iterations to converge. Moreover, our approach handles ten times more semantic class labels using the same computational resources.

avg

pdf suppmat Project Page Video DOI [BibTex]

Separating Reflection and Transmission Images in the Wild

Wieschollek, P., Gallo, O., Gu, J., Kautz, J.

European Conference on Computer Vision (ECCV), September 2018 (conference)

ei

link (url) [BibTex]

Part-Aligned Bilinear Representations for Person Re-identification

Suh, Y., Wang, J., Tang, S., Mei, T., Lee, K. M.

In European Conference on Computer Vision (ECCV), 11218, pages: 418-437, Springer, Cham, September 2018 (inproceedings)

Abstract
Comparing the appearance of corresponding body parts is essential for person re-identification. However, body parts are frequently misaligned between detected boxes, due to detection errors and pose/viewpoint changes. In this paper, we propose a network that learns a part-aligned representation for person re-identification. Our model consists of a two-stream network, which generates appearance and body part feature maps respectively, and a bilinear-pooling layer that fuses two feature maps into an image descriptor. We show that it results in a compact descriptor, where the inner product between two image descriptors is equivalent to an aggregation of the local appearance similarities of the corresponding body parts, and thereby significantly reduces the part misalignment problem. Our approach is advantageous over other pose-guided representations by learning part descriptors optimal for person re-identification. Training the network does not require any part annotation on the person re-identification dataset. Instead, we simply initialize the part sub-stream using a pre-trained sub-network of an existing pose estimation network and train the whole network to minimize the re-identification loss. We validate the effectiveness of our approach by demonstrating its superiority over the state-of-the-art methods on the standard benchmark datasets including Market-1501, CUHK03, CUHK01 and DukeMTMC, and standard video dataset MARS.

ps

pdf supplementary DOI [BibTex]

Risk-Sensitivity in Simulation Based Online Planning

Schmid, K., Belzner, L., Kiermeier, M., Neitz, A., Phan, T., Gabor, T., Linnhoff, C.

KI 2018: Advances in Artificial Intelligence - 41st German Conference on AI, pages: 229-240, (Editors: F. Trollmann and A. Y. Turhan), Springer, Cham, September 2018 (conference)

ei

DOI [BibTex]

Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation

(Best Student Paper Award)

Omran, M., Lassner, C., Pons-Moll, G., Gehler, P. V., Schiele, B.

3DV, September 2018 (conference)

Abstract
Direct prediction of 3D body pose and shape remains a challenge even for highly parameterized deep learning models. Mapping from the 2D image space to the prediction space is difficult: perspective ambiguities make the loss function noisy and training data is scarce. In this paper, we propose a novel approach, Neural Body Fitting (NBF). It integrates a statistical body model within a CNN, leveraging reliable bottom-up semantic body part segmentation and robust top-down body model constraints. NBF is fully differentiable and can be trained using 2D and 3D annotations. In detailed experiments, we analyze how the components of our model affect performance, especially the use of part segmentations as an explicit intermediate representation, and present a robust, efficiently trainable framework for 3D human pose estimation from 2D images with competitive results on standard benchmarks. Code is available at https://github.com/mohomran/neural_body_fitting

ps

arXiv code [BibTex]


Discovering and Teaching Optimal Planning Strategies

Lieder, F., Callaway, F., Krueger, P. M., Das, P., Griffiths, T. L., Gul, S.

In The 14th biannual conference of the German Society for Cognitive Science, GK, September 2018 (inproceedings)

re

Project Page [BibTex]

Learning Human Optical Flow

Ranjan, A., Romero, J., Black, M. J.

In 29th British Machine Vision Conference, September 2018 (inproceedings)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Given this, we devise an optical flow algorithm specifically for human motion and show that it is superior to generic flow methods. Designing a method by hand is impractical, so we develop a new training database of image sequences with ground truth optical flow. For this we use a 3D model of the human body and motion capture data to synthesize realistic flow fields. We then train a convolutional neural network to estimate human flow fields from pairs of images. Since many applications in human motion analysis depend on speed, and we anticipate mobile applications, we base our method on SpyNet with several modifications. We demonstrate that our trained network is more accurate than a wide range of top methods on held-out test data and that it generalizes well to real image sequences. When combined with a person detector/tracker, the approach provides a full solution to the problem of 2D human flow estimation. Both the code and the dataset are available for research.

ps

video code pdf link (url) [BibTex]

Unsupervised Learning of Multi-Frame Optical Flow with Occlusions

Janai, J., Güney, F., Ranjan, A., Black, M. J., Geiger, A.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11220, pages: 713-731, Springer, Cham, September 2018 (inproceedings)

avg ps

pdf suppmat DOI Project Page [BibTex]

Discovering Rational Heuristics for Risky Choice

Gul, S., Krueger, P. M., Callaway, F., Griffiths, T. L., Lieder, F.

The 14th biannual conference of the German Society for Cognitive Science, GK, September 2018 (conference)

re

Project Page [BibTex]

Learning an Infant Body Model from RGB-D Data for Accurate Full Body Motion Analysis

Hesse, N., Pujades, S., Romero, J., Black, M. J., Bodensteiner, C., Arens, M., Hofmann, U. G., Tacke, U., Hadders-Algra, M., Weinberger, R., Muller-Felber, W., Schroeder, A. S.

In Int. Conf. on Medical Image Computing and Computer Assisted Intervention (MICCAI), September 2018 (inproceedings)

Abstract
Infant motion analysis enables early detection of neurodevelopmental disorders like cerebral palsy (CP). Diagnosis, however, is challenging, requiring expert human judgement. An automated solution would be beneficial but requires the accurate capture of 3D full-body movements. To that end, we develop a non-intrusive, low-cost, lightweight acquisition system that captures the shape and motion of infants. Going beyond work on modeling adult body shape, we learn a 3D Skinned Multi-Infant Linear body model (SMIL) from noisy, low-quality, and incomplete RGB-D data. We demonstrate the capture of shape and motion with 37 infants in a clinical environment. Quantitative experiments show that SMIL faithfully represents the data and properly factorizes the shape and pose of the infants. With a case study based on general movement assessment (GMA), we demonstrate that SMIL captures enough information to allow medical assessment. SMIL provides a new tool and a step towards a fully automatic system for GMA.

ps

pdf Project page video extended arXiv version [BibTex]

Deep Directional Statistics: Pose Estimation with Uncertainty Quantification

Prokudin, S., Gehler, P., Nowozin, S.

European Conference on Computer Vision (ECCV), September 2018 (conference)

Abstract
Modern deep learning systems successfully solve many perception tasks such as object pose estimation when the input image is of high quality. However, in challenging imaging conditions such as low-resolution images or when the image is corrupted by imaging artifacts, current systems degrade considerably in accuracy. While a loss in performance is unavoidable, we would like our models to quantify their uncertainty in order to achieve robustness against images of varying quality. Probabilistic deep learning models combine the expressive power of deep learning with uncertainty quantification. In this paper, we propose a novel probabilistic deep learning model for the task of angular regression. Our model uses von Mises distributions to predict a distribution over object pose angle. Whereas a single von Mises distribution makes strong assumptions about the shape of the distribution, we extend the basic model to predict a mixture of von Mises distributions. We show how to learn a mixture model using a finite and infinite number of mixture components. Our model allows for likelihood-based training and efficient inference at test time. We demonstrate on a number of challenging pose estimation datasets that our model produces calibrated probability predictions and competitive or superior point estimates compared to the current state-of-the-art.

ps

code pdf [BibTex]

SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images

Coors, B., Condurache, A. P., Geiger, A.

European Conference on Computer Vision (ECCV), September 2018 (conference)

Abstract
Omnidirectional cameras offer great benefits over classical cameras wherever a wide field of view is essential, such as in virtual reality applications or in autonomous robots. Unfortunately, standard convolutional neural networks are not well suited for this scenario as the natural projection surface is a sphere which cannot be unwrapped to a plane without introducing significant distortions, particularly in the polar regions. In this work, we present SphereNet, a novel deep learning framework which encodes invariance against such distortions explicitly into convolutional neural networks. Towards this goal, SphereNet adapts the sampling locations of the convolutional filters, effectively reversing distortions, and wraps the filters around the sphere. By building on regular convolutions, SphereNet enables the transfer of existing perspective convolutional neural network models to the omnidirectional case. We demonstrate the effectiveness of our method on the tasks of image classification and object detection, exploiting two newly created semi-synthetic and real-world omnidirectional datasets.

avg

pdf suppmat Project Page [BibTex]


Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera

Marcard, T. V., Henschel, R., Black, M. J., Rosenhahn, B., Pons-Moll, G.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11214, pages: 614-631, Springer, Cham, September 2018 (inproceedings)

Abstract
In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph based optimization problem that forces 3D to 2D coherency within a frame and across long range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going upstairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.

ps

pdf SupMat data project DOI [BibTex]

From Deterministic ODEs to Dynamic Structural Causal Models

Rubenstein, P. K., Bongers, S., Mooij, J. M., Schölkopf, B.

Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI), August 2018 (conference) Accepted

ei

arXiv link (url) [BibTex]

Learning-Based Robust Model Predictive Control with State-Dependent Uncertainty

Soloperto, R., Müller, M. A., Trimpe, S., Allgöwer, F.

In Proceedings of the IFAC Conference on Nonlinear Model Predictive Control (NMPC), Madison, Wisconsin, USA, 6th IFAC Conference on Nonlinear Model Predictive Control, August 2018 (inproceedings) Accepted

ics

PDF [BibTex]

The Unreasonable Effectiveness of Texture Transfer for Single Image Super-resolution

Gondal, M. W., Schölkopf, B., Hirsch, M.

Workshop and Challenge on Perceptual Image Restoration and Manipulation (PIRM) at the 15th European Conference on Computer Vision (ECCV), August 2018 (conference)

ei

arXiv [BibTex]

Generalized Score Functions for Causal Discovery

Huang, B., Zhang, K., Lin, Y., Schölkopf, B., C., G.

Proceedings of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pages: 1551-1560, (Editors: Yike Guo and Faisal Farooq), ACM, August 2018 (conference)

ei

link (url) DOI [BibTex]

Decentralized MPC based Obstacle Avoidance for Multi-Robot Target Tracking Scenarios

Tallamraju, R., Rajappa, S., Black, M. J., Karlapalem, K., Ahmad, A.

2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pages: 1-8, IEEE, August 2018 (conference)

Abstract
In this work, we consider the problem of decentralized multi-robot target tracking and obstacle avoidance in dynamic environments. Each robot executes a local motion planning algorithm which is based on model predictive control (MPC). The planner is designed as a quadratic program, subject to constraints on robot dynamics and obstacle avoidance. Repulsive potential field functions are employed to avoid obstacles. The novelty of our approach lies in embedding these non-linear potential field functions as constraints within a convex optimization framework. Our method convexifies nonconvex constraints and dependencies by replacing them with pre-computed external input forces in robot dynamics. The proposed algorithm additionally incorporates different methods to avoid the local minima problems associated with using potential field functions in planning. The motion planner does not enforce predefined trajectories or any formation geometry on the robots and is a comprehensive solution for cooperative obstacle avoidance in the context of multi-robot target tracking. We perform simulation studies for different scenarios to showcase the convergence and efficacy of the proposed algorithm.

ps

Published Version link (url) DOI Project Page [BibTex]
