2020


A gamified app that helps people overcome self-limiting beliefs by promoting metacognition

Amo, V., Lieder, F.

pages: 6, SIG 8 Meets SIG 16, September 2020 (conference)

Abstract
Previous research has shown that approaching learning with a growth mindset is key for maintaining motivation and overcoming setbacks. Mindsets are systems of beliefs that people hold to be true. They influence a person's attitudes, thoughts, and emotions when they learn something new or encounter challenges. In clinical psychology, metareasoning (reflecting on one's mental processes) and meta-awareness (recognizing thoughts as mental events instead of equating them to reality) have proven effective for overcoming maladaptive thinking styles. Hence, they are potentially an effective method for overcoming self-limiting beliefs in other domains as well. However, the potential of integrating assisted metacognition into mindset interventions has not been explored yet. Here, we propose that guiding and training people on how to leverage metareasoning and meta-awareness for overcoming self-limiting beliefs can significantly enhance the effectiveness of mindset interventions. To test this hypothesis, we develop a gamified mobile application that guides and trains people to use metacognitive strategies based on Cognitive Restructuring (CR) and Acceptance and Commitment Therapy (ACT) techniques. The application helps users identify and overcome self-limiting beliefs by working with aversive emotions when they are triggered by fixed mindsets in real-life situations. Our app aims to help people sustain their motivation to learn when they face inner obstacles (e.g., anxiety, frustration, and demotivation). We expect the application to be an effective tool for helping people better understand and develop the metacognitive skills of emotion regulation and self-regulation that are needed to overcome self-limiting beliefs and develop growth mindsets.

re

A gamified app that helps people overcome self-limiting beliefs by promoting metacognition [BibTex]


Learning to Dress 3D People in Generative Clothing

Ma, Q., Yang, J., Ranjan, A., Pujades, S., Pons-Moll, G., Tang, S., Black, M. J.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
Three-dimensional human body models are widely used in the analysis of human pose and motion. Existing models, however, are learned from minimally-clothed 3D scans and thus do not generalize to the complexity of dressed people in common images and videos. Additionally, current models lack the expressive power needed to represent the complex non-linear geometry of pose-dependent clothing shape. To address this, we learn a generative 3D mesh model of clothed people from 3D scans with varying pose and clothing. Specifically, we train a conditional Mesh-VAE-GAN to learn the clothing deformation from the SMPL body model, making clothing an additional term on SMPL. Our model is conditioned on both pose and clothing type, giving the ability to draw samples of clothing to dress different body shapes in a variety of styles and poses. To preserve wrinkle detail, our Mesh-VAE-GAN extends patchwise discriminators to 3D meshes. Our model, named CAPE, represents global shape and fine local structure, effectively extending the SMPL body model to clothing. To our knowledge, this is the first generative model that directly dresses 3D human body meshes and generalizes to different poses.

ps

arxiv [BibTex]


Generating 3D People in Scenes without People

Zhang, Y., Hassan, M., Neumann, H., Black, M. J., Tang, S.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
We present a fully automatic system that takes a 3D scene and generates plausible 3D human bodies that are posed naturally in that 3D scene. Given a 3D scene without people, humans can easily imagine how people could interact with the scene and the objects in it. However, this is a challenging task for a computer, as solving it requires that (1) the generated human bodies be semantically plausible within the 3D environment, e.g. people sitting on the sofa or cooking near the stove; and (2) the generated human-scene interaction be physically feasible, in the sense that the human body and scene do not interpenetrate while, at the same time, body-scene contact supports physical interactions. To that end, we make use of the surface-based 3D human model SMPL-X. We first train a conditional variational autoencoder to predict semantically plausible 3D human poses conditioned on latent scene representations, then we further refine the generated 3D bodies using scene constraints to enforce feasible physical interaction. We show that our approach is able to synthesize realistic and expressive 3D human bodies that naturally interact with the 3D environment. We perform extensive experiments demonstrating that our generative framework compares favorably with existing methods, both qualitatively and quantitatively. We believe that our scene-conditioned 3D human generation pipeline will be useful for numerous applications, e.g. to generate training data for human pose estimation, in video games, and in VR/AR.

ps

PDF link (url) [BibTex]


A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization

Alimisis, F., Orvieto, A., Becigneul, G., Lucchi, A.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), June 2020 (conference) Accepted

ei

[BibTex]


A Kernel Mean Embedding Approach to Reducing Conservativeness in Stochastic Programming and Control

Zhu, J., Diehl, M., Schölkopf, B.

2nd Annual Conference on Learning for Dynamics and Control (L4DC), June 2020 (conference) Accepted

ei

arXiv [BibTex]


Learning Physics-guided Face Relighting under Directional Light

Nestmeyer, T., Lalonde, J., Matthews, I., Lehrmann, A. M.

In Conference on Computer Vision and Pattern Recognition, IEEE/CVF, June 2020 (inproceedings) Accepted

Abstract
Relighting is an essential step in realistically transferring objects from a captured image into another environment. For example, authentic telepresence in Augmented Reality requires faces to be displayed and relit consistent with the observer's scene lighting. We investigate end-to-end deep learning architectures that both de-light and relight an image of a human face. Our model decomposes the input image into intrinsic components according to a diffuse physics-based image formation model. We enable non-diffuse effects including cast shadows and specular highlights by predicting a residual correction to the diffuse render. To train and evaluate our model, we collected a portrait database of 21 subjects with various expressions and poses. Each sample is captured in a controlled light stage setup with 32 individual light sources. Our method creates precise and believable relighting results and generalizes to complex illumination conditions and challenging poses, including when the subject is not looking straight at the camera.

ps

[BibTex]


VIBE: Video Inference for Human Body Pose and Shape Estimation

Kocabas, M., Athanasiou, N., Black, M. J.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
Human motion is fundamental to understanding behavior. Despite progress on single-image 3D pose and shape estimation, existing video-based state-of-the-art methods fail to produce accurate and natural motion sequences due to a lack of ground-truth 3D motion data for training. To address this problem, we propose “Video Inference for Body Pose and Shape Estimation” (VIBE), which makes use of an existing large-scale motion capture dataset (AMASS) together with unpaired, in-the-wild, 2D keypoint annotations. Our key novelty is an adversarial learning framework that leverages AMASS to discriminate between real human motions and those produced by our temporal pose and shape regression networks. We define a temporal network architecture and show that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels. We perform extensive experimentation to analyze the importance of motion and demonstrate the effectiveness of VIBE on challenging 3D pose estimation datasets, achieving state-of-the-art performance. Code and pretrained models are available at https://github.com/mkocabas/VIBE

ps

arXiv code [BibTex]


Mixed-curvature Variational Autoencoders

Skopek, O., Ganea, O., Becigneul, G.

8th International Conference on Learning Representations (ICLR), April 2020 (conference) Accepted

ei

link (url) [BibTex]


From Variational to Deterministic Autoencoders

Ghosh*, P., Sajjadi*, M. S. M., Vergari, A., Black, M. J., Schölkopf, B.

8th International Conference on Learning Representations (ICLR), April 2020, *equal contribution (conference) Accepted

Abstract
Variational Autoencoders (VAEs) provide a theoretically-backed framework for deep generative models. However, they often produce “blurry” images, which is linked to their training objective. Sampling in the most popular implementation, the Gaussian VAE, can be interpreted as simply injecting noise to the input of a deterministic decoder. In practice, this simply enforces a smooth latent space structure. We challenge the adoption of the full VAE framework on this specific point in favor of a simpler, deterministic one. Specifically, we investigate how substituting stochasticity with other explicit and implicit regularization schemes can lead to a meaningful latent space without having to force it to conform to an arbitrarily chosen prior. To retrieve a generative mechanism for sampling new data points, we propose to employ an efficient ex-post density estimation step that can be readily adopted both for the proposed deterministic autoencoders as well as to improve sample quality of existing VAEs. We show in a rigorous empirical study that regularized deterministic autoencoding achieves state-of-the-art sample quality on the common MNIST, CIFAR-10 and CelebA datasets.

ei ps

arXiv [BibTex]


Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations

Rueegg, N., Lassner, C., Black, M. J., Schindler, K.

In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), February 2020 (inproceedings)

Abstract
The goal of many computer vision systems is to transform image pixels into 3D representations. Recent popular models use neural networks to regress directly from pixels to 3D object parameters. Such an approach works well when supervision is available, but in problems like human pose and shape estimation, it is difficult to obtain natural images with 3D ground truth. To go one step further, we propose a new architecture that facilitates unsupervised, or lightly supervised, learning. The idea is to break the problem into a series of transformations between increasingly abstract representations. Each step involves a cycle designed to be learnable without annotated training data, and the chain of cycles delivers the final solution. Specifically, we use 2D body part segments as an intermediate representation that contains enough information to be lifted to 3D, and at the same time is simple enough to be learned in an unsupervised way. We demonstrate the method by learning 3D human pose and shape from un-paired and un-annotated images. We also explore varying amounts of paired data and show that cycling greatly alleviates the need for paired data. While we present results for modeling humans, our formulation is general and can be applied to other vision problems.

ps

pdf [BibTex]


Learning Multi-Human Optical Flow

Ranjan, A., Hoffmann, D. T., Tzionas, D., Tang, S., Romero, J., Black, M. J.

International Journal of Computer Vision (IJCV), January 2020 (article)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data they use does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body and motion capture data to synthesize realistic flow fields in both single- and multi-person images. We then train optical flow networks to estimate human flow fields from pairs of images. We demonstrate that our trained networks are more accurate than a wide range of top methods on held-out test data and that they can generalize well to real image sequences. The code, trained models and the dataset are available for research.

ps

Paper Publisher Version poster link (url) DOI [BibTex]


More Powerful Selective Kernel Tests for Feature Selection

Lim, J. N., Yamada, M., Jitkrittum, W., Terada, Y., Matsui, S., Shimodaira, H.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020 (conference) To be published

ei

arXiv [BibTex]


ACTrain: Ein KI-basiertes Aufmerksamkeitstraining für die Wissensarbeit [ACTrain: An AI-based attention training for knowledge work]

Wirzberger, M., Oreshnikov, I., Passy, J., Lado, A., Shenhav, A., Lieder, F.

66th Spring Conference of the German Ergonomics Society, 2020 (conference)

Abstract
Our digital age thrives on information and thereby puts our limited processing capacity to the test every day. In knowledge work in particular, constant distractions lead to considerable losses in performance. Our intelligent application ACTrain addresses exactly this problem and turns computer work into a training gym for the mind. Feedback based on machine learning methods clearly illustrates the value of not letting oneself be distracted from a self-chosen task. This metacognitive insight is intended to motivate perseverance and to strengthen the underlying skill level of attention control. In ongoing field experiments, we investigate whether training with this optimal feedback can improve attention and self-control skills compared with a control group that receives no feedback.

re sf

link (url) [BibTex]


Computationally Tractable Riemannian Manifolds for Graph Embeddings

Cruceru, C., Becigneul, G., Ganea, O.

37th International Conference on Machine Learning (ICML), 2020 (conference) Submitted

ei

[BibTex]


General Movement Assessment from videos of computed 3D infant body models is equally effective compared to conventional RGB Video rating

Schroeder, S., Hesse, N., Weinberger, R., Tacke, U., Gerstl, L., Hilgendorff, A., Heinen, F., Arens, M., Bodensteiner, C., Dijkstra, L. J., Pujades, S., Black, M., Hadders-Algra, M.

Early Human Development, 2020 (article)

Abstract
Background: General Movement Assessment (GMA) is a powerful tool to predict Cerebral Palsy (CP). Yet, GMA requires substantial training, hampering its implementation in clinical routine. This inspired a world-wide quest for automated GMA. Aim: To test whether a low-cost, marker-less system for three-dimensional motion capture from RGB depth sequences using a whole body infant model may serve as the basis for automated GMA. Study design: Clinical case study at an academic neurodevelopmental outpatient clinic. Subjects: Twenty-nine high-risk infants were recruited and assessed at their clinical follow-up at 2-4 months corrected age (CA). Their neurodevelopmental outcome was assessed regularly up to 12-31 months CA. Outcome measures: GMA according to Hadders-Algra by a masked GMA-expert of conventional and computed 3D body model (“SMIL motion”) videos of the same GMs. Agreement between both GMAs was assessed, as were the sensitivity and specificity of both methods to predict CP at ≥12 months CA. Results: The agreement of the two GMA ratings was substantial, with κ=0.66 for the classification of definitely abnormal (DA) GMs and an ICC of 0.887 (95% CI 0.762;0.947) for a more detailed GM-scoring. Five children were diagnosed with CP (four bilateral, one unilateral CP). The GMs of the child with unilateral CP were twice rated as mildly abnormal. DA-ratings of both videos predicted bilateral CP well: sensitivity 75% and 100%, specificity 88% and 92% for conventional and SMIL motion videos, respectively. Conclusions: Our computed infant 3D full body model is an attractive starting point for automated GMA in infants at risk of CP.

ps

[BibTex]


Practical Accelerated Optimization on Riemannian Manifolds

Alimisis, F., Orvieto, A., Becigneul, G., Lucchi, A.

37th International Conference on Machine Learning (ICML), 2020 (conference) Submitted

ei

[BibTex]


Constant Curvature Graph Convolutional Networks

Bachmann*, G., Becigneul*, G., Ganea, O.

37th International Conference on Machine Learning (ICML), 2020, *equal contribution (conference) Submitted

ei

[BibTex]

2018


Non-factorised Variational Inference in Dynamical Systems

Ialongo, A. D., Van Der Wilk, M., Hensman, J., Rasmussen, C. E.

1st Symposium on Advances in Approximate Bayesian Inference, December 2018 (conference)

ei

PDF link (url) [BibTex]


Enhancing the Accuracy and Fairness of Human Decision Making

Valera, I., Singla, A., Gomez Rodriguez, M.

Advances in Neural Information Processing Systems 31, pages: 1774-1783, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference)

ei

arXiv link (url) Project Page [BibTex]


Consolidating the Meta-Learning Zoo: A Unifying Perspective as Posterior Predictive Inference

Gordon*, J., Bronskill*, J., Bauer*, M., Nowozin, S., Turner, R. E.

Workshop on Meta-Learning (MetaLearn 2018) at the 32nd Conference on Neural Information Processing Systems, December 2018, *equal contribution (conference)

ei

link (url) [BibTex]


Versa: Versatile and Efficient Few-shot Learning

Gordon*, J., Bronskill*, J., Bauer*, M., Nowozin, S., Turner, R. E.

Third Workshop on Bayesian Deep Learning at the 32nd Conference on Neural Information Processing Systems, December 2018, *equal contribution (conference)

ei

link (url) [BibTex]


DP-MAC: The Differentially Private Method of Auxiliary Coordinates for Deep Learning

Harder, F., Köhler, J., Welling, M., Park, M.

Workshop on Privacy Preserving Machine Learning at the 32nd Conference on Neural Information Processing Systems, December 2018 (conference)

ei

link (url) Project Page [BibTex]


Boosting Black Box Variational Inference

Locatello*, F., Dresdner*, G., Khanna, R., Valera, I., Rätsch, G.

Advances in Neural Information Processing Systems 31, pages: 3405-3415, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 32nd Annual Conference on Neural Information Processing Systems, December 2018, *equal contribution (conference)

ei

arXiv link (url) Project Page [BibTex]


Deep Nonlinear Non-Gaussian Filtering for Dynamical Systems

Mehrjou, A., Schölkopf, B.

Workshop: Infer to Control: Probabilistic Reinforcement Learning and Structured Control at the 32nd Conference on Neural Information Processing Systems, December 2018 (conference)

ei

PDF link (url) [BibTex]


Resampled Priors for Variational Autoencoders

Bauer, M., Mnih, A.

Third Workshop on Bayesian Deep Learning at the 32nd Conference on Neural Information Processing Systems, December 2018 (conference)

ei

link (url) [BibTex]


Learning Invariances using the Marginal Likelihood

van der Wilk, M., Bauer, M., John, S. T., Hensman, J.

Advances in Neural Information Processing Systems 31, pages: 9960-9970, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference)

ei

link (url) Project Page [BibTex]


Data-Efficient Hierarchical Reinforcement Learning

Nachum, O., Gu, S., Lee, H., Levine, S.

Advances in Neural Information Processing Systems 31, pages: 3307-3317, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference)

ei

link (url) Project Page [BibTex]


Generalisation in humans and deep neural networks

Geirhos, R., Temme, C. R. M., Rauber, J., Schütt, H., Bethge, M., Wichmann, F. A.

Advances in Neural Information Processing Systems 31, pages: 7549-7561, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference)

ei

link (url) [BibTex]


Parallel and functionally segregated processing of task phase and conscious content in the prefrontal cortex

Kapoor, V., Besserve, M., Logothetis, N. K., Panagiotaropoulos, T. I.

Communications Biology, 1(215):1-12, December 2018 (article)

ei

link (url) DOI Project Page [BibTex]


A Computational Camera with Programmable Optics for Snapshot High Resolution Multispectral Imaging

Chen, J., Hirsch, M., Eberhardt, B., Lensch, H. P. A.

Computer Vision - ACCV 2018 - 14th Asian Conference on Computer Vision, December 2018 (conference) Accepted

ei

[BibTex]


Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models

Neitz, A., Parascandolo, G., Bauer, S., Schölkopf, B.

Advances in Neural Information Processing Systems 31, pages: 9838-9848, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference)

ei

arXiv link (url) [BibTex]


Assessing Generative Models via Precision and Recall

Sajjadi, M. S. M., Bachem, O., Lucic, M., Bousquet, O., Gelly, S.

Advances in Neural Information Processing Systems 31, pages: 5234-5243, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference)

ei

arXiv link (url) [BibTex]


Efficient Encoding of Dynamical Systems through Local Approximations

Solowjow, F., Mehrjou, A., Schölkopf, B., Trimpe, S.

In Proceedings of the 57th IEEE International Conference on Decision and Control (CDC), pages: 6073-6079, Miami, FL, USA, December 2018 (inproceedings)

ei ics

arXiv PDF DOI Project Page [BibTex]


Flex-Convolution (Million-Scale Point-Cloud Learning Beyond Grid-Worlds)

Groh*, F., Wieschollek*, P., Lensch, H. P. A.

Computer Vision - ACCV 2018 - 14th Asian Conference on Computer Vision, December 2018, *equal contribution (conference) Accepted

ei

[BibTex]


Bayesian Nonparametric Hawkes Processes

Kapoor, J., Vergari, A., Gomez Rodriguez, M., Valera, I.

Bayesian Nonparametrics workshop at the 32nd Conference on Neural Information Processing Systems, December 2018 (conference)

ei

PDF link (url) [BibTex]


Informative Features for Model Comparison

Jitkrittum, W., Kanagawa, H., Sangkloy, P., Hays, J., Schölkopf, B., Gretton, A.

Advances in Neural Information Processing Systems 31, pages: 816-827, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 32nd Annual Conference on Neural Information Processing Systems, December 2018 (conference)

ei

link (url) Project Page [BibTex]


Customized Multi-Person Tracker

Ma, L., Tang, S., Black, M. J., Van Gool, L.

In Computer Vision – ACCV 2018, Springer International Publishing, Asian Conference on Computer Vision, December 2018 (inproceedings)

ps

PDF Project Page [BibTex]


Deep Inertial Poser: Learning to Reconstruct Human Pose from Sparse Inertial Measurements in Real Time

Huang, Y., Kaufmann, M., Aksan, E., Black, M. J., Hilliges, O., Pons-Moll, G.

ACM Transactions on Graphics, (Proc. SIGGRAPH Asia), 37, pages: 185:1-185:15, ACM, November 2018, Two first authors contributed equally (article)

Abstract
We demonstrate a novel deep neural network capable of reconstructing human full body pose in real-time from 6 Inertial Measurement Units (IMUs) worn on the user's body. In doing so, we address several difficult challenges. First, the problem is severely under-constrained as multiple pose parameters produce the same IMU orientations. Second, capturing IMU data in conjunction with ground-truth poses is expensive and difficult to do in many target application scenarios (e.g., outdoors). Third, modeling temporal dependencies through non-linear optimization has proven effective in prior work but makes real-time prediction infeasible. To address this important limitation, we learn the temporal pose priors using deep learning. To learn from sufficient data, we synthesize IMU data from motion capture datasets. A bi-directional RNN architecture leverages past and future information that is available at training time. At test time, we deploy the network in a sliding window fashion, retaining real time capabilities. To evaluate our method, we recorded DIP-IMU, a dataset consisting of 10 subjects wearing 17 IMUs for validation in 64 sequences with 330,000 time instants; this constitutes the largest IMU dataset publicly available. We quantitatively evaluate our approach on multiple datasets and show results from a real-time implementation. DIP-IMU and the code are available for research purposes.

ps

data code pdf preprint errata video DOI Project Page [BibTex]


On the Integration of Optical Flow and Action Recognition

Sevilla-Lara, L., Liao, Y., Güney, F., Jampani, V., Geiger, A., Black, M. J.

In German Conference on Pattern Recognition (GCPR), LNCS 11269, pages: 281-297, Springer, Cham, October 2018 (inproceedings)

Abstract
Most of the top performing action recognition methods use optical flow as a "black box" input. Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better. In particular, we investigate the impact of different flow algorithms and input transformations to better understand how these affect a state-of-the-art action recognition method. Furthermore, we fine tune two neural-network flow methods end-to-end on the most widely used action recognition dataset (UCF101). Based on these experiments, we make the following five observations: 1) optical flow is useful for action recognition because it is invariant to appearance, 2) optical flow methods are optimized to minimize end-point-error (EPE), but the EPE of current methods is not well correlated with action recognition performance, 3) for the flow methods tested, accuracy at boundaries and at small displacements is most correlated with action recognition performance, 4) training optical flow to minimize classification error instead of minimizing EPE improves recognition performance, and 5) optical flow learned for the task of action recognition differs from traditional optical flow especially inside the human body and at the boundary of the body. These observations may encourage optical flow researchers to look beyond EPE as a goal and guide action recognition researchers to seek better motion cues, leading to a tighter integration of the optical flow and action recognition communities.

avg ps

arXiv DOI [BibTex]


Regularizing Reinforcement Learning with State Abstraction

Akrour, R., Veiga, F., Peters, J., Neumann, G.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2018 (conference) Accepted

ei

link (url) Project Page [BibTex]


Deep Neural Network-based Cooperative Visual Tracking through Multiple Micro Aerial Vehicles

Price, E., Lawless, G., Ludwig, R., Martinovic, I., Buelthoff, H. H., Black, M. J., Ahmad, A.

IEEE Robotics and Automation Letters, 3(4):3193-3200, IEEE, October 2018, Also accepted and presented in the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). (article)

Abstract
Multi-camera tracking of humans and animals in outdoor environments is a relevant and challenging problem. Our approach to it involves a team of cooperating micro aerial vehicles (MAVs) with on-board cameras only. Deep neural networks (DNNs) often fail at objects that are small in scale or far away from the camera, which are typical characteristics of a scenario with aerial robots. Thus, the core problem addressed in this paper is how to achieve on-board, online, continuous and accurate vision-based detections using DNNs for visual person tracking through MAVs. Our solution leverages cooperation among multiple MAVs and active selection of the most informative regions of the image. We demonstrate the efficiency of our approach through simulations with up to 16 robots and real robot experiments involving two aerial robots tracking a person, while maintaining an active perception-driven formation. ROS-based source code is provided for the benefit of the community.

ps

Published Version link (url) DOI [BibTex]


Learning to Categorize Bug Reports with LSTM Networks

Gondaliya, K., Peters, J., Rueckert, E.

Proceedings of the 10th International Conference on Advances in System Testing and Validation Lifecycle (VALID), pages: 7-12, October 2018 (conference)

ei

link (url) [BibTex]


Domain Randomization for Simulation-Based Policy Optimization with Transferability Assessment

Muratore, F., Treede, F., Gienger, M., Peters, J.

2nd Annual Conference on Robot Learning (CoRL), 87, pages: 700-713, Proceedings of Machine Learning Research, PMLR, October 2018 (conference)

ei

link (url) Project Page [BibTex]


Reinforcement Learning of Phase Oscillators for Fast Adaptation to Moving Targets

Maeda, G., Koc, O., Morimoto, J.

Proceedings of The 2nd Conference on Robot Learning (CoRL), 87, pages: 630-640, (Editors: Aude Billard, Anca Dragan, Jan Peters, Jun Morimoto), PMLR, October 2018 (conference)

ei

link (url) Project Page [BibTex]


Control of Musculoskeletal Systems using Learned Dynamics Models

Büchler, D., Calandra, R., Schölkopf, B., Peters, J.

IEEE Robotics and Automation Letters, 3(4):3161-3168, IEEE, 2018 (article)

Abstract
Controlling musculoskeletal systems, especially robots actuated by pneumatic artificial muscles, is a challenging task due to nonlinearities, hysteresis effects, massive actuator delay and unobservable dependencies such as temperature. Despite such difficulties, muscular systems offer many beneficial properties to achieve human-comparable performance in uncertain and fast-changing tasks. For example, muscles are backdrivable and provide variable stiffness while offering high forces to reach high accelerations. In addition, the embodied intelligence deriving from the compliance might reduce the control demands for specific tasks. In this paper, we address the problem of how to accurately control musculoskeletal robots. To address this issue, we propose to learn probabilistic forward dynamics models using Gaussian processes and, subsequently, to employ these models for control. However, Gaussian processes dynamics models cannot be set-up for our musculoskeletal robot as for traditional motor-driven robots because of unclear state composition etc. We hence empirically study and discuss in detail how to tune these approaches to complex musculoskeletal robots and their specific challenges. Moreover, we show that our model can be used to accurately control an antagonistic pair of pneumatic artificial muscles for a trajectory tracking task while considering only one-step-ahead predictions of the forward model and incorporating model uncertainty.

ei

RAL18final link (url) DOI Project Page [BibTex]
