Institute Talks

- Borja de Balle Pigem

- MPI IS lecture hall (N0.002)

The Gaussian mechanism is an essential building block used in multitude of differentially private data analysis algorithms. In this talk I will revisit the classical analysis of the Gaussian mechanism and show it has several important limitations. For example, our analysis reveals that the variance formula for the original mechanism is far from tight in the high privacy regime and that it cannot be extended to the low privacy regime. We address these limitations by developing a new Gaussian mechanism whose variance is optimally calibrated by solving an equation involving the Gaussian cumulative density function. Our analysis side-steps the use of tail bounds approximations and relies on a novel characterisation of differential privacy that might be of independent interest. We numerically show that analytical calibration removes at least a third of the variance of the noise compared to the classical Gaussian mechanism. We also propose to equip the Gaussian mechanism with a post-processing step based on adaptive denoising estimators by leveraging that the variance of the perturbation is known. Experiments with synthetic and real data show that this denoising step yields dramatic accuracy improvements in the high-dimensional regime. Based on joint work with Y.-X. Wang to appear at ICML 2018. Pre-print: https://arxiv.org/abs/1805.06530

Organizers: Michel Besserve Isabel Valera

- Dr. Sebastian Trimpe

- MPI-IS, Stuttgart, Lecture Hall 2 D5

Modern technology allows us to collect, process, and share more data than ever before. This data revolution opens up new ways to design control and learning algorithms, which will form the algorithmic foundation for future intelligent systems that shall act autonomously in the physical world. Starting from a discussion of the special challenges when combining machine learning and control, I will present some of our recent research in this exciting area. Using the example of the Apollo robot learning to balance a stick in its hand, I will explain how intelligent agents can learn new behavior from just a few experimental trails. I will also discuss the need for theoretical guarantees in learning-based control, and how we can obtain them by combining learning and control theory.

Organizers: Katherine Kuchenbecker Ildikó Papp-Wiedmann Matthias Tröndle Claudia Daefler

Talk

- 13 July 2018 • 14:45 15:15

- Dr. Martin Hägele

- MPI-IS, Stuttgart, Lecture Hall 2 D5

In 1995 Fraunhofer IPA embarked on a mission towards designing a personal robot assistant for everyday tasks. In the following years Care-O-bot developed into a long-term experiment for exploring and demonstrating new robot technologies and future product visions. The recent fourth generation of the Care-O-bot, introduced in 2014 aimed at designing an integrated system which addressed a number of innovations such as modularity, “low-cost” by making use of new manufacturing processes, and advanced human-user interaction. Some 15 systems were built and the intellectual property (IP) generated by over 20 years of research was recently licensed to a start-up. The presentation will review the path from an experimental platform for building up expertise in various robotic disciplines to recent pilot applications based on the now commercial Care-O-bot hardware.

Organizers: Katherine Kuchenbecker Ildikó Papp-Wiedmann Matthias Tröndle Claudia Daefler

- Dr. Wenqi Hu

- MPI-IS, Stuttgart, Lecture Hall 2 D5

Organizers: Katherine Kuchenbecker Ildikó Papp-Wiedmann Matthias Tröndle Claudia Daefler

Talk

- 13 July 2018 • 15:45 16:15

- Prof. Dr. Dawn Bonnell

- MPI-IS, Stuttgart, Lecture Hall 2 D5

With the ubiquity of catalyzed reactions in manufacturing, the emergence of the device laden internet of things, and global challenges with respect to water and energy, it has never been more important to understand atomic interactions in the functional materials that can provide solutions in these spaces.

Organizers: Katherine Kuchenbecker Ildikó Papp-Wiedmann Matthias Tröndle Claudia Daefler

- Prof. Dr. Thomas Ertl

- MPI-IS, Stuttgart, Lecture Hall 2 D5

Big Data has become the general term relating to the benefits and threats which result from the huge amount of data collected in all parts of society. While data acquisition, storage and access are relevant technical aspects, the analysis of the collected data turns out to be at the core of the Big Data challenge. Automatic data mining and information retrieval techniques have made much progress but many application scenarios remain in which the human in the loop plays an essential role. Consequently, interactive visualization techniques have become a key discipline of Big Data analysis and the field is reaching out to many new application domains. This talk will give examples from current visualization research projects at the University of Stuttgart demonstrating the thematic breadth of application scenarios and the technical depth of the employed methods. We will cover advances in scientific visualization of fields and particles, visual analytics of document collections and movement patterns as well as cognitive aspects.

Organizers: Katherine Kuchenbecker Ildikó Papp-Wiedmann Matthias Tröndle Claudia Daefler

- Jim Mainprice

- N3.022 (Aquarium)

Humans act upon their environment through motion, the ability to plan their movements is therefore an essential component of their autonomy. In recent decades, motion planning has been widely studied in robotics and computer graphics. Nevertheless robots still fail to achieve human reactivity and coordination. The need for more efficient motion planning algorithms has been present through out my own research on "human-aware" motion planning, which aims to take the surroundings humans explicitly into account. I believe imitation learning is the key to this particular problem as it allows to learn both, new motion skills and predictive models, two capabilities that are at the heart of "human-aware" robots while simultaneously holding the promise of faster and more reactive motion generation. In this talk I will present my work in this direction.

Archived Talks

- Nadine Rüegg

- PS greenhouse

We transfer a monocular motion stereo 3D reconstruction algorithm from a mobile device (Google Project Tango Tablet) to a rigidly mounted external camera of higher image resolution. A reliable camera synchronization is crucial for the usability of the tablets IMU data and thus a time synchronization method developed. It is based on the joint movement of the cameras. In a second project, we move from outdoor video scenes to aerial images and strive to segment them into polygonal shapes. While most existing approaches address the problem of automated generation of online maps as a pixel-wise segmentation task, we instead frame this problem as constructing polygons representing objects. An approach based on Faster R-CNN, a successful object detection algorithm, is presented.

Organizers: Siyu Tang

- Partha Ghosh

- Aquarium

We propose a new architecture for the learning of predictive spatio-temporal motion models from data alone. Our approach, dubbed the Dropout Autoencoder LSTM, is capable of synthesizing natural looking motion sequences over long time horizons without catastrophic drift or mo- tion degradation. The model consists of two components, a 3-layer recurrent neural network to model temporal aspects and a novel auto-encoder that is trained to implicitly recover the spatial structure of the human skeleton via randomly removing information about joints during train- ing time. This Dropout Autoencoder (D-AE) is then used to filter each predicted pose of the LSTM, reducing accumulation of error and hence drift over time. Furthermore, we propose new evaluation protocols to assess the quality of synthetic motion sequences even for which no groundtruth data exists. The proposed protocols can be used to assess generated sequences of arbitrary length. Finally, we evaluate our proposed method on two of the largest motion- capture datasets available to date and show that our model outperforms the state-of-the-art on a variety of actions, including cyclic and acyclic motion, and that it can produce natural looking sequences over longer time horizons than previous methods.

Organizers: Gerard Pons-Moll

- Endri Dibra

- Aquarium

Estimating 3D shape from monocular 2D images is a challenging and ill-posed problem. Some of these challenges can be alleviated if 3D shape priors are taken into account. In the field of human body shape estimation, research has shown that accurate 3D body estimations can be achieved through optimization, by minimizing error functions on image cues, such as e.g. the silhouette. These methods though, tend to be slow and typically require manual interactions (e.g. for pose estimation). In this talk, we present some recent works that try to overcome such limitations, achieving interactive rates, by learning mappings from 2D image to 3D shape spaces, utilizing data-driven priors, generated from statistically learned parametric shape models. We demonstrate this, either by extracting handcrafted features or directly utilizing CNN-s. Furthermore, we introduce the notion and application of cross-modal or multi-view learning, where abundance of data coming from various views representing the same object at training time, can be leveraged in a semi-supervised setting to boost estimations at test time. Additionally, we show similar applications of the above techniques for the task of 3D garment estimation from a single image.

Organizers: Gerard Pons-Moll

- Endri Dibra

- Aquarium

Estimating 3D shape from monocular 2D images is a challenging and ill-posed problem. Some of these challenges can be alleviated if 3D shape priors are taken into account. In the field of human body shape estimation, research has shown that accurate 3D body estimations can be achieved through optimization, by minimizing error functions on image cues, such as e.g. the silhouette. These methods though, tend to be slow and typically require manual interactions (e.g. for pose estimation). In this talk, we present some recent works that try to overcome such limitations, achieving interactive rates, by learning mappings from 2D image to 3D shape spaces, utilizing data-driven priors, generated from statistically learned parametric shape models. We demonstrate this, either by extracting handcrafted features or directly utilizing CNN-s. Furthermore, we introduce the notion and application of cross-modal or multi-view learning, where abundance of data coming from various views representing the same object at training time, can be leveraged in a semi-supervised setting to boost estimations at test time. Additionally, we show similar applications of the above techniques for the task of 3D garment estimation from a single image.

Organizers: Gerard Pons-Moll

- Sven Dickinson

- Green-House (PS)

Human observers can classify photographs of real-world scenes after only a very brief exposure to the image (Potter & Levy, 1969; Thorpe, Fize, Marlot, et al., 1996; VanRullen & Thorpe, 2001). Line drawings of natural scenes have been shown to capture essential structural information required for successful scene categorization (Walther et al., 2011). Here, we investigate how the spatial relationships between lines and line segments in the line drawings affect scene classification. In one experiment, we tested the effect of removing either the junctions or the middle segments between junctions. Surprisingly, participants performed better when shown the middle segments (47.5%) than when shown the junctions (42.2%). It appeared as if the images with middle segments tended to maintain the most parallel/locally symmetric portions of the contours. In order to test this hypothesis, in a second experiment, we either removed the most symmetric half of the contour pixels or the least symmetric half of the contour pixels using a novel method of measuring the local symmetry of each contour pixel in the image. Participants were much better at categorizing images containing the most symmetric contour pixels (49.7%) than the least symmetric (38.2%). Thus, results from both experiments demonstrate that local contour symmetry is a crucial organizing principle in complex real-world scenes. Joint work with John Wilder (UofT CS, Psych), Morteza Rezanejad (McGill CS), Kaleem Siddiqi (McGill CS), Allan Jepson (UofT CS), and Dirk Bernhardt-Walther (UofT Psych), to be presented at VSS 2017.

Organizers: Ahmed Osman

IS Colloquium

- 29 May 2017 • 12:00 13:00

- Sebastian Nowozin

- Max Planck House Lecture Hall

Probabilistic deep learning methods have recently made great progress for generative and discriminative modeling. I will give a brief overview of recent developments and then present two contributions. The first is on a generalization of generative adversarial networks (GAN), extending their use considerably. GANs can be shown to approximately minimize the Jensen-Shannon divergence between two distributions, the true sampling distribution and the model distribution. We extend GANs to the class of f-divergences which include popular divergences such as the Kullback-Leibler divergence. This enables applications to variational inference and likelihood-free maximum likelihood, as well as enables GAN models to become basic building blocks in larger models. The second contribution is to consider representation learning using variational autoencoder models. To make learned representations of data useful we need ground them in semantic concepts. We propose a generative model that can decompose an observation into multiple separate latent factors, each of which represents a separate concept. Such disentangled representation is useful for recognition and for precise control in generative modeling. We learn our representations using weak supervision in the form of groups of observations where all samples within a group share the same value in a given latent factor. To make such learning feasible we generalize recent methods for amortized probabilistic inference to the dependent case. Joint work with: Ryota Tomioka (MSR Cambridge), Botond Cseke (MSR Cambridge), Diane Bouchacourt (Oxford)

Organizers: Lars Mescheder

- Yael Moses

- Greenhouse (PS)

Dynamic events such as family gatherings, concerts or sports events are often photographed by a group of people. The set of still images obtained this way is rich in dynamic content. We consider the question of whether such a set of still images, rather the traditional video sequences, can be used for analyzing the dynamic content of the scene. This talk will describe several instances of this problem, their solutions and directions for future studies. In particular, we will present a method to extend epipolar geometry to predict location of a moving feature in CrowdCam images. The method assumes that the temporal order of the set of images, namely photo-sequencing, is given. We will briefly describe our method to compute photo-sequencing using geometric considerations and rank aggregation. We will also present a method for identifying the moving regions in a scene, which is a basic component in dynamic scene analysis. Finally, we will consider a new vision of developing collaborative CrowdCam, and a first step toward this goal.

Organizers: Jonas Wulff

- Guido Montúfar

- N4.022 (EI Dept. meeting room / 4th floor, north building)

Deep Learning is one of the most successful machine learning approaches to artificial intelligence. In this talk I discuss the geometry of neural networks as a way to study the success of Deep Learning at a mathematical level and to develop a theoretical basis for making further advances, especially in situations with limited amounts of data and challenging problems in reinforcement learning. I present a few recent results on the representational power of neural networks and then demonstrate how to align this with structures from perception-action problems in order to obtain more efficient learning systems.

Organizers: Jane Walters

- Dino Sejdinovic

Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD), the resulting distance between distributions, are useful tools for fully nonparametric hypothesis testing and for learning on distributional inputs. I will give an overview of this framework and present some of its recent applications within the context of approximate Bayesian inference. Further, I will discuss a recent modification of MMD which aims to encode invariance to additive symmetric noise and leads to learning on distributions robust to the distributional covariate shift, e.g. where measurement noise on the training data differs from that on the testing data.

Organizers: Philipp Hennig

- Cordelia Schmid

- Greenhouse (PS)

This talk addresses the task of segmenting moving objects in unconstrained videos. We introduce a novel two-stream neural network with an explicit memory module to achieve this. The two streams of the network encode spatial and temporal features in a video sequence respectively, while the memory module captures the evolution of objects over time. The module to build a “visual memory” in video, i.e., a joint representation of all the video frames, is realized with a convolutional recurrent unit learned from a small number of training video sequences. Given video frames as input, our approach first assigns each pixel an object or background label obtained with an encoder-decoder network that takes as input optical flow and is trained on synthetic data. Next, a “visual memory” specific to the video is acquired automatically without any manually-annotated frames. The visual memory is implemented with convolutional gated recurrent units, which allows to propagate spatial information over time. We evaluate our method extensively on two benchmarks, DAVIS and Freiburg-Berkeley motion segmentation datasets, and show state-of-the-art results. This is joint work with K. Alahari and P. Tokmakov.

Organizers: Osman Ulusoy