Institute Talks

IS Colloquium

- 25 September 2019 • 13:00-14:00

- Jean-Louis Thonnard

- MPI-IS Stuttgart, Heisenbergstr. 3, Room 2P4

Fingertip skin friction plays a critical role during object manipulation. We will describe a simple and reliable method to estimate the fingertip static coefficient of friction (CF) continuously and quickly during object manipulation, and we will describe a global expression of the CF as a function of the normal force and fingertip moisture. Then we will show how skin hydration modifies the skin deformation dynamics during grip-like contacts. Certain motor behaviours observed during object manipulation could be explained by the effects of skin hydration. Then the biomechanics of the partial slip phenomenon will be described, and we will examine how this partial slip phenomenon is related to the subjective perception of fingertip slip.

Organizers: Katherine J. Kuchenbecker, David Gueorguiev

Talk

- 25 September 2019 • 11:00-12:00

- Peter Blümler

- MPI-IS Stuttgart, Heisenbergstr. 3, room 2P04

A new concept of using permanent magnet systems for guiding superparamagnetic nano-particles (SPP) on arbitrary trajectories over a large volume is presented. The same instrument can also be used for magnetic resonance imaging (MRI) using the inherent contrast of the SPP [1]. The basic idea is to use one magnet system, which provides a strong, homogeneous, dipolar magnetic field to magnetize and orient the particles, and a second constantly graded, quadrupolar field, superimposed on the first, to generate a force on the oriented particles. As a result, particles are guided with constant force and in a single direction over the entire volume. Prototypes of various sizes were constructed to demonstrate the principle in two dimensions on several nanoparticles, which were moved along a rough square by manual adjustment of the force angle [1]. Surprisingly, even SPP with sizes < 100 nm could be moved at speeds exceeding 10 mm/s due to reversible agglomeration, for which a first hydrodynamic model is presented. Furthermore, a more advanced system with two quadrupoles is presented, which allows canceling the force, hence stopping the SPP and moving them around sharp edges. This system also allows for MRI, and some first experiments are presented. Recently this concept was combined with liquid crystalline elastomers with incorporated SPP to create “micro-robots” whose coarse maneuvers are performed by a MagGuider-system while their microscopic actuation is controlled either by light or temperature [2].

[1] O. Baun, PB, JMMM 439 (2017) 294-304. doi: 10.1016/j.jmmm.2017.05.001
[2] D. Ditter, PB et al., Adv. Functional Mater. 1902454 (2019). doi: 10.1002/adfm.201902454
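As a rough illustration of why a homogeneous dipolar field combined with a constantly graded quadrupolar field can produce a constant force, consider an idealized two-dimensional case. The field expressions, the symbols $B_0$, $G$ and $m$, and the assumption that the dipolar field dominates so the particle moment stays aligned with it are simplifications introduced here, not taken from the talk:

```latex
\begin{align}
\mathbf{B}_D &= B_0\,\hat{\mathbf{x}}, \qquad
\mathbf{B}_Q = G\,(x\,\hat{\mathbf{x}} - y\,\hat{\mathbf{y}}), \qquad
\mathbf{m} = m\,\hat{\mathbf{x}} \quad (B_0 \gg G\,|\mathbf{r}|) \\
\mathbf{F} &= \nabla\bigl(\mathbf{m}\cdot(\mathbf{B}_D + \mathbf{B}_Q)\bigr)
            = \nabla\bigl(m\,(B_0 + G\,x)\bigr)
            = m\,G\,\hat{\mathbf{x}}
\end{align}
```

Under these assumptions the force depends neither on $x$ nor on $y$, which matches the statement that particles are guided with constant force and in a single direction over the entire volume.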

Organizers: Metin Sitti

- Mirko Kovac

- 2R04

Future cities and infrastructure systems will evolve into complex conglomerates where autonomous aerial, aquatic and ground-based robots will coexist with people and cooperate in symbiosis. To create this human-robot ecosystem, robots will need to respond more flexibly, robustly and efficiently than they do today. They will need to be designed with the ability to move across terrain boundaries and physically interact with infrastructure elements to perform sensing and intervention tasks. Taking inspiration from nature, aerial robotic systems can integrate multi-functional morphology, new materials, energy-efficient locomotion principles and advanced perception abilities that will allow them to successfully operate and cooperate in complex and dynamic environments. This talk will describe the scientific fundamentals, design principles and technologies for the development of biologically inspired flying robots with adaptive morphology that can perform monitoring and manufacturing tasks for future infrastructure and building systems. Examples will include flying robots with perching capabilities and origami-based landing systems, drones for aerial construction and repair, and combustion-based jet thrusters for aerial-aquatic vehicles.

Organizers: Metin Sitti

Archived Talks

- Alexey Dosovitskiy

- PS Seminar Room (N3.022)

Our world is dynamic and three-dimensional. Understanding the 3D layout of scenes and the motion of objects is crucial for successfully operating in such an environment. I will talk about two lines of recent research in this direction. One is on end-to-end learning of motion and 3D structure: optical flow estimation, binocular and monocular stereo, direct generation of large volumes with convolutional networks. The other is on sensorimotor control in immersive three-dimensional environments, learned from experience or from demonstration.

Organizers: Lars Mescheder, Aseem Behl

- Nadine Rüegg

- PS greenhouse

We transfer a monocular motion stereo 3D reconstruction algorithm from a mobile device (Google Project Tango Tablet) to a rigidly mounted external camera of higher image resolution. Reliable camera synchronization is crucial for the usability of the tablet's IMU data, and thus a time synchronization method was developed. It is based on the joint movement of the cameras. In a second project, we move from outdoor video scenes to aerial images and strive to segment them into polygonal shapes. While most existing approaches address the problem of automated generation of online maps as a pixel-wise segmentation task, we instead frame this problem as constructing polygons representing objects. An approach based on Faster R-CNN, a successful object detection algorithm, is presented.

Organizers: Siyu Tang

- Partha Ghosh

- Aquarium

We propose a new architecture for the learning of predictive spatio-temporal motion models from data alone. Our approach, dubbed the Dropout Autoencoder LSTM, is capable of synthesizing natural-looking motion sequences over long time horizons without catastrophic drift or motion degradation. The model consists of two components: a 3-layer recurrent neural network to model temporal aspects, and a novel auto-encoder that is trained to implicitly recover the spatial structure of the human skeleton by randomly removing information about joints during training. This Dropout Autoencoder (D-AE) is then used to filter each predicted pose of the LSTM, reducing accumulation of error and hence drift over time. Furthermore, we propose new evaluation protocols to assess the quality of synthetic motion sequences even when no ground-truth data exists. The proposed protocols can be used to assess generated sequences of arbitrary length. Finally, we evaluate our proposed method on two of the largest motion-capture datasets available to date and show that our model outperforms the state of the art on a variety of actions, including cyclic and acyclic motion, and that it can produce natural-looking sequences over longer time horizons than previous methods.
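The idea of randomly removing joint information during training can be sketched as follows. This is only an illustration under assumptions: the pose layout (a list of 3D joint positions), the choice to zero out dropped joints, the function name `drop_joints`, and the drop probability are all hypothetical and not taken from the abstract.

```python
import random


def drop_joints(pose, drop_prob, rng):
    """Return a copy of `pose` with randomly chosen joints zeroed out.

    pose: list of (x, y, z) tuples, one per skeleton joint (assumed layout).
    drop_prob: probability of removing each joint independently (assumed).
    rng: a random.Random instance, passed in for reproducibility.
    """
    corrupted = []
    for joint in pose:
        if rng.random() < drop_prob:
            corrupted.append((0.0, 0.0, 0.0))  # joint information removed
        else:
            corrupted.append(joint)
    return corrupted


rng = random.Random(0)
pose = [(0.1 * j, 0.2 * j, 0.3 * j) for j in range(1, 6)]  # toy 5-joint pose
noisy_pose = drop_joints(pose, drop_prob=0.5, rng=rng)
# A denoising auto-encoder would take noisy_pose as input and pose as target.
```

Trained on such pairs, an auto-encoder has to infer missing joints from the remaining ones, which is one plausible way to make it learn the spatial structure of the skeleton.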

Organizers: Gerard Pons-Moll

- Endri Dibra

- Aquarium

Estimating 3D shape from monocular 2D images is a challenging and ill-posed problem. Some of these challenges can be alleviated if 3D shape priors are taken into account. In the field of human body shape estimation, research has shown that accurate 3D body estimations can be achieved through optimization, by minimizing error functions on image cues such as the silhouette. These methods, though, tend to be slow and typically require manual interaction (e.g. for pose estimation). In this talk, we present some recent works that try to overcome such limitations, achieving interactive rates by learning mappings from 2D image to 3D shape spaces, utilizing data-driven priors generated from statistically learned parametric shape models. We demonstrate this either by extracting handcrafted features or by directly utilizing CNNs. Furthermore, we introduce the notion and application of cross-modal or multi-view learning, where an abundance of data coming from various views representing the same object at training time can be leveraged in a semi-supervised setting to boost estimations at test time. Additionally, we show similar applications of the above techniques for the task of 3D garment estimation from a single image.

Organizers: Gerard Pons-Moll

- Sven Dickinson

- Green-House (PS)

Human observers can classify photographs of real-world scenes after only a very brief exposure to the image (Potter & Levy, 1969; Thorpe, Fize, Marlot, et al., 1996; VanRullen & Thorpe, 2001). Line drawings of natural scenes have been shown to capture essential structural information required for successful scene categorization (Walther et al., 2011). Here, we investigate how the spatial relationships between lines and line segments in the line drawings affect scene classification. In one experiment, we tested the effect of removing either the junctions or the middle segments between junctions. Surprisingly, participants performed better when shown the middle segments (47.5%) than when shown the junctions (42.2%). It appeared as if the images with middle segments tended to maintain the most parallel/locally symmetric portions of the contours. In order to test this hypothesis, in a second experiment, we either removed the most symmetric half of the contour pixels or the least symmetric half of the contour pixels using a novel method of measuring the local symmetry of each contour pixel in the image. Participants were much better at categorizing images containing the most symmetric contour pixels (49.7%) than the least symmetric (38.2%). Thus, results from both experiments demonstrate that local contour symmetry is a crucial organizing principle in complex real-world scenes. Joint work with John Wilder (UofT CS, Psych), Morteza Rezanejad (McGill CS), Kaleem Siddiqi (McGill CS), Allan Jepson (UofT CS), and Dirk Bernhardt-Walther (UofT Psych), to be presented at VSS 2017.

Organizers: Ahmed Osman

IS Colloquium

- 29 May 2017 • 12:00-13:00

- Sebastian Nowozin

- Max Planck House Lecture Hall

Probabilistic deep learning methods have recently made great progress for generative and discriminative modeling. I will give a brief overview of recent developments and then present two contributions. The first is on a generalization of generative adversarial networks (GAN), extending their use considerably. GANs can be shown to approximately minimize the Jensen-Shannon divergence between two distributions, the true sampling distribution and the model distribution. We extend GANs to the class of f-divergences, which includes popular divergences such as the Kullback-Leibler divergence. This enables applications to variational inference and likelihood-free maximum likelihood, and allows GAN models to become basic building blocks in larger models. The second contribution is to consider representation learning using variational autoencoder models. To make learned representations of data useful, we need to ground them in semantic concepts. We propose a generative model that can decompose an observation into multiple separate latent factors, each of which represents a separate concept. Such a disentangled representation is useful for recognition and for precise control in generative modeling. We learn our representations using weak supervision in the form of groups of observations where all samples within a group share the same value in a given latent factor. To make such learning feasible, we generalize recent methods for amortized probabilistic inference to the dependent case. Joint work with: Ryota Tomioka (MSR Cambridge), Botond Cseke (MSR Cambridge), Diane Bouchacourt (Oxford)
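To make the f-divergence family mentioned above concrete, here is a minimal sketch for discrete distributions, using the standard definition D_f(P || Q) = sum_x q(x) f(p(x)/q(x)) with f convex and f(1) = 0. The function names and the toy distributions are illustrative assumptions; the sketch only evaluates divergences and does not implement GAN training.

```python
import math


def f_divergence(p, q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)); assumes q(x) > 0 everywhere."""
    return sum(qx * f(px / qx) for px, qx in zip(p, q))


def f_kl(u):
    # Generator of the Kullback-Leibler divergence: f(u) = u log u.
    return u * math.log(u) if u > 0 else 0.0


def f_js(u):
    # Generator of the Jensen-Shannon divergence
    # JS(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
    return 0.5 * (f_kl(u) - (u + 1) * math.log((u + 1) / 2))


p = [0.5, 0.5]  # toy "true" distribution
q = [0.9, 0.1]  # toy "model" distribution
kl = f_divergence(p, q, f_kl)
js = f_divergence(p, q, f_js)
```

Swapping the generator `f` changes the divergence while the surrounding code stays the same, which is the sense in which a single framework can cover both the Jensen-Shannon and the Kullback-Leibler case.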

Organizers: Lars Mescheder

- Yael Moses

- Greenhouse (PS)

Dynamic events such as family gatherings, concerts or sports events are often photographed by a group of people. The set of still images obtained this way is rich in dynamic content. We consider the question of whether such a set of still images, rather than a traditional video sequence, can be used for analyzing the dynamic content of the scene. This talk will describe several instances of this problem, their solutions and directions for future studies. In particular, we will present a method to extend epipolar geometry to predict the location of a moving feature in CrowdCam images. The method assumes that the temporal order of the set of images, namely photo-sequencing, is given. We will briefly describe our method to compute photo-sequencing using geometric considerations and rank aggregation. We will also present a method for identifying the moving regions in a scene, which is a basic component in dynamic scene analysis. Finally, we will consider a new vision of developing collaborative CrowdCam, and a first step toward this goal.

Organizers: Jonas Wulff

- Guido Montúfar

- N4.022 (EI Dept. meeting room / 4th floor, north building)

Deep Learning is one of the most successful machine learning approaches to artificial intelligence. In this talk I discuss the geometry of neural networks as a way to study the success of Deep Learning at a mathematical level and to develop a theoretical basis for making further advances, especially in situations with limited amounts of data and challenging problems in reinforcement learning. I present a few recent results on the representational power of neural networks and then demonstrate how to align this with structures from perception-action problems in order to obtain more efficient learning systems.

Organizers: Jane Walters

- Dino Sejdinovic

Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD), the resulting distance between distributions, are useful tools for fully nonparametric hypothesis testing and for learning on distributional inputs. I will give an overview of this framework and present some of its recent applications within the context of approximate Bayesian inference. Further, I will discuss a recent modification of MMD which aims to encode invariance to additive symmetric noise and leads to learning on distributions robust to the distributional covariate shift, e.g. where measurement noise on the training data differs from that on the testing data.
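As a small illustration of the MMD as a distance between distributions, the sketch below computes the biased (V-statistic) estimate of the squared MMD between two one-dimensional samples, MMD^2 = E k(x, x') + E k(y, y') - 2 E k(x, y). The Gaussian kernel, its bandwidth, and the toy samples are assumptions chosen for the example, not details from the talk.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on scalars; the bandwidth is an assumed default."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased estimate of the squared Maximum Mean Discrepancy."""
    return (mean_kernel(xs, xs, kernel)
            + mean_kernel(ys, ys, kernel)
            - 2.0 * mean_kernel(xs, ys, kernel))


sample_a = [0.0, 0.5, 1.0, 1.5, 2.0]
sample_b = [x + 3.0 for x in sample_a]  # same shape, shifted by 3
same = mmd_squared(sample_a, sample_a)
shifted = mmd_squared(sample_a, sample_b)
```

The estimate is zero when both samples coincide and grows as the samples move apart, which is the behavior a two-sample test statistic needs.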

Organizers: Philipp Hennig