Since the release of the Kinect, RGB-D cameras have been used in several consumer devices, including smartphones. In this talk, I will present two challenging uses of this technology. With multiple RGB-D cameras, it is possible to reconstruct a 3D scene and visualize it from any point of view. In the first part of the talk, I will show how such a scene can be streamed and rendered as a point cloud in a compelling way, and how its appearance can be improved with external cinema cameras. In the second part of the talk, I will present my work on how an RGB-D camera can be used to enable real walking in virtual reality by making the user aware of surrounding obstacles. I will present a pipeline that creates an occupancy map from a point cloud on the fly, on a mobile phone used as a virtual reality headset. This occupancy map can then be used to prevent the user from hitting physical obstacles while walking in the virtual scene.
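The core geometric step of such a pipeline can be illustrated in a few lines: points from the depth camera are binned into a horizontal grid, and a cell is marked occupied once enough points fall into it. The sketch below is a minimal illustration of this idea, not the pipeline from the talk; the world-frame convention, cell size, and thresholds are all assumptions.

```python
import numpy as np

def occupancy_map(points, cell_size=0.05, extent=5.0, min_hits=5,
                  floor=0.1, ceiling=2.0):
    """Bin a 3D point cloud into a 2D occupancy grid.

    points: (N, 3) array in a gravity-aligned frame centered on the user
    (x, y horizontal, z up). All parameter defaults are illustrative.
    """
    # Keep points at obstacle height, ignoring floor and ceiling returns.
    pts = points[(points[:, 2] > floor) & (points[:, 2] < ceiling)]

    # Quantize horizontal coordinates into grid cells.
    n = int(2 * extent / cell_size)
    idx = np.floor((pts[:, :2] + extent) / cell_size).astype(int)
    idx = idx[np.all((idx >= 0) & (idx < n), axis=1)]

    # A cell counts as occupied once enough points land in it,
    # which rejects isolated depth-noise returns.
    hits = np.zeros((n, n), dtype=np.int32)
    np.add.at(hits, (idx[:, 0], idx[:, 1]), 1)
    return hits >= min_hits
```

The resulting boolean grid can then be queried along the user's walking direction to warn before a collision with a physical obstacle.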
Organizers: Sergi Pujades
Animals and humans are excellent at conceiving solutions to physical and geometric problems, for instance when using tools, coming up with creative constructions, or eventually inventing novel mechanisms and machines. Cognitive scientists coined the term intuitive physics in this context. It is a shame we do not yet have good computational models of such capabilities. A main stream of current robotics research focuses on training robots for narrow manipulation skills, often using massive data from physics simulators. Complementary to that, we should also try to understand how the basic principles underlying physics can be used directly to enable general-purpose physical reasoning in robots, rather than sampling data from physics simulations. In this talk I will discuss an approach called Logic-Geometric Programming, which builds a bridge between control theory, AI planning, and robot manipulation. It demonstrates strong performance on sequential manipulation problems, but also raises a number of highly interesting fundamental problems, including its probabilistic formulation, reactive execution, and learning.
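At its core, Logic-Geometric Programming couples a discrete (logic-level) search over symbolic action skeletons with a continuous (geometric) trajectory optimization conditioned on each skeleton. The sketch below shows only this two-level structure in schematic form; `expand`, `optimize`, and the bounding scheme are hypothetical placeholders standing in for the actual solver components.

```python
import heapq
import itertools

def lgp_search(start, is_goal, expand, optimize, max_iters=1000):
    """Schematic two-level search: a best-first search over discrete
    action skeletons, with a full nonlinear trajectory optimization
    (`optimize`) run only on skeletons that reach the goal.

    expand(state)      -> iterable of (action, next_state, lower_bound)
    optimize(skeleton) -> (cost, trajectory), cost = inf if infeasible
    """
    counter = itertools.count()  # tie-breaker so states are never compared
    frontier = [(0.0, next(counter), start, [])]
    best_cost, best_traj = float("inf"), None
    for _ in range(max_iters):
        if not frontier:
            break
        bound, _, state, skeleton = heapq.heappop(frontier)
        if bound >= best_cost:
            continue  # pruned: the bound already exceeds the best solution
        if is_goal(state):
            cost, traj = optimize(skeleton)  # expensive full NLP
            if cost < best_cost:
                best_cost, best_traj = cost, traj
            continue
        for action, nxt, lb in expand(state):
            heapq.heappush(frontier,
                           (max(bound, lb), next(counter), nxt, skeleton + [action]))
    return best_cost, best_traj
```

In LGP-style solvers, the lower bounds typically come from relaxed geometric programs over partial skeletons; such pruning is what keeps the combinatorial search over skeletons tractable.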
Robotic systems that adopt magnetically actuated ferromagnetic bodies, or even whole miniature robots, have recently become a fast-advancing technological field, especially at the nano- and microscale. Mesoscale and, above all, multiscale magnetically guided robotic systems remain a frontier of study, where it is difficult to account simultaneously for the different forces, precision requirements, and energy demands involved. The major goal of our talk is to discuss the challenges of magnetically guided mesoscale and multiscale actuation, followed by the results of our research on magnetic positioning systems and magnetic soft-robotic grippers.
Organizers: Metin Sitti
Recognition of pain in horses and other animals is important because pain is a manifestation of disease and decreases animal welfare. Pain diagnostics for humans typically includes self-evaluation and localization of the pain with the help of standardized forms, and rating of the pain by a clinical expert using pain scales. However, animals cannot verbalize their pain as humans can, and the use of standardized pain scales is challenged by the fact that animals such as horses and cattle, being prey animals, display subtle and less obvious pain behavior: it is simply beneficial for a prey animal to appear healthy, in order to lower the interest of predators. We work together with veterinarians to develop methods for automatic video-based recognition of pain in horses. These methods are typically trained on video examples of behavioral traits labeled with pain level and pain characteristics. This automated, user-independent system for recognizing pain behavior in horses will be the first of its kind in the world. A successful system might change how we monitor and care for our animals.
A dominant trend in manufacturing is the move toward small production volumes and high product variability. It is thus anticipated that future manufacturing automation systems will be characterized by a high degree of autonomy and will have to learn new behaviors without explicit programming. Robot Learning and, more generally, Autonomous Manufacturing is an exciting research field at the intersection of Machine Learning and Automation. The combination of "traditional" control techniques with data-driven algorithms holds the promise of allowing robots to learn new behaviors through experience. This talk introduces selected Siemens research projects in the area of Autonomous Manufacturing.
Active motion of biological and artificial microswimmers is relevant in the real world, in microfluidics, and in biological applications, but it also poses fundamental questions in non-equilibrium statistical physics. The mechanisms of single microswimmers, whether designed by nature or in the lab, need to be understood, and detailed modeling of microorganisms helps to explore their complex cell design and behavior. It also motivates biomimetic approaches. The emergent collective motion of microswimmers generates appealing dynamic patterns as a consequence of their non-equilibrium dynamics.
In this talk, I will present an overview of my Ph.D. research towards articulated human pose estimation from unconstrained images and videos. In the first part of the talk, I will present an approach that jointly models multi-person pose estimation and tracking in a single formulation. The approach represents body joint detections in a video by a spatiotemporal graph and solves an integer linear program to partition the graph into sub-graphs that correspond to plausible body pose trajectories for each person. I will also introduce the PoseTrack dataset and benchmark, which is now the de facto standard for multi-person pose estimation and tracking. In the second half of the talk, I will present a new method for 3D pose estimation from a monocular image through a novel 2.5D pose representation. The 2.5D representation can be reliably estimated from an RGB image. Furthermore, it allows the absolute 3D body pose to be reconstructed exactly up to a scaling factor, which in turn can be estimated if a prior on the body size is given. I will also describe a novel CNN architecture that implicitly learns the heatmaps and depth maps for human body keypoints from a single RGB image.
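To make the reconstruction step concrete: if the relative depths are scale-normalized so that a chosen reference bone has unit length, the absolute depth of the root follows in closed form from a quadratic equation, and every joint can then be back-projected. The sketch below illustrates this unit-bone-length idea under assumed conventions (known camera intrinsics, root-relative depths, illustrative bone indices); it is a simplified rendering, not the exact method from the talk.

```python
import numpy as np

def reconstruct_3d(p2d, z_rel, K, bone=(0, 1)):
    """Recover absolute 3D joints (up to the reference bone's true length)
    from a 2.5D representation: 2D pixel coordinates p2d (J, 2) and
    root-relative, scale-normalized depths z_rel (J,), in which the two
    joints of `bone` are unit length apart. K is the 3x3 intrinsic
    matrix. The bone indices are illustrative."""
    # Back-project each pixel to a ray at depth 1.
    homog = np.hstack([p2d, np.ones((p2d.shape[0], 1))])
    rays = (np.linalg.inv(K) @ homog.T).T                  # (J, 3)

    # Joint k sits at (z_root + z_rel[k]) * rays[k]; the unit-length
    # constraint on the reference bone gives a quadratic in z_root.
    a, b = bone
    d = rays[a] - rays[b]
    e = z_rel[a] * rays[a] - z_rel[b] * rays[b]
    A, B, C = d @ d, 2.0 * (d @ e), e @ e - 1.0
    z_root = (-B + np.sqrt(max(B * B - 4 * A * C, 0.0))) / (2 * A)

    # Scale every ray by its absolute depth.
    return (z_root + z_rel)[:, None] * rays
```

Multiplying the result by a body-size prior (e.g., an assumed true length of the reference bone in meters) resolves the remaining scale factor.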
Organizers: Dimitrios Tzionas
Minimally invasive approaches to vascular disease and cancer have revolutionized medicine. I will discuss novel approaches to vascular bleeding, aneurysm treatment and tumor ablation.
Organizers: Metin Sitti
Many fishes swim efficiently over long distances to find food or during migrations. They also have to accelerate rapidly to escape predators. These two behaviors require different body mechanics: for efficient swimming, fish should be very flexible, but for rapid acceleration, they should be stiffer. Here, I will discuss recent experiments that show that they can use their muscles to tune their effective body mechanics. Control strategies inspired by the muscle activity in fishes may help design better soft robotic devices.
Organizers: Ardian Jusufi
Supervised learning with deep convolutional networks is the workhorse of the majority of computer vision research today. While much progress has already been made by exploiting deep architectures with standard components, enormous datasets, and massive computational power, I will argue that it pays to scrutinize some of the components of modern deep networks. I will begin by looking at the common pooling operation and show how we can replace standard pooling layers with a perceptually motivated alternative, with consistent gains in accuracy. Next, I will show how we can leverage self-similarity, a well-known concept from the study of natural images, to derive non-local layers for various vision tasks that boost discriminative power. Finally, I will present a lightweight approach to obtaining predictive probabilities in deep networks, allowing the reliability of a prediction to be judged.
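As background on the self-similarity idea: a non-local layer lets every spatial position aggregate features from all other positions, weighted by pairwise feature similarity, so that repeating structure anywhere in the image can support the prediction at a given location. The sketch below shows a generic embedded-Gaussian non-local operation on flattened features; the shapes and names are assumptions, and the speaker's actual layers may differ from this textbook form.

```python
import numpy as np

def nonlocal_layer(x, w_theta, w_phi, w_g):
    """Generic non-local operation on flattened features x of shape (N, C),
    where N = H * W spatial positions. The (C, C') weight matrices play
    the role of the 1x1 convolutions in a convolutional implementation."""
    theta, phi, g = x @ w_theta, x @ w_phi, x @ w_g

    # Pairwise similarity between all positions, row-normalized (softmax).
    logits = theta @ phi.T                         # (N, N)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)

    # Each output position is a similarity-weighted mix of all positions.
    return attn @ g                                # (N, C')
```

In practice the output is projected back to C channels and added to x as a residual connection, so such a layer can be dropped into an existing network without disturbing its behavior at initialization.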
Organizers: Michael Black
This talk argues for a fine-grained perspective on human-object interactions in video sequences. I will present approaches for understanding ‘what’ objects one interacts with during daily activities, ‘when’ to place the temporal boundaries of interactions, ‘which’ semantic labels one can use to describe such interactions, and ‘who’ performs better when contrasting people carrying out the same interaction. I will detail my group’s latest work on sub-topics related to: (1) assessing action ‘completion’, i.e., when an interaction is attempted but not completed [BMVC 2018], (2) determining skill or expertise from video sequences [CVPR 2018], and (3) finding unequivocal semantic representations for object interactions [ongoing work]. I will also introduce EPIC-KITCHENS 2018, the largest dataset of object interactions in people’s homes released to date, recorded using wearable cameras. The dataset includes 11.5M frames fully annotated with objects and actions, based on unique annotations from the participants narrating their own videos, thus reflecting true intention. Three open challenges are now available on object detection, action recognition, and action anticipation [http://epic-kitchens.github.io].
Organizers: Mohamed Hassan