Conversational agents in the form of virtual agents or social robots are rapidly becoming wide-spread. Humans use non-verbal behaviors to signal their intent, emotions and attitudes in human-human interactions. Conversational agents therefore need this ability as well in order to make an interaction pleasant and efficient. An important part of non-verbal communication is gesticulation: gestures communicate a large share of non-verbal content. Previous systems for gesture production were typically rule-based and could not represent the range of human gestures. Recently the gesture generation field has shifted to data-driven approaches. We follow this line of research by extending the state-of-the-art deep-learning based model. Our model leverages representation learning to enhance speech-gesture mapping. We provide analysis of different representations for the input (speech) and the output (motion) of the network by both objective and subjective evaluations. We also analyze the importance of smoothing of the produced motion and emphasize how challenging it is to evaluate gesture quality. In the future we plan to enrich input signal by taking semantic context (text transcription) as well, make the model probabilistic and evaluate our system on the social robot NAO.
Fingertip skin friction plays a critical role during object manipulation. We will describe a simple and reliable method to estimate the fingertip static coefficient of friction (CF) continuously and quickly during object manipulation, and we will describe a global expression of the CF as a function of the normal force and fingertip moisture. Then we will show how skin hydration modifies the skin deformation dynamics during grip-like contacts. Certain motor behaviours observed during object manipulation could be explained by the effects of skin hydration. Then the biomechanics of the partial slip phenomenon will be described, and we will examine how this partial slip phenomenon is related to the subjective perception of fingertip slip.
Future cities and infrastructure systems will evolve into complex conglomerates where autonomous aerial, aquatic and ground-based robots will coexist with people and cooperate in symbiosis. To create this human-robot ecosystem, robots will need to respond more flexibly, robustly and efficiently than they do today. They will need to be designed with the ability to move across terrain boundaries and physically interact with infrastructure elements to perform sensing and intervention tasks. Taking inspiration from nature, aerial robotic systems can integrate multi-functional morphology, new materials, energy-efficient locomotion principles and advanced perception abilities that will allow them to successfully operate and cooperate in complex and dynamic environments. This talk will describe the scientific fundamentals, design principles and technologies for the development of biologically inspired flying robots with adaptive morphology that can perform monitoring and manufacturing tasks for future infrastructure and building systems. Examples will include flying robots with perching capabilities and origami-based landing systems, drones for aerial construction and repair, and combustion-based jet thrusters for aerial-aquatic vehicles.
Organizers: Metin Sitti
Visual Question Answering is one of the applications of Deep Learning that is pushing towards real Artificial Intelligence. It turns the typical deep learning process around by only defining the task to be carried out after the training has taken place, which changes the task fundamentally. We have developed a range of strategies for incorporating other information sources into deep learning-based methods, and the process taken a step towards developing algorithms which learn how to use other algorithms to solve a problem, rather than solving it directly. This talk thus covers some of the high-level questions about the types of challenges Deep Learning can be applied to, and how we might separate the things its good at from those that it’s not.
Organizers: Siyu Tang
Enabling robots for interaction with humans and unknown environments has been one of the primary goals of robotics research over decades. I will outline how human-centered robot design, nonlinear soft-robotics control inspired by human neuromechanics and physics grounded learning algorithms will let robots become a commodity in our near-future society. In particular, compliant and energy-controlled ultra-lightweight systems capable of complex collision handling enable high-performance human assistance over a wide variety of application domains. Together with novel methods for dynamics and skill learning, flexible and easy-to-use robotic power tools and systems can be designed. Recently, our work has led to the first next generation robot Franka Emika that has recently become commercially available. The system is able to safely interact with humans, execute and even learn sensitive manipulation skills, is affordable and designed as a distributed interconnected system.
Organizers: Eva Laemmerhirt
In this talk I introduce the neural statistician as an approach for meta learning. The neural statistician learns to appropriately summarise datasets through a learnt statistic vector. This can be used for few shot learning, by computing the statistic vectors for the presented data, and using these statistics as context variables for one-shot classification and generation. I will show how we can generalise the neural statistician to a context aware learner that learns to characterise and combine independently learnt contexts. I will also demonstrate an approach for meta-learning data augmentation strategies. Acknowledgments: This work is joint work with Harri Edwards, Antreas Antoniou, and Conor Durkan.
Organizers: Philipp Hennig
The field of transportation is undergoing a seismic change with the coming introduction of autonomous driving. The technologies required to enable computer driven cars involves the latest cutting edge artificial intelligence algorithms along three major thrusts: Sensing, Planning and Mapping. Prof. Amnon Shashua, Co-founder and Chairman of Mobileye, will describe the challenges and the kind of machine learning algorithms involved, but will do that through the perspective of Mobileye’s activity in this domain.
Organizers: Michael Black
The fundamental building block in many learning models is the distance measure that is used. Usually, the linear distance is used for simplicity. Replacing this stiff distance measure with a flexible one could potentially give a better representation of the actual distance between two points. I will present how the normal distribution changes if the distance measure respects the underlying structure of the data. In particular, a Riemannian manifold will be learned based on observations. The geodesic curve can then be computed—a length-minimizing curve under the Riemannian measure. With this flexible distance measure we get a normal distribution that locally adapts to the data. A maximum likelihood estimation scheme is provided for inference of the parameters mean and covariance, and also, a systematic way to choose the parameter defining the Riemannian manifold. Results on synthetic and real world data demonstrate the efficiency of the proposed model to fit non-trivial probability distributions.
Organizers: Philipp Hennig
In this talk I will first outline my different research projects. I will then focus on the EACare project, a quite newly started multi-disciplinary collaboration with the aim to develop an embodied system, capable of carrying out neuropsychological tests to detect early signs of dementia, e.g., due to Alzheimer's disease. The system will use methods from Machine Learning and Social Robotics, and be trained with examples of recorded clinician-patient interactions. The interaction will be developed using a participatory design approach. I describe the scope and method of the project, and report on a first Wizard of Oz prototype.
Creating convincing human facial animation is challenging. Face animation is often hand-crafted by artists separately from body motion. Alternatively, if the face animation is derived from motion capture, it is typically performed while the actor is relatively still. Recombining the isolated face animation with body motion is non-trivial and often results in uncanny results if the body dynamics are not properly reflected on the face (e.g. cheeks wiggling when running). In this talk, I will discuss the challenges of human soft tissue simulation and control. I will then present our method for adding physical effects to facial blendshape animation. Unlike previous methods that try to add physics to face rigs, our method can combine facial animation and rigid body motion consistently while preserving the original animation as closely as possible. Our novel simulation framework uses the original animation as per-frame rest-poses without adding spurious forces. We also propose the concept of blendmaterials to give artists an intuitive means to control the changing material properties due to muscle activation.
Organizers: Timo Bolkart
Performance metrics are a key component of machine learning systems, and are ideally constructed to reflect real world tradeoffs. In contrast, much of the literature simply focuses on algorithms for maximizing accuracy. With the increasing integration of machine learning into real systems, it is clear that accuracy is an insufficient measure of performance for many problems of interest. Unfortunately, unlike accuracy, many real world performance metrics are non-decomposable i.e. cannot be computed as a sum of losses for each instance. Thus, known algorithms and associated analysis are not trivially extended, and direct approaches require expensive combinatorial optimization. I will outline recent results characterizing population optimal classifiers for large families of binary and multilabel classification metrics, including such nonlinear metrics as F-measure and Jaccard measure. Perhaps surprisingly, the prediction which maximizes the utility for a range of such metrics takes a simple form. This results in simple and scalable procedures for optimizing complex metrics in practice. I will also outline how the same analysis gives optimal procedures for selecting point estimates from complex posterior distributions for structured objects such as graphs. Joint work with Nagarajan Natarajan, Bowei Yan, Kai Zhong, Pradeep Ravikumar and Inderjit Dhillon.
Organizers: Mijung Park
Writing and maintaining programs for robots poses some interesting challenges. It is hard to generalize them, as their targets are more than computing platforms. It can be deceptive to see them as input to output mappings, as interesting environments result in unpredictable inputs, and mixing reactive and deliberative behavior make intended outputs hard to define. Given the wide and fragmented landscape of components, from hardware to software, and the parties involved in providing and using them, integration is also a non-trivial aspect. The talk will illustrate the work ongoing at Fraunhofer IPA to tackle these challenges, how Open Source is its common trait, and how this translates into the industrial field thanks to the ROS-Industrial initiative.
Organizers: Vincent Berenz