Since the release of the Kinect, RGB-D cameras have been used in several consumer devices, including smartphones. In this talk, I will present two challenging uses of this technology. With multiple RGB-D cameras, it is possible to reconstruct a 3D scene and visualize it from any point of view. In the first part of the talk, I will show how such a scene can be streamed and rendered as a point cloud in a compelling way, and how its appearance can be improved with external cinema cameras. In the second part of the talk, I will present my work on how an RGB-D camera can enable real walking in virtual reality by making the user aware of surrounding obstacles. I present a pipeline that creates an occupancy map from a point cloud on the fly on a mobile phone used as a virtual reality headset. This occupancy map can then be used to prevent the user from hitting physical obstacles while walking in the virtual scene.
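As a rough illustration of the occupancy-map step, the sketch below bins the obstacle-height points of a point cloud into a 2D grid. The cell size, map extent, and height band are my own assumptions for illustration, not the parameters used in the talk.

```python
import numpy as np

# Illustrative sketch only: bin obstacle-height points of a point cloud
# into a 2D occupancy grid. Cell size, map extent, and height band are
# assumptions, not the talk's actual parameters.
def occupancy_map(points, cell=0.1, extent=5.0, floor=0.1, ceiling=2.0):
    """points: (N, 3) array of x, y, z positions in meters, with y up."""
    # Keep points at obstacle height, dropping floor and ceiling returns.
    mask = (points[:, 1] > floor) & (points[:, 1] < ceiling)
    xz = points[mask][:, [0, 2]]
    n = int(2 * extent / cell)
    grid = np.zeros((n, n), dtype=bool)
    idx = np.floor((xz + extent) / cell).astype(int)
    idx = idx[(idx >= 0).all(axis=1) & (idx < n).all(axis=1)]
    grid[idx[:, 0], idx[:, 1]] = True  # mark cell as occupied
    return grid
```

A headset pipeline could then warn the walking user whenever the grid cells ahead of them are occupied.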
Organizers: Sergi Pujades
Daniel Renjewski presents research in bipedal gait mechanisms: 'Passive mechanisms for increased power and efficiency in bipedal gait'
Incredible biological capabilities have emerged through evolution. Of special note is the material intelligence that defines the bodies of living things, blurring the line between brain and body. Material robotics research takes the approach of imbuing power, control, sensing, and actuation into all aspects of a (primarily soft) robot body. In this talk, the material robotics research currently underway in the mLab at Oregon State University will be presented. Soft active materials designed and researched in the mLab include liquid metal, biodegradable elastomers, and electroactive fluids. Bioinspired mechanisms include octopus-inspired soft muscles, gecko-inspired adhesives, and snake-like locomotors. Such capabilities, however, introduce new fundamental challenges in making materially enabled robots. To address these limitations, the mLab is also innovating techniques to rapidly and scalably manufacture soft materials. Though significant challenges remain to be solved, the development of such soft and materially enabled components promises to bring robots more and more into our daily lives.
Organizers: Metin Sitti
The definition of art has been debated for more than 1000 years, and continues to be a puzzle. While scientific investigations offer hope of resolving this puzzle, machine learning classifiers that discriminate art from non-art images generally do not provide an explicit definition, and brain imaging and psychological theories are at present too coarse to provide a formal characterization. In this work, rather than approaching the problem using a machine learning approach trained on existing artworks, we hypothesize that art can be defined in terms of preexisting properties of the visual cortex. Specifically, we propose that a broad subset of visual art can be defined as patterns that are exciting to a visual brain. Resting on the finding that artificial neural networks trained on visual tasks can provide predictive models of processing in the visual cortex, our definition is operationalized by using a trained deep net as a surrogate “visual brain”, where “exciting” is defined as the activation energy of particular layers of this net. We find that this definition easily discriminates a variety of art from non-art, and further provides a ranking of art genres that is consistent with our subjective notion of ‘visually exciting’. By applying a deep net visualization technique, we can also validate the definition by generating example images that would be classified as art. The images synthesized under our definition resemble visually exciting art such as Op Art and other human-created artistic patterns.
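The "activation energy" idea can be sketched in a few lines. In this toy version a bank of random filters stands in for a layer of the trained deep net used in the work; the filter shapes and the mean-squared-activation score are my assumptions for illustration only.

```python
import numpy as np

# Toy stand-in for the paper's "activation energy" score: random 5x5
# filters substitute for a trained deep net's layer (an assumption made
# for illustration), and "exciting" is the mean squared response.
def activation_energy(img, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    filters = rng.standard_normal((8, 5, 5))  # surrogate "layer"
    h, w = img.shape
    energy = 0.0
    for f in filters:
        # Valid cross-correlation of the image with one filter.
        resp = np.zeros((h - 4, w - 4))
        for i in range(5):
            for j in range(5):
                resp += f[i, j] * img[i:i + h - 4, j:j + w - 4]
        energy += np.mean(resp ** 2)  # squared activation of this map
    return energy / len(filters)
```

Under such a score, a high-contrast pattern (e.g. a checkerboard, reminiscent of Op Art) scores higher than a blank image.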
Organizers: Michael Black
One of the central problems of artificial intelligence is machine perception, i.e., the ability to understand the visual world based on input from sensors such as cameras. In this talk, I will present recent progress on data generation using weak annotations, motion information, and synthetic data. I will also discuss our recent results for action recognition, where human tubes and tubelets have proven successful. Our tubelet approach moves away from state-of-the-art frame-based methods and improves classification and localization by relying on joint information from several frames. I will also show how to extend this type of method to weakly supervised learning of actions, which allows us to scale to large amounts of data with sparse manual annotation. Furthermore, I will discuss several recent extensions, including 3D pose estimation.
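The frame-versus-tubelet distinction can be illustrated with a toy scoring function (the actual detector architecture is not shown here): rather than classifying each frame alone, a tubelet pools evidence from the several frames it links.

```python
import numpy as np

# Toy illustration only: a tubelet links detections across T consecutive
# frames, so its class score pools per-frame evidence instead of relying
# on a single frame. Averaging is an assumed, simple pooling choice.
def tubelet_score(frame_scores):
    """frame_scores: (T, C) per-frame class scores of one linked tubelet."""
    return np.asarray(frame_scores).mean(axis=0)  # joint score over T frames
```

A single ambiguous frame then no longer flips the predicted action class, since its score is averaged with its neighbors.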
Organizers: Ahmed Osman
Actions constitute the way we interact with the world, making motor disabilities such as Parkinson’s disease and stroke devastating. The neurological correlates of the injured brain are challenging to study and correct given the adaptation, redundancy, and distributed nature of our motor system. However, recent studies have used increasingly sophisticated technology to sample from this distributed system, improving our understanding of the neural patterns that support movement in healthy brains or compromise movement in injured brains. One approach to translating these findings into therapies that restore healthy brain patterns is closed-loop brain-machine interfaces (BMIs). While closed-loop BMIs have been discussed primarily as assistive technologies, the underlying techniques may also be useful for rehabilitation.
Organizers: Katherine J. Kuchenbecker
The recent and ongoing expansion of the digital world now allows anyone to access a tremendous amount of information. However, collecting data is not an end in itself, and techniques must therefore be designed to gain in-depth knowledge from these large databases.
Organizers: Mara Cascianelli
Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods for observing and recording animal behavior in diverse settings, yet extracting particular aspects of a behavior for further analysis can be highly time consuming. In motor control studies, humans or other animals are often marked with reflective markers to assist computer-based tracking, yet markers are intrusive (especially for smaller animals), and the number and location of the markers must be determined a priori. Here, we present a highly efficient method for markerless tracking based on transfer learning with deep neural networks that achieves excellent results with minimal training data. We demonstrate the versatility of this framework by tracking various body parts in a broad collection of experimental settings: odor trail-tracking in mice, egg-laying behavior in Drosophila, and mouse hand articulation in a skilled forelimb task. For example, during the skilled reaching behavior, individual joints can be automatically tracked (and a confidence score is reported). Remarkably, even when only a small number of frames are labeled (≈200), the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy.
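One ingredient of such keypoint-tracking pipelines can be sketched concretely: a labeled body-part location is turned into a Gaussian scoremap target for the network, and a predicted location plus confidence is read back out of a predicted scoremap. The map size and sigma below are assumed values, not those of the presented method.

```python
import numpy as np

# Sketch of one common ingredient of heatmap-based keypoint tracking
# (map size and sigma are assumptions, not the method's actual values).
def make_scoremap(x, y, shape=(64, 64), sigma=2.0):
    """Gaussian target centered on the labeled body-part pixel (x, y)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))

def read_out(scoremap):
    """Predicted location = argmax; confidence = peak value in [0, 1]."""
    iy, ix = np.unravel_index(np.argmax(scoremap), scoremap.shape)
    return (ix, iy), float(scoremap[iy, ix])
```

The reported per-joint confidence score in the abstract corresponds to such a peak value: low peaks flag frames where the body part is occluded or ambiguous.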
Organizers: Melanie Feldhofer
Today’s robots have motor abilities and sensors that exceed those of humans in many ways: they move faster and more accurately; their sensors see more and at higher precision; and, in contrast to humans, they can accurately measure even the smallest forces and torques. Robot hands with three, four, or five fingers are commercially available, and so are advanced dexterous arms. Indeed, modern motion-planning methods have rendered grasp trajectory generation a largely solved problem. Still, no robot to date matches the manipulation skills of industrial assembly workers, even though manipulation of mechanical objects remains essential for the industrial assembly of complex products. So why are current robots still so bad at manipulation, and humans so good?
Organizers: Katherine J. Kuchenbecker
Human shape estimation is an important task for the video-editing, animation, and fashion industries. Predicting 3D human body shape from natural images, however, is highly challenging due to factors such as variation in human bodies, clothing, and viewpoint. Prior methods addressing this problem typically attempt to fit parametric body models with certain priors on pose and shape. In this work we argue for an alternative representation and propose BodyNet, a neural network for direct inference of volumetric body shape from a single image. BodyNet is an end-to-end trainable network that benefits from (i) a volumetric 3D loss, (ii) a multi-view re-projection loss, and (iii) intermediate supervision of 2D pose, 2D body part segmentation, and 3D pose. Each of these results in a performance improvement, as demonstrated by our experiments. To evaluate the method, we fit the SMPL model to our network output and show state-of-the-art results on the SURREAL and Unite the People datasets, outperforming recent approaches. Besides achieving state-of-the-art performance, our method also enables volumetric body-part segmentation.
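The interaction between losses (i) and (ii) can be sketched roughly: a cross-entropy loss on predicted voxel occupancies, plus a silhouette loss obtained by projecting the predicted volume onto an image plane. The orthographic max-projection, the loss weights, and the function names below are my simplifying assumptions, not the paper's exact formulation.

```python
import numpy as np

# Rough sketch of BodyNet-style supervision. The orthographic max
# projection and the loss weights are simplifying assumptions, not the
# paper's exact formulation.
def bce(p, t, eps=1e-7):
    """Binary cross-entropy between predicted probabilities p and targets t."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(t * np.log(p) + (1 - t) * np.log(1 - p)))

def bodynet_loss(pred_vox, gt_vox, gt_sil, w_vox=1.0, w_proj=0.1):
    # (ii) re-projection: project the volume to a silhouette along one axis.
    proj = pred_vox.max(axis=2)
    # (i) volumetric 3D loss + (ii) silhouette re-projection loss.
    return w_vox * bce(pred_vox, gt_vox) + w_proj * bce(proj, gt_sil)
```

The re-projection term couples the volume to image evidence, which is what lets multiple camera views sharpen a shape that the voxel loss alone leaves ambiguous.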
For many service robots, reactivity to changes in their surroundings is a must. However, developing software suitable for dynamic environments is difficult. Existing robotic middleware allows engineers to design behavior graphs by organizing communication between components. But because these graphs are structurally inflexible, they hardly support the development of complex reactive behavior. To address this limitation, we propose Playful, a software platform that applies reactive programming to the specification of robotic behavior. The front-end of Playful is a scripting language that is simple (only five keywords), yet results in the coordinated runtime activation and deactivation of an arbitrary number of higher-level sensory-motor couplings. When using Playful, developers describe actions at various levels of abstraction via behavior trees. During runtime, an underlying engine applies a mixture of logical constructs to obtain the desired behavior. These constructs include conditional rules, dynamic prioritization based on resource management, and finite-state machines. Playful has been successfully used to program an upper-torso humanoid manipulator to perform lively interaction with any human approaching it.
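The "dynamic prioritization based on resource management" idea can be illustrated with a toy arbiter in the spirit of Playful (the real scripting language and engine differ; all names and the evaluation scheme here are mine): behaviors declare the resource they need and a state-dependent priority, and at each tick the highest-priority runnable behavior wins each resource.

```python
# Toy arbiter in the spirit of Playful's engine (the actual language and
# engine differ; the names and evaluation scheme here are assumptions).
class Behavior:
    def __init__(self, name, resource, priority_fn):
        # priority_fn maps the sensed state to a priority, or None when the
        # behavior's activation condition is not met.
        self.name, self.resource, self.priority_fn = name, resource, priority_fn

def tick(behaviors, state):
    """One engine cycle: grant each resource to its top-priority claimant."""
    owners = {}
    for b in behaviors:
        p = b.priority_fn(state)
        if p is None:  # condition not met: behavior stays inactive
            continue
        current = owners.get(b.resource)
        if current is None or p > current[1]:
            owners[b.resource] = (b.name, p)
    return {res: name for res, (name, _) in owners.items()}
```

Re-evaluating priorities every cycle is what makes the behavior reactive: a "wave" behavior can seize the arm the moment a person approaches, and release it again when they leave.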