Header logo is


2019


Learning to Explore in Motion and Interaction Tasks
Learning to Explore in Motion and Interaction Tasks

Bogdanovic, M., Righetti, L.

Proceedings 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages: 2686-2692, IEEE, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), November 2019, ISSN: 2153-0866 (conference)

Abstract
Model free reinforcement learning suffers from the high sampling complexity inherent to robotic manipulation or locomotion tasks. Most successful approaches typically use random sampling strategies which leads to slow policy convergence. In this paper we present a novel approach for efficient exploration that leverages previously learned tasks. We exploit the fact that the same system is used across many tasks and build a generative model for exploration based on data from previously solved tasks to improve learning new tasks. The approach also enables continuous learning of improved exploration strategies as novel tasks are learned. Extensive simulations on a robot manipulator performing a variety of motion and contact interaction tasks demonstrate the capabilities of the approach. In particular, our experiments suggest that the exploration strategy can more than double learning speed, especially when rewards are sparse. Moreover, the algorithm is robust to task variations and parameter tuning, making it beneficial for complex robotic problems.

mg

DOI [BibTex]

2019


DOI [BibTex]


Attacking Optical Flow
Attacking Optical Flow

Ranjan, A., Janai, J., Geiger, A., Black, M. J.

In Proceedings International Conference on Computer Vision (ICCV), pages: 2404-2413, IEEE, 2019 IEEE/CVF International Conference on Computer Vision (ICCV), November 2019, ISSN: 2380-7504 (inproceedings)

Abstract
Deep neural nets achieve state-of-the-art performance on the problem of optical flow estimation. Since optical flow is used in several safety-critical applications like self-driving cars, it is important to gain insights into the robustness of those techniques. Recently, it has been shown that adversarial attacks easily fool deep neural networks to misclassify objects. The robustness of optical flow networks to adversarial attacks, however, has not been studied so far. In this paper, we extend adversarial patch attacks to optical flow networks and show that such attacks can compromise their performance. We show that corrupting a small patch of less than 1% of the image size can significantly affect optical flow estimates. Our attacks lead to noisy flow estimates that extend significantly beyond the region of the attack, in many cases even completely erasing the motion of objects in the scene. While networks using an encoder-decoder architecture are very sensitive to these attacks, we found that networks using a spatial pyramid architecture are less affected. We analyse the success and failure of attacking both architectures by visualizing their feature maps and comparing them to classical optical flow techniques which are robust to these attacks. We also demonstrate that such attacks are practical by placing a printed pattern into real scenes.

avg ps

Video Project Page Paper Supplementary Material link (url) DOI [BibTex]

Video Project Page Paper Supplementary Material link (url) DOI [BibTex]


A Learnable Safety Measure
A Learnable Safety Measure

Heim, S., Rohr, A. V., Trimpe, S., Badri-Spröwitz, A.

Conference on Robot Learning, November 2019 (conference) Accepted

dlg ics

Arxiv [BibTex]

Arxiv [BibTex]


Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics
Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics

Niemeyer, M., Mescheder, L., Oechsle, M., Geiger, A.

International Conference on Computer Vision, October 2019 (conference)

Abstract
Deep learning based 3D reconstruction techniques have recently achieved impressive results. However, while state-of-the-art methods are able to output complex 3D geometry, it is not clear how to extend these results to time-varying topologies. Approaches treating each time step individually lack continuity and exhibit slow inference, while traditional 4D reconstruction methods often utilize a template model or discretize the 4D space at fixed resolution. In this work, we present Occupancy Flow, a novel spatio-temporal representation of time-varying 3D geometry with implicit correspondences. Towards this goal, we learn a temporally and spatially continuous vector field which assigns a motion vector to every point in space and time. In order to perform dense 4D reconstruction from images or sparse point clouds, we combine our method with a continuous 3D representation. Implicitly, our model yields correspondences over time, thus enabling fast inference while providing a sound physical description of the temporal dynamics. We show that our method can be used for interpolation and reconstruction tasks, and demonstrate the accuracy of the learned correspondences. We believe that Occupancy Flow is a promising new 4D representation which will be useful for a variety of spatio-temporal reconstruction tasks.

avg

pdf poster suppmat code Project page video blog [BibTex]


Texture Fields: Learning Texture Representations in Function Space
Texture Fields: Learning Texture Representations in Function Space

Oechsle, M., Mescheder, L., Niemeyer, M., Strauss, T., Geiger, A.

International Conference on Computer Vision, October 2019 (conference)

Abstract
In recent years, substantial progress has been achieved in learning-based reconstruction of 3D objects. At the same time, generative models were proposed that can generate highly realistic images. However, despite this success in these closely related tasks, texture reconstruction of 3D objects has received little attention from the research community and state-of-the-art methods are either limited to comparably low resolution or constrained experimental setups. A major reason for these limitations is that common representations of texture are inefficient or hard to interface for modern deep learning techniques. In this paper, we propose Texture Fields, a novel texture representation which is based on regressing a continuous 3D function parameterized with a neural network. Our approach circumvents limiting factors like shape discretization and parameterization, as the proposed texture representation is independent of the shape representation of the 3D object. We show that Texture Fields are able to represent high frequency texture and naturally blend with modern deep learning techniques. Experimentally, we find that Texture Fields compare favorably to state-of-the-art methods for conditional texture reconstruction of 3D objects and enable learning of probabilistic generative models for texturing unseen 3D models. We believe that Texture Fields will become an important building block for the next generation of generative 3D models.

avg

pdf suppmat video poster blog Project Page [BibTex]


no image
Robust Humanoid Locomotion Using Trajectory Optimization and Sample-Efficient Learning

Yeganegi, M. H., Khadiv, M., Moosavian, S. A. A., Zhu, J., Prete, A. D., Righetti, L.

Proceedings International Conference on Humanoid Robots, IEEE, 2019 IEEE-RAS International Conference on Humanoid Robots, October 2019 (conference)

Abstract
Trajectory optimization (TO) is one of the most powerful tools for generating feasible motions for humanoid robots. However, including uncertainties and stochasticity in the TO problem to generate robust motions can easily lead to intractable problems. Furthermore, since the models used in TO have always some level of abstraction, it can be hard to find a realistic set of uncertainties in the model space. In this paper we leverage a sample-efficient learning technique (Bayesian optimization) to robustify TO for humanoid locomotion. The main idea is to use data from full-body simulations to make the TO stage robust by tuning the cost weights. To this end, we split the TO problem into two phases. The first phase solves a convex optimization problem for generating center of mass (CoM) trajectories based on simplified linear dynamics. The second stage employs iterative Linear-Quadratic Gaussian (iLQG) as a whole-body controller to generate full body control inputs. Then we use Bayesian optimization to find the cost weights to use in the first stage that yields robust performance in the simulation/experiment, in the presence of different disturbance/uncertainties. The results show that the proposed approach is able to generate robust motions for different sets of disturbances and uncertainties.

mg

https://arxiv.org/abs/1907.04616 link (url) [BibTex]

https://arxiv.org/abs/1907.04616 link (url) [BibTex]


NoVA: Learning to See in Novel Viewpoints and Domains
NoVA: Learning to See in Novel Viewpoints and Domains

Coors, B., Condurache, A. P., Geiger, A.

In 2019 International Conference on 3D Vision (3DV), pages: 116-125, IEEE, 2019 International Conference on 3D Vision (3DV), September 2019 (inproceedings)

Abstract
Domain adaptation techniques enable the re-use and transfer of existing labeled datasets from a source to a target domain in which little or no labeled data exists. Recently, image-level domain adaptation approaches have demonstrated impressive results in adapting from synthetic to real-world environments by translating source images to the style of a target domain. However, the domain gap between source and target may not only be caused by a different style but also by a change in viewpoint. This case necessitates a semantically consistent translation of source images and labels to the style and viewpoint of the target domain. In this work, we propose the Novel Viewpoint Adaptation (NoVA) model, which enables unsupervised adaptation to a novel viewpoint in a target domain for which no labeled data is available. NoVA utilizes an explicit representation of the 3D scene geometry to translate source view images and labels to the target view. Experiments on adaptation to synthetic and real-world datasets show the benefit of NoVA compared to state-of-the-art domain adaptation approaches on the task of semantic segmentation.

avg

pdf suppmat poster video DOI [BibTex]

pdf suppmat poster video DOI [BibTex]


Trunk Pitch Oscillations for Joint Load Redistribution in Humans and Humanoid Robots
Trunk Pitch Oscillations for Joint Load Redistribution in Humans and Humanoid Robots

Drama, Ö., Badri-Spröwitz, A.

Proceedings International Conference on Humanoid Robots, Humanoids, September 2019 (conference) Accepted

dlg

link (url) [BibTex]

link (url) [BibTex]


The positive side of damping
The positive side of damping

Heim, S., Millard, M., Le Mouel, C., Sproewitz, A.

Proceedings of AMAM, The 9th International Symposium on Adaptive Motion of Animals and Machines, August 2019 (conference) Accepted

dlg

[BibTex]

[BibTex]


Taking a Deeper Look at the Inverse Compositional Algorithm
Taking a Deeper Look at the Inverse Compositional Algorithm

Lv, Z., Dellaert, F., Rehg, J. M., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment. We first discuss the assumptions made by this well-established technique, and subsequently propose to relax these assumptions by incorporating data-driven priors into this model. More specifically, we unroll a robust version of the inverse compositional algorithm and replace multiple components of this algorithm using more expressive models whose parameters we train in an end-to-end fashion from data. Our experiments on several challenging 3D rigid motion estimation tasks demonstrate the advantages of combining optimization with learning-based techniques, outperforming the classic inverse compositional algorithm as well as data-driven image-to-pose regression approaches.

avg

pdf suppmat Video Project Page Poster [BibTex]

pdf suppmat Video Project Page Poster [BibTex]


MOTS: Multi-Object Tracking and Segmentation
MOTS: Multi-Object Tracking and Segmentation

Voigtlaender, P., Krause, M., Osep, A., Luiten, J., Sekar, B. B. G., Geiger, A., Leibe, B.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
This paper extends the popular task of multi-object tracking to multi-object tracking and segmentation (MOTS). Towards this goal, we create dense pixel-level annotations for two existing tracking datasets using a semi-automatic annotation procedure. Our new annotations comprise 65,213 pixel masks for 977 distinct objects (cars and pedestrians) in 10,870 video frames. For evaluation, we extend existing multi-object tracking metrics to this new task. Moreover, we propose a new baseline method which jointly addresses detection, tracking, and segmentation with a single convolutional network. We demonstrate the value of our datasets by achieving improvements in performance when training on MOTS annotations. We believe that our datasets, metrics and baseline will become a valuable resource towards developing multi-object tracking approaches that go beyond 2D bounding boxes.

avg

pdf suppmat Project Page Poster Video Project Page [BibTex]

pdf suppmat Project Page Poster Video Project Page [BibTex]


PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds
PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds

Behl, A., Paschalidou, D., Donne, S., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
Despite significant progress in image-based 3D scene flow estimation, the performance of such approaches has not yet reached the fidelity required by many applications. Simultaneously, these applications are often not restricted to image-based estimation: laser scanners provide a popular alternative to traditional cameras, for example in the context of self-driving cars, as they directly yield a 3D point cloud. In this paper, we propose to estimate 3D motion from such unstructured point clouds using a deep neural network. In a single forward pass, our model jointly predicts 3D scene flow as well as the 3D bounding box and rigid body motion of objects in the scene. While the prospect of estimating 3D scene flow from unstructured point clouds is promising, it is also a challenging task. We show that the traditional global representation of rigid body motion prohibits inference by CNNs, and propose a translation equivariant representation to circumvent this problem. For training our deep network, a large dataset is required. Because of this, we augment real scans from KITTI with virtual objects, realistically modeling occlusions and simulating sensor noise. A thorough comparison with classic and learning-based techniques highlights the robustness of the proposed approach.

avg

pdf suppmat Project Page Poster Video [BibTex]

pdf suppmat Project Page Poster Video [BibTex]


Learning Non-volumetric Depth Fusion using Successive Reprojections
Learning Non-volumetric Depth Fusion using Successive Reprojections

Donne, S., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
Given a set of input views, multi-view stereopsis techniques estimate depth maps to represent the 3D reconstruction of the scene; these are fused into a single, consistent, reconstruction -- most often a point cloud. In this work we propose to learn an auto-regressive depth refinement directly from data. While deep learning has improved the accuracy and speed of depth estimation significantly, learned MVS techniques remain limited to the planesweeping paradigm. We refine a set of input depth maps by successively reprojecting information from neighbouring views to leverage multi-view constraints. Compared to learning-based volumetric fusion techniques, an image-based representation allows significantly more detailed reconstructions; compared to traditional point-based techniques, our method learns noise suppression and surface completion in a data-driven fashion. Due to the limited availability of high-quality reconstruction datasets with ground truth, we introduce two novel synthetic datasets to (pre-)train our network. Our approach is able to improve both the output depth maps and the reconstructed point cloud, for both learned and traditional depth estimation front-ends, on both synthetic and real data.

avg

pdf suppmat Project Page Video Poster blog [BibTex]

pdf suppmat Project Page Video Poster blog [BibTex]


Connecting the Dots: Learning Representations for Active Monocular Depth Estimation
Connecting the Dots: Learning Representations for Active Monocular Depth Estimation

Riegler, G., Liao, Y., Donne, S., Koltun, V., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
We propose a technique for depth estimation with a monocular structured-light camera, \ie, a calibrated stereo set-up with one camera and one laser projector. Instead of formulating the depth estimation via a correspondence search problem, we show that a simple convolutional architecture is sufficient for high-quality disparity estimates in this setting. As accurate ground-truth is hard to obtain, we train our model in a self-supervised fashion with a combination of photometric and geometric losses. Further, we demonstrate that the projected pattern of the structured light sensor can be reliably separated from the ambient information. This can then be used to improve depth boundaries in a weakly supervised fashion by modeling the joint statistics of image and depth edges. The model trained in this fashion compares favorably to the state-of-the-art on challenging synthetic and real-world datasets. In addition, we contribute a novel simulator, which allows to benchmark active depth prediction algorithms in controlled conditions.

avg

pdf suppmat Poster Project Page [BibTex]

pdf suppmat Poster Project Page [BibTex]


Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids
Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids

Paschalidou, D., Ulusoy, A. O., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, June 2019 (inproceedings)

Abstract
Abstracting complex 3D shapes with parsimonious part-based representations has been a long standing goal in computer vision. This paper presents a learning-based solution to this problem which goes beyond the traditional 3D cuboid representation by exploiting superquadrics as atomic elements. We demonstrate that superquadrics lead to more expressive 3D scene parses while being easier to learn than 3D cuboid representations. Moreover, we provide an analytical solution to the Chamfer loss which avoids the need for computational expensive reinforcement learning or iterative prediction. Our model learns to parse 3D objects into consistent superquadric representations without supervision. Results on various ShapeNet categories as well as the SURREAL human body dataset demonstrate the flexibility of our model in capturing fine details and complex poses that could not have been modelled using cuboids.

avg

Project Page Poster suppmat pdf Video blog handout [BibTex]

Project Page Poster suppmat pdf Video blog handout [BibTex]


Real-Time Dense Mapping for Self-Driving Vehicles using Fisheye Cameras
Real-Time Dense Mapping for Self-Driving Vehicles using Fisheye Cameras

Cui, Z., Heng, L., Yeo, Y. C., Geiger, A., Pollefeys, M., Sattler, T.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2019, IEEE, International Conference on Robotics and Automation, May 2019 (inproceedings)

Abstract
We present a real-time dense geometric mapping algorithm for large-scale environments. Unlike existing methods which use pinhole cameras, our implementation is based on fisheye cameras which have larger field of view and benefit some other tasks including Visual-Inertial Odometry, localization and object detection around vehicles. Our algorithm runs on in-vehicle PCs at 15 Hz approximately, enabling vision-only 3D scene perception for self-driving vehicles. For each synchronized set of images captured by multiple cameras, we first compute a depth map for a reference camera using plane-sweeping stereo. To maintain both accuracy and efficiency, while accounting for the fact that fisheye images have a rather low resolution, we recover the depths using multiple image resolutions. We adopt the fast object detection framework YOLOv3 to remove potentially dynamic objects. At the end of the pipeline, we fuse the fisheye depth images into the truncated signed distance function (TSDF) volume to obtain a 3D map. We evaluate our method on large-scale urban datasets, and results show that our method works well even in complex environments.

avg

pdf video poster Project Page [BibTex]

pdf video poster Project Page [BibTex]


no image
Efficient Humanoid Contact Planning using Learned Centroidal Dynamics Prediction

Lin, Y., Ponton, B., Righetti, L., Berenson, D.

International Conference on Robotics and Automation (ICRA), pages: 5280-5286, IEEE, May 2019 (conference)

mg

DOI [BibTex]

DOI [BibTex]


Leveraging Contact Forces for Learning to Grasp
Leveraging Contact Forces for Learning to Grasp

Merzic, H., Bogdanovic, M., Kappler, D., Righetti, L., Bohg, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2019, IEEE, International Conference on Robotics and Automation, May 2019 (inproceedings)

Abstract
Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two- fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.

am mg

video arXiv [BibTex]

video arXiv [BibTex]


Project AutoVision: Localization and 3D Scene Perception for an Autonomous Vehicle with a Multi-Camera System
Project AutoVision: Localization and 3D Scene Perception for an Autonomous Vehicle with a Multi-Camera System

Heng, L., Choi, B., Cui, Z., Geppert, M., Hu, S., Kuan, B., Liu, P., Nguyen, R. M. H., Yeo, Y. C., Geiger, A., Lee, G. H., Pollefeys, M., Sattler, T.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2019, IEEE, International Conference on Robotics and Automation, May 2019 (inproceedings)

Abstract
Project AutoVision aims to develop localization and 3D scene perception capabilities for a self-driving vehicle. Such capabilities will enable autonomous navigation in urban and rural environments, in day and night, and with cameras as the only exteroceptive sensors. The sensor suite employs many cameras for both 360-degree coverage and accurate multi-view stereo; the use of low-cost cameras keeps the cost of this sensor suite to a minimum. In addition, the project seeks to extend the operating envelope to include GNSS-less conditions which are typical for environments with tall buildings, foliage, and tunnels. Emphasis is placed on leveraging multi-view geometry and deep learning to enable the vehicle to localize and perceive in 3D space. This paper presents an overview of the project, and describes the sensor suite and current progress in the areas of calibration, localization, and perception.

avg

pdf [BibTex]

pdf [BibTex]


Quantifying the Robustness of Natural Dynamics: a Viability Approach
Quantifying the Robustness of Natural Dynamics: a Viability Approach

Heim, S., Sproewitz, A.

Proceedings of Dynamic Walking , Dynamic Walking , 2019 (conference) Accepted

dlg

Submission DW2019 [BibTex]

Submission DW2019 [BibTex]


no image
Geometric Image Synthesis

Abu Alhaija, H., Mustikovela, S. K., Geiger, A., Rother, C.

Computer Vision – ACCV 2018, 11366, pages: 85-100, Lecture Notes in Computer Science, (Editors: Jawahar, C. and Li, H. and Mori, G. and Schindler, K. ), Asian Conference on Computer Vision, 2019 (conference)

avg

DOI Project Page [BibTex]

DOI Project Page [BibTex]


Occupancy Networks: Learning 3D Reconstruction in Function Space
Occupancy Networks: Learning 3D Reconstruction in Function Space

Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2019, 2019 (inproceedings)

Abstract
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose Occupancy Networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.

avg

Code Video pdf suppmat Project Page blog [BibTex]

Code Video pdf suppmat Project Page blog [BibTex]

2017


The Numerics of GANs
The Numerics of GANs

Mescheder, L., Nowozin, S., Geiger, A.

In Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (inproceedings)

Abstract
In this paper, we analyze the numerics of common algorithms for training Generative Adversarial Networks (GANs). Using the formalism of smooth two-player games we analyze the associated gradient vector field of GAN training objectives. Our findings suggest that the convergence of current algorithms suffers due to two factors: i) presence of eigenvalues of the Jacobian of the gradient vector field with zero real-part, and ii) eigenvalues with big imaginary part. Using these findings, we design a new algorithm that overcomes some of these limitations and has better convergence properties. Experimentally, we demonstrate its superiority on training common GAN architectures and show convergence on GAN architectures that are known to be notoriously hard to train.

avg

pdf Project Page [BibTex]

2017


pdf Project Page [BibTex]


Bounding Boxes, Segmentations and Object Coordinates: How Important is Recognition for 3D Scene Flow Estimation in Autonomous Driving Scenarios?
Bounding Boxes, Segmentations and Object Coordinates: How Important is Recognition for 3D Scene Flow Estimation in Autonomous Driving Scenarios?

Behl, A., Jafari, O. H., Mustikovela, S. K., Alhaija, H. A., Rother, C., Geiger, A.

In Proceedings IEEE International Conference on Computer Vision (ICCV), IEEE, Piscataway, NJ, USA, IEEE International Conference on Computer Vision (ICCV), October 2017 (inproceedings)

Abstract
Existing methods for 3D scene flow estimation often fail in the presence of large displacement or local ambiguities, e.g., at texture-less or reflective surfaces. However, these challenges are omnipresent in dynamic road scenes, which is the focus of this work. Our main contribution is to overcome these 3D motion estimation problems by exploiting recognition. In particular, we investigate the importance of recognition granularity, from coarse 2D bounding box estimates over 2D instance segmentations to fine-grained 3D object part predictions. We compute these cues using CNNs trained on a newly annotated dataset of stereo images and integrate them into a CRF-based model for robust 3D scene flow estimation - an approach we term Instance Scene Flow. We analyze the importance of each recognition cue in an ablation study and observe that the instance segmentation cue is by far strongest, in our setting. We demonstrate the effectiveness of our method on the challenging KITTI 2015 scene flow benchmark where we achieve state-of-the-art performance at the time of submission.

avg

pdf suppmat Poster Project Page [BibTex]

pdf suppmat Poster Project Page [BibTex]


Sparsity Invariant CNNs
Sparsity Invariant CNNs

Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., Geiger, A.

International Conference on 3D Vision (3DV) 2017, International Conference on 3D Vision (3DV), October 2017 (conference)

Abstract
In this paper, we consider convolutional neural networks operating on sparse inputs with an application to depth upsampling from sparse laser scan data. First, we show that traditional convolutional networks perform poorly when applied to sparse data even when the location of missing data is provided to the network. To overcome this problem, we propose a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation. We demonstrate the benefits of the proposed network architecture in synthetic and real experiments \wrt various baseline approaches. Compared to dense baselines, the proposed sparse convolution network generalizes well to novel datasets and is invariant to the level of sparsity in the data. For our evaluation, we derive a novel dataset from the KITTI benchmark, comprising 93k depth annotated RGB images. Our dataset allows for training and evaluating depth upsampling and depth prediction techniques in challenging real-world settings.

avg

pdf suppmat Project Page Project Page [BibTex]

pdf suppmat Project Page Project Page [BibTex]


OctNetFusion: Learning Depth Fusion from Data
OctNetFusion: Learning Depth Fusion from Data

Riegler, G., Ulusoy, A. O., Bischof, H., Geiger, A.

International Conference on 3D Vision (3DV) 2017, International Conference on 3D Vision (3DV), October 2017 (conference)

Abstract
In this paper, we present a learning based approach to depth fusion, i.e., dense 3D reconstruction from multiple depth images. The most common approach to depth fusion is based on averaging truncated signed distance functions, which was originally proposed by Curless and Levoy in 1996. While this method is simple and provides great results, it is not able to reconstruct (partially) occluded surfaces and requires a large number frames to filter out sensor noise and outliers. Motivated by the availability of large 3D model repositories and recent advances in deep learning, we present a novel 3D CNN architecture that learns to predict an implicit surface representation from the input depth maps. Our learning based method significantly outperforms the traditional volumetric fusion approach in terms of noise reduction and outlier suppression. By learning the structure of real world 3D objects and scenes, our approach is further able to reconstruct occluded regions and to fill in gaps in the reconstruction. We demonstrate that our learning based approach outperforms both vanilla TSDF fusion as well as TV-L1 fusion on the task of volumetric fusion. Further, we demonstrate state-of-the-art 3D shape completion results.

avg

pdf Video 1 Video 2 Project Page Project Page [BibTex]

pdf Video 1 Video 2 Project Page Project Page [BibTex]


Direct Visual Odometry for a Fisheye-Stereo Camera
Direct Visual Odometry for a Fisheye-Stereo Camera

Liu, P., Heng, L., Sattler, T., Geiger, A., Pollefeys, M.

In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 (inproceedings)

Abstract
We present a direct visual odometry algorithm for a fisheye-stereo camera. Our algorithm performs simultaneous camera motion estimation and semi-dense reconstruction. The pipeline consists of two threads: a tracking thread and a mapping thread. In the tracking thread, we estimate the camera pose via semi-dense direct image alignment. To have a wider field of view (FoV) which is important for robotic perception, we use fisheye images directly without converting them to conventional pinhole images which come with a limited FoV. To address the epipolar curve problem, plane-sweeping stereo is used for stereo matching and depth initialization. Multiple depth hypotheses are tracked for selected pixels to better capture the uncertainty characteristics of stereo matching. Temporal motion stereo is then used to refine the depth and remove false positive depth hypotheses. Our implementation runs at an average of 20 Hz on a low-end PC. We run experiments in outdoor environments to validate our algorithm, and discuss the experimental results. We experimentally show that we are able to estimate 6D poses with low drift, and at the same time, do semi-dense 3D reconstruction with high accuracy.

avg

pdf Project Page [BibTex]

pdf Project Page [BibTex]


Augmented Reality Meets Deep Learning for Car Instance Segmentation in Urban Scenes
Augmented Reality Meets Deep Learning for Car Instance Segmentation in Urban Scenes

Alhaija, H. A., Mustikovela, S. K., Mescheder, L., Geiger, A., Rother, C.

In Proceedings of the British Machine Vision Conference 2017, Proceedings of the British Machine Vision Conference, September 2017 (inproceedings)

Abstract
The success of deep learning in computer vision is based on the availability of large annotated datasets. To lower the need for hand labeled images, virtually rendered 3D worlds have recently gained popularity. Unfortunately, creating realistic 3D content is challenging on its own and requires significant human effort. In this work, we propose an alternative paradigm which combines real and synthetic data for learning semantic instance segmentation models. Exploiting the fact that not all aspects of the scene are equally important for this task, we propose to augment real-world imagery with virtual objects of the target category. Capturing real-world images at large scale is easy and cheap, and directly provides real background appearances without the need for creating complex 3D models of the environment. We present an efficient procedure to augment these images with virtual objects. This allows us to create realistic composite images which exhibit both realistic background appearance as well as a large number of complex object arrangements. In contrast to modeling complete 3D environments, our data augmentation approach requires only a few user interactions in combination with 3D shapes of the target object category. We demonstrate the utility of the proposed approach for training a state-of-the-art high-capacity deep model for semantic instance segmentation. In particular, we consider the task of segmenting car instances on the KITTI dataset which we have annotated with pixel-accurate ground truth. Our experiments demonstrate that models trained on augmented imagery generalize better than those trained on synthetic data or models trained on limited amounts of annotated real data.

avg

pdf Project Page [BibTex]

pdf Project Page [BibTex]


Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks
Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks

Mescheder, L., Nowozin, S., Geiger, A.

In Proceedings of the 34th International Conference on Machine Learning, 70, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (inproceedings)

Abstract
Variational Autoencoders (VAEs) are expressive latent variable models that can be used to learn complex probability distributions from training data. However, the quality of the resulting model crucially relies on the expressiveness of the inference model. We introduce Adversarial Variational Bayes (AVB), a technique for training Variational Autoencoders with arbitrarily expressive inference models. We achieve this by introducing an auxiliary discriminative network that allows to rephrase the maximum-likelihood-problem as a two-player game, hence establishing a principled connection between VAEs and Generative Adversarial Networks (GANs). We show that in the nonparametric limit our method yields an exact maximum-likelihood assignment for the parameters of the generative model, as well as the exact posterior distribution over the latent variables given an observation. Contrary to competing approaches which combine VAEs with GANs, our approach has a clear theoretical justification, retains most advantages of standard Variational Autoencoders and is easy to implement.

avg

pdf suppmat Project Page arxiv-version Project Page [BibTex]

pdf suppmat Project Page arxiv-version Project Page [BibTex]


Slow Flow: Exploiting High-Speed Cameras for Accurate and Diverse Optical Flow Reference Data
Slow Flow: Exploiting High-Speed Cameras for Accurate and Diverse Optical Flow Reference Data

Janai, J., Güney, F., Wulff, J., Black, M., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 1406-1416, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Existing optical flow datasets are limited in size and variability due to the difficulty of capturing dense ground truth. In this paper, we tackle this problem by tracking pixels through densely sampled space-time volumes recorded with a high-speed video camera. Our model exploits the linearity of small motions and reasons about occlusions from multiple frames. Using our technique, we are able to establish accurate reference flow fields outside the laboratory in natural environments. Besides, we show how our predictions can be used to augment the input images with realistic motion blur. We demonstrate the quality of the produced flow fields on synthetic and real-world datasets. Finally, we collect a novel challenging optical flow dataset by applying our technique on data from a high-speed camera and analyze the performance of the state-of-the-art in optical flow under various levels of motion blur.

avg ps

pdf suppmat Project page Video DOI Project Page [BibTex]

pdf suppmat Project page Video DOI Project Page [BibTex]


OctNet: Learning Deep 3D Representations at High Resolutions
OctNet: Learning Deep 3D Representations at High Resolutions

Riegler, G., Ulusoy, O., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
We present OctNet, a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks which are both deep and high resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees where each leaf node stores a pooled feature representation. This allows to focus memory allocation and computation to the relevant dense regions and enables deeper networks without compromising resolution. We demonstrate the utility of our OctNet representation by analyzing the impact of resolution on several 3D tasks including 3D object classification, orientation estimation and point cloud labeling.

avg ps

pdf suppmat Project Page Video Project Page [BibTex]

pdf suppmat Project Page Video Project Page [BibTex]


A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos
A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos

Schöps, T., Schönberger, J. L., Galliani, S., Sattler, T., Schindler, K., Pollefeys, M., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Motivated by the limitations of existing multi-view stereo benchmarks, we present a novel dataset for this task. Towards this goal, we recorded a variety of indoor and outdoor scenes using a high-precision laser scanner and captured both high-resolution DSLR imagery as well as synchronized low-resolution stereo videos with varying fields-of-view. To align the images with the laser scans, we propose a robust technique which minimizes photometric errors conditioned on the geometry. In contrast to previous datasets, our benchmark provides novel challenges and covers a diverse set of viewpoints and scene types, ranging from natural scenes to man-made indoor and outdoor environments. Furthermore, we provide data at significantly higher temporal and spatial resolution. Our benchmark is the first to cover the important use case of hand-held mobile devices while also providing high-resolution DSLR camera images. We make our datasets and an online evaluation server available at http://www.eth3d.net.

avg

pdf suppmat Project Page Project Page [BibTex]

pdf suppmat Project Page Project Page [BibTex]


Toroidal Constraints for Two Point Localization Under High Outlier Ratios
Toroidal Constraints for Two Point Localization Under High Outlier Ratios

Camposeco, F., Sattler, T., Cohen, A., Geiger, A., Pollefeys, M.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Localizing a query image against a 3D model at large scale is a hard problem, since 2D-3D matches become more and more ambiguous as the model size increases. This creates a need for pose estimation strategies that can handle very low inlier ratios. In this paper, we draw new insights on the geometric information available from the 2D-3D matching process. As modern descriptors are not invariant against large variations in viewpoint, we are able to find the rays in space used to triangulate a given point that are closest to a query descriptor. It is well known that two correspondences constrain the camera to lie on the surface of a torus. Adding the knowledge of direction of triangulation, we are able to approximate the position of the camera from \emphtwo matches alone. We derive a geometric solver that can compute this position in under 1 microsecond. Using this solver, we propose a simple yet powerful outlier filter which scales quadratically in the number of matches. We validate the accuracy of our solver and demonstrate the usefulness of our method in real world settings.

avg

pdf suppmat Project Page Project Page [BibTex]

pdf suppmat Project Page pdf Project Page [BibTex]


Semantic Multi-view Stereo: Jointly Estimating Objects and Voxels
Semantic Multi-view Stereo: Jointly Estimating Objects and Voxels

Ulusoy, A. O., Black, M. J., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (inproceedings)

Abstract
Dense 3D reconstruction from RGB images is a highly ill-posed problem due to occlusions, textureless or reflective surfaces, as well as other challenges. We propose object-level shape priors to address these ambiguities. Towards this goal, we formulate a probabilistic model that integrates multi-view image evidence with 3D shape information from multiple objects. Inference in this model yields a dense 3D reconstruction of the scene as well as the existence and precise 3D pose of the objects in it. Our approach is able to recover fine details not captured in the input shapes while defaulting to the input models in occluded regions where image evidence is weak. Due to its probabilistic nature, the approach is able to cope with the approximate geometry of the 3D models as well as input shapes that are not present in the scene. We evaluate the approach quantitatively on several challenging indoor and outdoor datasets.

avg ps

YouTube pdf suppmat Project Page [BibTex]

YouTube pdf suppmat Project Page [BibTex]


Scalable Pneumatic and Tendon Driven Robotic Joint Inspired by Jumping Spiders
Scalable Pneumatic and Tendon Driven Robotic Joint Inspired by Jumping Spiders

Sproewitz, A., Göttler, C., Sinha, A., Caer, C., Öztekin, M. U., Petersen, K., Sitti, M.

In Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), pages: 64-70, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

dlg

Video link (url) DOI Project Page [BibTex]

Video link (url) DOI Project Page [BibTex]


Linking {Mechanics} and {Learning}
Linking Mechanics and Learning

Heim, S., Grimminger, F., Drama, Ö., Spröwitz, A.

In Proceedings of Dynamic Walking 2017, 2017 (inproceedings)

dlg

[BibTex]

[BibTex]


Is Growing Good for Learning?
Is Growing Good for Learning?

Heim, S., Spröwitz, A.

Proceedings of the 8th International Symposium on Adaptive Motion of Animals and Machines AMAM2017, 2017 (conference)

dlg

[BibTex]

[BibTex]


no image
Pattern Generation for Walking on Slippery Terrains

Khadiv, M., Moosavian, S. A. A., Herzog, A., Righetti, L.

In 2017 5th International Conference on Robotics and Mechatronics (ICROM), Iran, August 2017 (inproceedings)

Abstract
In this paper, we extend state of the art Model Predictive Control (MPC) approaches to generate safe bipedal walking on slippery surfaces. In this setting, we formulate walking as a trade off between realizing a desired walking velocity and preserving robust foot-ground contact. Exploiting this for- mulation inside MPC, we show that safe walking on various flat terrains can be achieved by compromising three main attributes, i. e. walking velocity tracking, the Zero Moment Point (ZMP) modulation, and the Required Coefficient of Friction (RCoF) regulation. Simulation results show that increasing the walking velocity increases the possibility of slippage, while reducing the slippage possibility conflicts with reducing the tip-over possibility of the contact and vice versa.

mg

link (url) [BibTex]

link (url) [BibTex]

2011


no image
Learning Force Control Policies for Compliant Manipulation

Kalakrishnan, M., Righetti, L., Pastor, P., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 4639-4644, IEEE, San Francisco, USA, sep 2011 (inproceedings)

Abstract
Developing robots capable of fine manipulation skills is of major importance in order to build truly assistive robots. These robots need to be compliant in their actuation and control in order to operate safely in human environments. Manipulation tasks imply complex contact interactions with the external world, and involve reasoning about the forces and torques to be applied. Planning under contact conditions is usually impractical due to computational complexity, and a lack of precise dynamics models of the environment. We present an approach to acquiring manipulation skills on compliant robots through reinforcement learning. The initial position control policy for manipulation is initialized through kinesthetic demonstration. We augment this policy with a force/torque profile to be controlled in combination with the position trajectories. We use the Policy Improvement with Path Integrals (PI2) algorithm to learn these force/torque profiles by optimizing a cost function that measures task success. We demonstrate our approach on the Barrett WAM robot arm equipped with a 6-DOF force/torque sensor on two different manipulation tasks: opening a door with a lever door handle, and picking up a pen off the table. We show that the learnt force control policies allow successful, robust execution of the tasks.

am mg

link (url) DOI [BibTex]

2011


link (url) DOI [BibTex]


no image
Control of legged robots with optimal distribution of contact forces

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 11th IEEE-RAS International Conference on Humanoid Robots, pages: 318-324, IEEE, Bled, Slovenia, 2011 (inproceedings)

Abstract
The development of agile and safe humanoid robots require controllers that guarantee both high tracking performance and compliance with the environment. More specifically, the control of contact interaction is of crucial importance for robots that will actively interact with their environment. Model-based controllers such as inverse dynamics or operational space control are very appealing as they offer both high tracking performance and compliance. However, while widely used for fully actuated systems such as manipulators, they are not yet standard controllers for legged robots such as humanoids. Indeed such robots are fundamentally different from manipulators as they are underactuated due to their floating-base and subject to switching contact constraints. In this paper we present an inverse dynamics controller for legged robots that use torque redundancy to create an optimal distribution of contact constraints. The resulting controller is able to minimize, given a desired motion, any quadratic cost of the contact constraints at each instant of time. In particular we show how this can be used to minimize tangential forces during locomotion, therefore significantly improving the locomotion of legged robots on difficult terrains. In addition to the theoretical result, we present simulations of a humanoid and a quadruped robot, as well as experiments on a real quadruped robot that demonstrate the advantages of the controller.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning Motion Primitive Goals for Robust Manipulation

Stulp, F., Theodorou, E., Kalakrishnan, M., Pastor, P., Righetti, L., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 325-331, IEEE, San Francisco, USA, sep 2011 (inproceedings)

Abstract
Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions. Second, in manipulation, the end-point of the movement must be chosen carefully, as it represents a grasp which must be adapted to the pose and shape of the object. Finally, there is uncertainty in the object pose, and even the most carefully planned movement may fail if the object is not at the expected position. To address these challenges we 1) present a simplified, computationally more efficient version of our model-free reinforcement learning algorithm PI2; 2) extend PI2 so that it simultaneously learns shape parameters and goal parameters of motion primitives; 3) use shape and goal learning to acquire motion primitives that are robust to object pose uncertainty. We evaluate these contributions on a manipulation platform consisting of a 7-DOF arm with a 4-DOF hand.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Inverse Dynamics Control of Floating-Base Robots with External Constraints: a Unified View

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 IEEE International Conference on Robotics and Automation, pages: 1085-1090, IEEE, Shanghai, China, 2011 (inproceedings)

Abstract
Inverse dynamics controllers and operational space controllers have proved to be very efficient for compliant control of fully actuated robots such as fixed base manipulators. However legged robots such as humanoids are inherently different as they are underactuated and subject to switching external contact constraints. Recently several methods have been proposed to create inverse dynamics controllers and operational space controllers for these robots. In an attempt to compare these different approaches, we develop a general framework for inverse dynamics control and show that these methods lead to very similar controllers. We are then able to greatly simplify recent whole-body controllers based on operational space approaches using kinematic projections, bringing them closer to efficient practical implementations. We also generalize these controllers such that they can be optimal under an arbitrary quadratic cost in the commands.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Operational Space Control of Constrained and Underactuated Systems

Mistry, M., Righetti, L.

In Proceedings of Robotics: Science and Systems, Los Angeles, CA, USA, June 2011 (inproceedings)

Abstract
The operational space formulation (Khatib, 1987), applied to rigid-body manipulators, describes how to decouple task-space and null-space dynamics, and write control equations that correspond only to forces at the end-effector or, alternatively, only to motion within the null-space. We would like to apply this useful theory to modern humanoids and other legged systems, for manipulation or similar tasks, however these systems present additional challenges due to their underactuated floating bases and contact states that can dynamically change. In recent work, Sentis et al. derived controllers for such systems by implementing a task Jacobian projected into a space consistent with the supporting constraints and underactuation (the so called "support consistent reduced Jacobian"). Here, we take a new approach to derive operational space controllers for constrained underactuated systems, by first considering the operational space dynamics within "projected inverse-dynamics" (Aghili, 2005), and subsequently resolving underactuation through the addition of dynamically consistent control torques. Doing so results in a simplified control solution compared with previous results, and importantly yields several new insights into the underlying problem of operational space control in constrained environments: 1) Underactuated systems, such as humanoid robots, cannot in general completely decouple task and null-space dynamics. However, 2) there may exist an infinite number of control solutions to realize desired task-space dynamics, and 3) these solutions involve the addition of dynamically consistent null-space motion or constraint forces (or combinations of both). In light of these findings, we present several possible control solutions, with varying optimization criteria, and highlight some of their practical consequences.

mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Online movement adaptation based on previous sensor experiences

Pastor, P., Righetti, L., Kalakrishnan, M., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 365-371, IEEE, San Francisco, USA, sep 2011 (inproceedings)

Abstract
Personal robots can only become widespread if they are capable of safely operating among humans. In uncertain and highly dynamic environments such as human households, robots need to be able to instantly adapt their behavior to unforseen events. In this paper, we propose a general framework to achieve very contact-reactive motions for robotic grasping and manipulation. Associating stereotypical movements to particular tasks enables our system to use previous sensor experiences as a predictive model for subsequent task executions. We use dynamical systems, named Dynamic Movement Primitives (DMPs), to learn goal-directed behaviors from demonstration. We exploit their dynamic properties by coupling them with the measured and predicted sensor traces. This feedback loop allows for online adaptation of the movement plan. Our system can create a rich set of possible motions that account for external perturbations and perception uncertainty to generate truly robust behaviors. As an example, we present an application to grasping with the WAM robot arm.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2006


no image
Movement generation using dynamical systems : a humanoid robot performing a drumming task

Degallier, S., Santos, C. P., Righetti, L., Ijspeert, A.

In 2006 6th IEEE-RAS International Conference on Humanoid Robots, pages: 512-517, IEEE, Genova, Italy, 2006 (inproceedings)

Abstract
The online generation of trajectories in humanoid robots remains a difficult problem. In this contribution, we present a system that allows the superposition, and the switch between, discrete and rhythmic movements. Our approach uses nonlinear dynamical systems for generating trajectories online and in real time. Our goal is to make use of attractor properties of dynamical systems in order to provide robustness against small perturbations and to enable online modulation of the trajectories. The system is demonstrated on a humanoid robot performing a drumming task.

mg

link (url) DOI [BibTex]

2006


link (url) DOI [BibTex]


no image
Design methodologies for central pattern generators: an application to crawling humanoids

Righetti, L., Ijspeert, A.

In Proceedings of Robotics: Science and Systems, Philadelphia, USA, August 2006 (inproceedings)

mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Programmable central pattern generators: an application to biped locomotion control

Righetti, L., Ijspeert, A.

In Proceedings of the IEEE International Conference on Robotics and Automation, 2006. ICRA 2006., pages: 1585-1590, IEEE, 2006 (inproceedings)

mg

[BibTex]

[BibTex]