

2017


Locomotion of light-driven soft microrobots through a hydrogel via local melting

Palagi, S., Mark, A. G., Melde, K., Qiu, T., Zeng, H., Parmeggiani, C., Martella, D., Wiersma, D. S., Fischer, P.

In 2017 International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS), pages: 1-5, July 2017 (inproceedings)

Abstract
Soft mobile microrobots whose deformation can be directly controlled by an external field can adapt to move in different environments. This is the case for the light-driven microrobots based on liquid-crystal elastomers (LCEs). Here we show that the soft microrobots can move through an agarose hydrogel by means of light-controlled travelling-wave motions. This is achieved by exploiting the inherent rise of the LCE temperature above the melting temperature of the agarose gel, which facilitates penetration of the microrobot through the hydrogel. The locomotion performance is investigated as a function of the travelling-wave parameters, showing that effective propulsion can be obtained by adapting the generated motion to the specific environmental conditions.

pf

DOI [BibTex]



Method for providing a three dimensional body model

Loper, M., Mahmood, N., Black, M.

July 2017, U.S. Patent 9,710,964 B2. (misc)

Abstract
A method for providing a three-dimensional body model which may be applied for an animation, based on a moving body, wherein the method comprises providing a parametric three-dimensional body model, which allows shape and pose variations; applying a standard set of body markers; optimizing the set of body markers by generating an additional set of body markers and applying the same for providing 3D coordinate marker signals for capturing shape and pose of the body and dynamics of soft tissue; and automatically providing an animation by processing the 3D coordinate marker signals in order to provide a personalized three-dimensional body model, based on estimated shape and an estimated pose of the body by means of predicted marker locations.

ps

Google Patents MoSh Project [BibTex]


Non-Equilibrium Assembly of Light-Activated Colloidal Mixtures

Singh, D. P., Choudhury, U., Fischer, P., Mark, A. G.

Advanced Materials, 29(32):1701328, June 2017 (article)

Abstract
The collective phenomena exhibited by artificial active matter systems present novel routes to fabricating out-of-equilibrium microscale assemblies. Here, the crystallization of passive silica colloids into well-controlled 2D assemblies is shown, which is directed by a small number of self-propelled active colloids. The active colloids are titania–silica Janus particles that are propelled when illuminated by UV light. The strength of the attractive interaction and thus the extent of the assembled clusters can be regulated by the light intensity. A remarkably small number of the active colloids is sufficient to induce the assembly of the dynamic crystals. The approach produces rationally designed colloidal clusters and crystals with controllable sizes, shapes, and symmetries. This multicomponent active matter system offers the possibility of obtaining structures and assemblies that cannot be found in equilibrium systems.

pf

link (url) DOI [BibTex]


Nanodiamonds That Swim

Kim, J. T., Choudhury, U., Jeong, H.-H., Fischer, P.

Advanced Materials, 29(30):1701024, June 2017, Back Cover (article)

Abstract
Nanodiamonds are emerging as nanoscale quantum probes for bio-sensing and imaging. This necessitates the development of new methods to accurately manipulate their position and orientation in aqueous solutions. The realization of an “active” nanodiamond (ND) swimmer in fluids, composed of a ND crystal containing nitrogen vacancy centers and a light-driven self-thermophoretic micromotor, is reported. The swimmer is propelled by a local temperature gradient created by laser illumination on its metal-coated side. Its locomotion—from translational to rotational motion—is successfully controlled by shape-dependent hydrodynamic interactions. The precise engineering of the swimmer's geometry is achieved by self-assembly combined with physical vapor shadow growth. The optical addressability of the suspended ND swimmers is demonstrated by observing the electron spin resonance in the presence of magnetic fields. Active motion at the nanoscale enables new sensing capabilities combined with active transport including, potentially, in living organisms.

pf

link (url) DOI [BibTex]



Human Shape Estimation using Statistical Body Models

Loper, M. M.

University of Tübingen, May 2017 (thesis)

Abstract
Human body estimation methods transform real-world observations into predictions about human body state. These estimation methods benefit a variety of health, entertainment, clothing, and ergonomics applications. State may include pose, overall body shape, and appearance. Body state estimation is underconstrained by observations; ambiguity presents itself both in the form of missing data within observations, and also in the form of unknown correspondences between observations. We address this challenge with the use of a statistical body model: a data-driven virtual human. This helps resolve ambiguity in two ways. First, it fills in missing data, meaning that incomplete observations still result in complete shape estimates. Second, the model provides a statistically-motivated penalty for unlikely states, which enables more plausible body shape estimates. Body state inference requires more than a body model; we therefore build observation models whose output is compared with real observations. In this thesis, body state is estimated from three types of observations: 3D motion capture markers, depth and color images, and high-resolution 3D scans. In each case, a forward process is proposed which simulates observations. By comparing observations to the results of the forward process, state can be adjusted to minimize the difference between simulated and observed data. We use gradient-based methods because they are critical to the precise estimation of state with a large number of parameters. The contributions of this work include three parts. First, we propose a method for the estimation of body shape, nonrigid deformation, and pose from 3D markers. Second, we present a concise approach to differentiating through the rendering process, with application to body shape estimation. And finally, we present a statistical body model trained from human body scans, with state-of-the-art fidelity, good runtime performance, and compatibility with existing animation packages.

ps

Official Version [BibTex]


Soft 3D-Printed Phantom of the Human Kidney with Collecting System

Adams, F., Qiu, T., Mark, A., Fritz, B., Kramer, L., Schlager, D., Wetterauer, U., Miernik, A., Fischer, P.

Ann. of Biomed. Eng., 45(4):963-972, April 2017 (article)

Abstract
Organ models are used for planning and simulation of operations, developing new surgical instruments, and training purposes. There is a substantial demand for in vitro organ phantoms, especially in urological surgery. Animal models and existing simulator systems poorly mimic the detailed morphology and the physical properties of human organs. In this paper, we report a novel fabrication process to make a human kidney phantom with realistic anatomical structures and physical properties. The detailed anatomical structure was directly acquired from high resolution CT data sets of human cadaveric kidneys. The soft phantoms were constructed using a novel technique that combines 3D wax printing and polymer molding. Anatomical details and material properties of the phantoms were validated in detail by CT scan, ultrasound, and endoscopy. CT reconstruction, ultrasound examination, and endoscopy showed that the designed phantom mimics a real kidney's detailed anatomy and correctly corresponds to the targeted human cadaver's upper urinary tract. Soft materials with a tensile modulus of 0.8-1.5 MPa as well as biocompatible hydrogels were used to mimic human kidney tissues. We developed a method of constructing 3D organ models from medical imaging data using a 3D wax printing and molding process. This method is a cost-effective means for obtaining a reproducible and robust model suitable for surgical simulation and training purposes.

pf

DOI Project Page [BibTex]



Chapter 8 - Micro- and nanorobots in Newtonian and biological viscoelastic fluids

Palagi, S., (Walker) Schamel, D., Qiu, T., Fischer, P.

In Microbiorobotics, pages: 133-162, Chapter 8, Micro and Nano Technologies, Second edition, Elsevier, Boston, March 2017 (incollection)

Abstract
Swimming microorganisms are a source of inspiration for small scale robots that are intended to operate in fluidic environments including complex biomedical fluids. Nature has devised swimming strategies that are effective at small scales and at low Reynolds number. These include the rotary corkscrew motion that, for instance, propels a flagellated bacterial cell, as well as the asymmetric beat of appendages that sperm cells or ciliated protozoa use to move through fluids. These mechanisms can overcome the reciprocity that governs the hydrodynamics at small scale. The complex molecular structure of biologically important fluids presents an additional challenge for the effective propulsion of microrobots. In this chapter it is shown how physical and chemical approaches are essential in realizing engineered abiotic micro- and nanorobots that can move in biomedically important environments. Interestingly, we also describe a microswimmer that is effective in biological viscoelastic fluids that does not have a natural analogue.

pf

link (url) DOI [BibTex]



Wireless micro-robots for endoscopic applications in urology

Adams, F., Qiu, T., Mark, A. G., Melde, K., Palagi, S., Miernik, A., Fischer, P.

In Eur Urol Suppl, 16(3):e1914, March 2017 (inproceedings)

Abstract
Endoscopy is an essential and common method for both diagnostics and therapy in urology. Current flexible endoscopes are normally cable-driven; they are therefore hard to miniaturize, and their reach is restricted because only one bending section near the tip, with one degree of freedom (DoF), is allowed. Recent progress in micro-robotics offers a unique opportunity for medical inspections in minimally invasive surgery. Micro-robots are active devices that have a feature size smaller than one millimeter and can normally be actuated and controlled wirelessly. Magnetically actuated micro-robots have been demonstrated to propel through biological fluids. Here, we report a novel micro robotic arm, which is actuated wirelessly by ultrasound. It works as a miniaturized endoscope with a side length of ~1 mm, which fits through the 3 Fr. tool channel of a cystoscope, and successfully performs an active cystoscopy in a rabbit bladder.

pf

link (url) DOI [BibTex]


Pattern formation and collective effects in populations of magnetic microswimmers

Vach, P. J., (Walker) Schamel, D., Fischer, P., Fratzl, P., Faivre, D.

J. of Phys. D: Appl. Phys., 50(11):11LT03, February 2017 (article)

Abstract
Self-propelled particles are one prototype of synthetic active matter used to understand complex biological processes, such as the coordination of movement in bacterial colonies or schools of fishes. Collective patterns such as clusters were observed for such systems, reproducing features of biological organization. However, one limitation of this model is that the synthetic assemblies are made of identical individuals. Here we introduce an active system based on magnetic particles at colloidal scales. We use identical but also randomly-shaped magnetic micropropellers and show that they exhibit dynamic and reversible pattern formation.

pf

DOI [BibTex]



On-chip enzymatic microbiofuel cell-powered integrated circuits

Mark, A. G., Suraniti, E., Roche, J., Richter, H., Kuhn, A., Mano, N., Fischer, P.

Lab on a Chip, 17(10):1761-1768, February 2017, Recent HOT Article (article)

Abstract
A variety of diagnostic and therapeutic medical technologies rely on long-term implantation of an electronic device to monitor or regulate a patient's condition. One proposed approach to powering these devices is to use a biofuel cell to convert the chemical energy from blood nutrients into electrical current to supply the electronics. We present here an enzymatic microbiofuel cell whose electrodes are directly integrated into a digital electronic circuit. Glucose oxidizing and oxygen reducing enzymes are immobilized on microelectrodes of an application-specific integrated circuit (ASIC) using redox hydrogels to produce an enzymatic biofuel cell, capable of harvesting electrical power from just a single droplet of 5 mM glucose solution. Optimisation of the fuel cell voltage and power to match the requirements of the electronics allows self-powered operation of the on-board digital circuitry. This study represents a step towards implantable self-powered electronic devices that gather their energy from physiological fluids.


pf

DOI [BibTex]



Strong Rotational Anisotropies Affect Nonlinear Chiral Metamaterials

Hooper, D. C., Mark, A. G., Kuppe, C., Collins, J. T., Fischer, P., Valev, V. K.

Advanced Materials, 29(13):1605110, January 2017 (article)

Abstract
Masked by rotational anisotropies, the nonlinear chiroptical response of a metamaterial is initially completely inaccessible. Upon rotating the sample the chiral information emerges. These results highlight the need for a general method to extract the true chiral contributions to the nonlinear optical signal, which would be hugely valuable in the present context of increasingly complex chiral meta/nanomaterials.

pf

DOI [BibTex]



Early Stopping Without a Validation Set

Mahsereci, M., Balles, L., Lassner, C., Hennig, P.

arXiv preprint arXiv:1703.09580, 2017 (article)

Abstract
Early stopping is a widely used technique to prevent poor generalization performance when training an over-expressive model by means of gradient-based optimization. To find a good point to halt the optimizer, a common practice is to split the dataset into a training and a smaller validation set to obtain an ongoing estimate of the generalization performance. In this paper we propose a novel early stopping criterion which is based on fast-to-compute, local statistics of the computed gradients and entirely removes the need for a held-out validation set. Our experiments show that this is a viable approach in the setting of least-squares and logistic regression as well as neural networks.
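For context, the conventional baseline that the abstract contrasts with (monitoring a held-out validation loss and halting once it stops improving) can be sketched as below. This is a generic illustration, not the criterion proposed in the paper; the function name `early_stopping_epoch` and the parameters `patience` and `min_delta` are assumptions made for this example.

```python
from typing import Iterable, Tuple


def early_stopping_epoch(
    val_losses: Iterable[float],
    patience: int = 3,
    min_delta: float = 0.0,
) -> Tuple[int, float]:
    """Return (best_epoch, best_loss) for a stream of per-epoch validation losses.

    Iteration halts once `patience` consecutive epochs fail to improve on the
    best validation loss seen so far by more than `min_delta`.
    """
    best_epoch, best_loss = -1, float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_epoch, best_loss = epoch, loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # no improvement for `patience` epochs in a row
    return best_epoch, best_loss
```

The validation losses consumed here are exactly the held-out estimate that the proposed criterion is meant to make unnecessary.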

ps pn

link (url) Project Page Project Page [BibTex]


Appealing Avatars from 3D Body Scans: Perceptual Effects of Stylization

Fleming, R., Mohler, B. J., Romero, J., Black, M. J., Breidt, M.

In Computer Vision, Imaging and Computer Graphics Theory and Applications: 11th International Joint Conference, VISIGRAPP 2016, Rome, Italy, February 27 – 29, 2016, Revised Selected Papers, pages: 175-196, Springer International Publishing, 2017 (inbook)

Abstract
Using styles derived from existing popular character designs, we present a novel automatic stylization technique for body shape and colour information based on a statistical 3D model of human bodies. We investigate whether such stylized body shapes result in increased perceived appeal with two different experiments: One focuses on body shape alone, the other investigates the additional role of surface colour and lighting. Our results consistently show that the most appealing avatar is a partially stylized one. Importantly, avatars with high stylization or no stylization at all were rated to have the least appeal. The inclusion of colour information and improvements to render quality had no significant effect on the overall perceived appeal of the avatars, and we observe that the body shape primarily drives the change in appeal ratings. For body scans with colour information, we found that a partially stylized avatar was perceived as most appealing.

ps

publisher site pdf DOI [BibTex]



Learning to Filter Object Detections

Prokudin, S., Kappler, D., Nowozin, S., Gehler, P.

In Pattern Recognition: 39th German Conference, GCPR 2017, Basel, Switzerland, September 12–15, 2017, Proceedings, pages: 52-62, Springer International Publishing, Cham, 2017 (inbook)

Abstract
Most object detection systems consist of three stages. First, a set of individual hypotheses for object locations is generated using a proposal generating algorithm. Second, a classifier scores every generated hypothesis independently to obtain a multi-class prediction. Finally, all scored hypotheses are filtered via a non-differentiable and decoupled non-maximum suppression (NMS) post-processing step. In this paper, we propose a filtering network (FNet), a method which replaces NMS with a differentiable neural network that allows joint reasoning and re-scoring of the generated set of hypotheses per image. This formulation enables end-to-end training of the full object detection pipeline. First, we demonstrate that FNet, a feed-forward network architecture, is able to mimic NMS decisions, despite the sequential nature of NMS. We further analyze NMS failures and propose a loss formulation that is better aligned with the mean average precision (mAP) evaluation metric. We evaluate FNet on several standard detection datasets. Results surpass standard NMS on highly occluded settings of a synthetic overlapping MNIST dataset and show competitive behavior on PascalVOC2007 and KITTI detection benchmarks.
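For readers unfamiliar with the post-processing step that FNet replaces, standard greedy NMS can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' code: boxes are taken to be axis-aligned `(x1, y1, x2, y2)` tuples, and the names `iou`, `greedy_nms`, and `iou_threshold` are chosen for this example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,
) -> List[int]:
    """Return indices of kept boxes, highest score first.

    Each decision depends on the boxes already kept, which is the
    sequential behaviour the abstract refers to.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep box i only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

The hard threshold and the sort-then-filter loop are not differentiable, which is why this step is normally decoupled from training.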

ps

Paper link (url) DOI Project Page [BibTex]



Data-Driven Physics for Human Soft Tissue Animation

Kim, M., Pons-Moll, G., Pujades, S., Bang, S., Kim, J., Black, M. J., Lee, S.

ACM Transactions on Graphics, (Proc. SIGGRAPH), 36(4):54:1-54:12, 2017 (article)

Abstract
Data-driven models of human poses and soft-tissue deformations can produce very realistic results, but they only model the visible surface of the human body and cannot create skin deformation due to interactions with the environment. Physical simulations can generalize to external forces, but their parameters are difficult to control. In this paper, we present a layered volumetric human body model learned from data. Our model is composed of a data-driven inner layer and a physics-based external layer. The inner layer is driven with a volumetric statistical body model (VSMPL). The soft tissue layer consists of a tetrahedral mesh that is driven using the finite element method (FEM). Model parameters, namely the segmentation of the body into layers and the soft tissue elasticity, are learned directly from 4D registrations of humans exhibiting soft tissue deformations. The learned two-layer model is a realistic full-body avatar that generalizes to novel motions and external forces. Experiments show that the resulting avatars produce realistic results on held-out sequences and react to external forces. Moreover, the model supports the retargeting of physical properties from one avatar to another when they share the same topology.

ps

video paper link (url) Project Page [BibTex]



Learning Inference Models for Computer Vision

Jampani, V.

MPI for Intelligent Systems and University of Tübingen, 2017 (phdthesis)

Abstract
Computer vision can be understood as the ability to perform 'inference' on image data. Breakthroughs in computer vision technology are often marked by advances in inference techniques, as even the model design is often dictated by the complexity of inference in them. This thesis proposes learning-based inference schemes and demonstrates applications in computer vision. We propose techniques for inference in both generative and discriminative computer vision models. Despite their intuitive appeal, the use of generative models in vision is hampered by the difficulty of posterior inference, which is often too complex or too slow to be practical. We propose techniques for improving inference in two widely used techniques: Markov Chain Monte Carlo (MCMC) sampling and message-passing inference. Our inference strategy is to learn separate discriminative models that assist Bayesian inference in a generative model. Experiments on a range of generative vision models show that the proposed techniques accelerate the inference process and/or converge to better solutions. A main complication in the design of discriminative models is the inclusion of prior knowledge in a principled way. For better inference in discriminative models, we propose techniques that modify the original model itself, as inference is simple evaluation of the model. We concentrate on convolutional neural network (CNN) models and propose a generalization of standard spatial convolutions, which are the basic building blocks of CNN architectures, to bilateral convolutions. First, we generalize the existing use of bilateral filters and then propose new neural network architectures with learnable bilateral filters, which we call 'Bilateral Neural Networks'. We show how the bilateral filtering modules can be used for modifying existing CNN architectures for better image segmentation and propose a neural network approach for temporal information propagation in videos.
Experiments demonstrate the potential of the proposed bilateral networks on a wide range of vision tasks and datasets. In summary, we propose learning-based techniques for better inference in several computer vision models ranging from inverse graphics to freely parameterized neural networks. In generative vision models, our inference techniques alleviate some of the crucial hurdles in Bayesian posterior inference, paving new ways for the use of model-based machine learning in vision. In discriminative CNN models, the proposed filter generalizations aid in the design of new neural network architectures that can handle sparse high-dimensional data as well as provide a way for incorporating prior knowledge into CNNs.

ps

pdf [BibTex]



Sparse Inertial Poser: Automatic 3D Human Pose Estimation from Sparse IMUs

(Best Paper, Eurographics 2017)

Marcard, T. V., Rosenhahn, B., Black, M., Pons-Moll, G.

Computer Graphics Forum 36(2), Proceedings of the 38th Annual Conference of the European Association for Computer Graphics (Eurographics), pages: 349-360, 2017 (article)

Abstract
We address the problem of making human motion capture in the wild more practical by using a small set of inertial sensors attached to the body. Since the problem is heavily under-constrained, previous methods either use a large number of sensors, which is intrusive, or they require additional video input. We take a different approach and constrain the problem by: (i) making use of a realistic statistical body model that includes anthropometric constraints and (ii) using a joint optimization framework to fit the model to orientation and acceleration measurements over multiple frames. The resulting tracker, Sparse Inertial Poser (SIP), enables motion capture using only 6 sensors (attached to the wrists, lower legs, back and head) and works for arbitrary human motions. Experiments on the recently released TNT15 dataset show that, using the same number of sensors, SIP achieves higher accuracy than the dataset baseline without using any video data. We further demonstrate the effectiveness of SIP on newly recorded challenging motions in outdoor scenarios such as climbing or jumping over a wall.

ps

video pdf Project Page [BibTex]



Efficient 2D and 3D Facade Segmentation using Auto-Context

Gadde, R., Jampani, V., Marlet, R., Gehler, P.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017 (article)

Abstract
This paper introduces a fast and efficient segmentation technique for 2D images and 3D point clouds of building facades. Facades of buildings are highly structured and consequently most methods that have been proposed for this problem aim to make use of this strong prior information. Contrary to most prior work, we describe a system that is almost domain independent and consists of standard segmentation methods. We train a sequence of boosted decision trees using auto-context features. This is learned using stacked generalization. We find that this technique performs better than, or comparably to, all previously published methods and present empirical results on all available 2D and 3D facade benchmark datasets. The proposed method is simple to implement, easy to extend, and very efficient at test-time inference.

ps

arXiv Project Page [BibTex]



ClothCap: Seamless 4D Clothing Capture and Retargeting

Pons-Moll, G., Pujades, S., Hu, S., Black, M.

ACM Transactions on Graphics, (Proc. SIGGRAPH), 36(4):73:1-73:15, ACM, New York, NY, USA, 2017, Two first authors contributed equally (article)

Abstract
Designing and simulating realistic clothing is challenging and, while several methods have addressed the capture of clothing from 3D scans, previous methods have been limited to single garments and simple motions, lack detail, or require specialized texture patterns. Here we address the problem of capturing regular clothing on fully dressed people in motion. People typically wear multiple pieces of clothing at a time. To estimate the shape of such clothing, track it over time, and render it believably, each garment must be segmented from the others and the body. Our ClothCap approach uses a new multi-part 3D model of clothed bodies, automatically segments each piece of clothing, estimates the naked body shape and pose under the clothing, and tracks the 3D deformations of the clothing over time. We estimate the garments and their motion from 4D scans; that is, high-resolution 3D scans of the subject in motion at 60 fps. The model allows us to capture a clothed person in motion, extract their clothing, and retarget the clothing to new body shapes. ClothCap provides a step towards virtual try-on with a technology for capturing, modeling, and analyzing clothing in motion.

ps

video project_page paper link (url) DOI Project Page Project Page [BibTex]



Towards Accurate Marker-less Human Shape and Pose Estimation over Time

Huang, Y., Bogo, F., Lassner, C., Kanazawa, A., Gehler, P. V., Romero, J., Akhter, I., Black, M. J.

In International Conference on 3D Vision (3DV), pages: 421-430, 2017 (inproceedings)

Abstract
Existing markerless motion capture methods often assume known backgrounds, static cameras, and sequence specific motion priors, limiting their application scenarios. Here we present a fully automatic method that, given multiview videos, estimates 3D human pose and body shape. We take the recently proposed SMPLify method [12] as the base method and extend it in several ways. First we fit a 3D human body model to 2D features detected in multi-view images. Second, we use a CNN method to segment the person in each image and fit the 3D body model to the contours, further improving accuracy. Third we utilize a generic and robust DCT temporal prior to handle the left and right side swapping issue sometimes introduced by the 2D pose estimator. Validation on standard benchmarks shows our results are comparable to the state of the art and also provide a realistic 3D shape avatar. We also demonstrate accurate results on HumanEva and on challenging monocular sequences of dancing from YouTube.

ps

Code pdf DOI Project Page [BibTex]


Decentralized Simultaneous Multi-target Exploration using a Connected Network of Multiple Robots

Nestmeyer, T., Robuffo Giordano, P., Bülthoff, H. H., Franchi, A.

In Autonomous Robots, pages: 989-1011, 2017 (incollection)

ps

[BibTex]



Capturing Hand-Object Interaction and Reconstruction of Manipulated Objects

Tzionas, D.

University of Bonn, 2017 (phdthesis)

Abstract
Hand motion capture with an RGB-D sensor has recently gained a lot of research attention; however, even the most recent approaches focus on the case of a single isolated hand. We focus instead on hands that interact with other hands or with a rigid or articulated object. Our framework successfully captures motion in such scenarios by combining a generative model with discriminatively trained salient points, collision detection and physics simulation to achieve a low tracking error with physically plausible poses. All components are unified in a single objective function that can be optimized with standard optimization techniques. We initially assume a priori knowledge of the object's shape and skeleton. In the case of unknown object shape, there are existing 3D reconstruction methods that capitalize on distinctive geometric or texture features. These methods, though, fail for textureless and highly symmetric objects like household articles, mechanical parts or toys. We show that extracting 3D hand motion for in-hand scanning effectively facilitates the reconstruction of such objects and we fuse the rich additional information of hands into a 3D reconstruction pipeline. Finally, although shape reconstruction is enough for rigid objects, there is a lack of tools that build rigged models of articulated objects that deform realistically using RGB-D data. We propose a method that creates a fully rigged model consisting of a watertight mesh, embedded skeleton and skinning weights by employing a combination of deformable mesh tracking, motion segmentation based on spectral clustering and skeletonization based on mean curvature flow.

ps

Thesis link (url) Project Page [BibTex]

2016


Skinned multi-person linear model

Black, M.J., Loper, M., Mahmood, N., Pons-Moll, G., Romero, J.

December 2016, Application PCT/EP2016/064610 (misc)

Abstract
The invention comprises a learned model of human body shape and pose-dependent shape variation that is more accurate than previous models and is compatible with existing graphics pipelines. Our Skinned Multi-Person Linear model (SMPL) is a skinned vertex based model that accurately represents a wide variety of body shapes in natural human poses. The parameters of the model are learned from data including the rest pose template, blend weights, pose-dependent blend shapes, identity-dependent blend shapes, and a regressor from vertices to joint locations. Unlike previous models, the pose-dependent blend shapes are a linear function of the elements of the pose rotation matrices. This simple formulation enables training the entire model from a relatively large number of aligned 3D meshes of different people in different poses. The invention quantitatively evaluates variants of SMPL using linear or dual-quaternion blend skinning and shows that both are more accurate than a Blend SCAPE model trained on the same data. In a further embodiment, the invention realistically models dynamic soft-tissue deformations. Because it is based on blend skinning, SMPL is compatible with existing rendering engines and we make it available for research purposes.

ps

Google Patents [BibTex]



Wireless actuation with functional acoustic surfaces

Qiu, T., Palagi, S., Mark, A. G., Melde, K., Adams, F., Fischer, P.

Appl. Phys. Lett., 109(19):191602, November 2016, APL Editor's pick. APL News. (article)

Abstract
Miniaturization calls for micro-actuators that can be powered wirelessly and addressed individually. Here, we develop functional surfaces consisting of arrays of acoustically resonant microcavities, and we demonstrate their application as two-dimensional wireless actuators. When remotely powered by an acoustic field, the surfaces provide highly directional propulsive forces in fluids through acoustic streaming. A maximal force of ~0.45 mN is measured on a 4 × 4 mm² functional surface. The response of the surfaces with bubbles of different sizes is characterized experimentally. This shows a marked peak around the micro-bubbles' resonance frequency, as estimated by both an analytical model and numerical simulations. The strong frequency dependence can be exploited to address different surfaces with different acoustic frequencies, thus achieving wireless actuation with multiple degrees of freedom. The use of the functional surfaces as wireless ready-to-attach actuators is demonstrated by implementing a wireless and bidirectional miniaturized rotary motor, which is 2.6 × 2.6 × 5 mm³ in size and generates a stall torque of ~0.5 mN·mm. The adoption of micro-structured surfaces as wireless actuators opens new possibilities in the development of miniaturized devices and tools for fluidic environments that are accessible by low intensity ultrasound fields.

pf

link (url) DOI Project Page [BibTex]



Nanomotors

Alarcon-Correa, M., Walker (Schamel), D., Qiu, T., Fischer, P.

Eur. Phys. J.-Special Topics, 225(11-12):2241-2254, November 2016 (article)

Abstract
This minireview discusses whether catalytically active macromolecules and abiotic nanocolloids that are smaller than motile bacteria can self-propel. Kinematic reversibility at low Reynolds number demands that self-propelling colloids must break symmetry. Methods that permit the synthesis and fabrication of Janus nanocolloids are therefore briefly surveyed, as well as means that permit the analysis of the nanocolloids' motion. Finally, recent work is reviewed which shows that nanoagents are small enough to penetrate the complex inhomogeneous polymeric network of biological fluids and gels, which exhibit diverse rheological behaviors.

pf

DOI [BibTex]



Creating body shapes from verbal descriptions by linking similarity spaces

Hill, M. Q., Streuber, S., Hahn, C. A., Black, M. J., O’Toole, A. J.

Psychological Science, 27(11):1486-1497, November 2016, (article)

Abstract
Brief verbal descriptions of bodies (e.g. curvy, long-legged) can elicit vivid mental images. The ease with which we create these mental images belies the complexity of three-dimensional body shapes. We explored the relationship between body shapes and body descriptions and show that a small number of words can be used to generate categorically accurate representations of three-dimensional bodies. The dimensions of body shape variation that emerged in a language-based similarity space were related to major dimensions of variation computed directly from three-dimensional laser scans of 2094 bodies. This allowed us to generate three-dimensional models of people in the shape space using only their coordinates on analogous dimensions in the language-based description space. Human descriptions of photographed bodies and their corresponding models matched closely. The natural mapping between the spaces illustrates the role of language as a concise code for body shape, capturing perceptually salient global and local body features.

ps

pdf [BibTex]



Structured light enables biomimetic swimming and versatile locomotion of photoresponsive soft microrobots

Palagi, S., Mark, A. G., Reigh, S. Y., Melde, K., Qiu, T., Zeng, H., Parmeggiani, C., Martella, D., Sanchez-Castillo, A., Kapernaum, N., Giesselmann, F., Wiersma, D. S., Lauga, E., Fischer, P.

Nature Materials, 15(6):647–653, November 2016, Max Planck press release, Nature News & Views. (article)

Abstract
Microorganisms move in challenging environments by periodic changes in body shape. In contrast, current artificial microrobots cannot actively deform, exhibiting at best passive bending under external fields. Here, by taking advantage of the wireless, scalable and spatiotemporally selective capabilities that light allows, we show that soft microrobots consisting of photoactive liquid-crystal elastomers can be driven by structured monochromatic light to perform sophisticated biomimetic motions. We realize continuum yet selectively addressable artificial microswimmers that generate travelling-wave motions to self-propel without external forces or torques, as well as microrobots capable of versatile locomotion behaviours on demand. Both theoretical predictions and experimental results confirm that multiple gaits, mimicking either symplectic or antiplectic metachrony of ciliate protozoa, can be achieved with single microswimmers. The principle of using structured light can be extended to other applications that require microscale actuation with sophisticated spatiotemporal coordination for advanced microrobotic technologies.

pf

Video - Soft photo Micro-Swimmer DOI [BibTex]



Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image

Bogo, F., Kanazawa, A., Lassner, C., Gehler, P., Romero, J., Black, M. J.

In Computer Vision – ECCV 2016, pages: 561-578, Lecture Notes in Computer Science, Springer International Publishing, 14th European Conference on Computer Vision, October 2016 (inproceedings)

Abstract
We describe the first method to automatically estimate the 3D pose of the human body as well as its 3D shape from a single unconstrained image. We estimate a full 3D mesh and show that 2D joints alone carry a surprising amount of information about body shape. The problem is challenging because of the complexity of the human body, articulation, occlusion, clothing, lighting, and the inherent ambiguity in inferring 3D from 2D. To solve this, we first use a recently published CNN-based method, DeepCut, to predict (bottom-up) the 2D body joint locations. We then fit (top-down) a recently published statistical body shape model, called SMPL, to the 2D joints. We do so by minimizing an objective function that penalizes the error between the projected 3D model joints and detected 2D joints. Because SMPL captures correlations in human shape across the population, we are able to robustly fit it to very little data. We further leverage the 3D model to prevent solutions that cause interpenetration. We evaluate our method, SMPLify, on the Leeds Sports, HumanEva, and Human3.6M datasets, showing superior pose accuracy with respect to the state of the art.
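The data term described above, an error between projected 3D model joints and detected 2D joints, might be sketched as follows. This is a minimal illustration only: the pinhole camera, the squared-error penalty, the per-joint weights, and all function and parameter names are assumptions, and the other terms of the full objective (for example the interpenetration penalty) are omitted.

```python
def project(point, focal, center):
    """Pinhole projection of a 3D point (x, y, z) with z > 0 to image coordinates."""
    x, y, z = point
    return (focal * x / z + center[0], focal * y / z + center[1])

def joint_reprojection_error(joints_3d, joints_2d, focal=1000.0,
                             center=(0.0, 0.0), weights=None):
    """Weighted sum of squared distances between projected and detected joints.

    joints_3d: list of (x, y, z) model joints in camera coordinates.
    joints_2d: list of (u, v) detected joints, in the same order.
    weights:   optional per-joint confidences (hypothetical; defaults to 1).
    """
    if weights is None:
        weights = [1.0] * len(joints_3d)
    total = 0.0
    for p3, p2, w in zip(joints_3d, joints_2d, weights):
        u, v = project(p3, focal, center)
        total += w * ((u - p2[0]) ** 2 + (v - p2[1]) ** 2)
    return total
```

In a fitting loop, a function like this would be evaluated on the joints produced by the body model for the current pose and shape parameters and minimized with a generic optimizer.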

ps

pdf Video Sup Mat video Code Project Project Page [BibTex]



Superpixel Convolutional Networks using Bilateral Inceptions

Gadde, R., Jampani, V., Kiefel, M., Kappler, D., Gehler, P.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, Springer, 14th European Conference on Computer Vision, October 2016 (inproceedings)

Abstract
In this paper we propose a CNN architecture for semantic image segmentation. We introduce a new “bilateral inception” module that can be inserted in existing CNN architectures and performs bilateral filtering, at multiple feature-scales, between superpixels in an image. The feature spaces for bilateral filtering and other parameters of the module are learned end-to-end using standard backpropagation techniques. The bilateral inception module addresses two issues that arise with general CNN segmentation architectures. First, this module propagates information between (super) pixels while respecting image edges, thus using the structured information of the problem for improved results. Second, the layer recovers a full resolution segmentation result from the lower resolution solution of a CNN. In the experiments, we modify several existing CNN architectures by inserting our inception modules between the last CNN (1 × 1 convolution) layers. Empirical results on three different datasets show reliable improvements not only in comparison to the baseline networks, but also in comparison to several dense-pixel prediction techniques such as CRFs, while being competitive in time.

am ps

pdf supplementary poster Project Page Project Page [BibTex]



Barrista - Caffe Well-Served

Lassner, C., Kappler, D., Kiefel, M., Gehler, P.

In ACM Multimedia Open Source Software Competition, ACM OSSC16, October 2016 (inproceedings)

Abstract
The caffe framework is one of the leading deep learning toolboxes in the machine learning and computer vision community. While it offers efficiency and configurability, it falls short of a full interface to Python. With increasingly involved procedures for training deep networks and reaching depths of hundreds of layers, creating configuration files and keeping them consistent becomes an error-prone process. We introduce the barrista framework, offering full, pythonic control over caffe. It separates responsibilities and offers code to solve frequently occurring tasks for pre-processing, training and model inspection. It is compatible with all caffe versions since mid 2015 and can import and export .prototxt files. Examples are included, e.g., a deep residual network implemented in only 172 lines (for arbitrary depths), compared to 2320 lines in the official implementation for the equivalent model.

am ps

pdf link (url) DOI Project Page [BibTex]



Capture of 2D Microparticle Arrays via a UV-Triggered Thiol-yne “Click” Reaction

Walker (Schamel), D., Singh, D. P., Fischer, P.

Advanced Materials, 28(44):9846-9850, September 2016 (article)

Abstract
Immobilization of colloidal assemblies onto solid supports via a fast UV-triggered click-reaction is achieved. Transient assemblies of microparticles and colloidal materials can be captured and transferred to solid supports. The technique does not require complex reaction conditions, and is compatible with a variety of particle assembly methods.

pf

DOI [BibTex]


Magnesium plasmonics for UV applications and chiral sensing

Jeong, H. H., Mark, A. G., Fischer, P.

Chem. Comm., 52(82):12179-12182, September 2016 (article)

Abstract
We demonstrate that chiral magnesium nanoparticles show remarkable plasmonic extinction and chiroptical effects in the ultraviolet region. The Mg nanohelices possess an enhanced local surface plasmon resonance (LSPR) sensitivity due to the strong dispersion of most substances in the UV region.

pf

DOI [BibTex]



Holograms for acoustics

Melde, K., Mark, A. G., Qiu, T., Fischer, P.

Nature, 537, pages: 518-522, September 2016, Max Planck press release, Nature News & Views, Nature Video. (article)

Abstract
Holographic techniques are fundamental to applications such as volumetric displays(1), high-density data storage and optical tweezers that require spatial control of intricate optical(2) or acoustic fields(3,4) within a three-dimensional volume. The basis of holography is spatial storage of the phase and/or amplitude profile of the desired wavefront(5,6) in a manner that allows that wavefront to be reconstructed by interference when the hologram is illuminated with a suitable coherent source. Modern computer-generated holography(7) skips the process of recording a hologram from a physical scene, and instead calculates the required phase profile before rendering it for reconstruction. In ultrasound applications, the phase profile is typically generated by discrete and independently driven ultrasound sources(3,4,8-12); however, these can only be used in small numbers, which limits the complexity or degrees of freedom that can be attained in the wavefront. Here we introduce monolithic acoustic holograms, which can reconstruct diffraction-limited acoustic pressure fields and thus arbitrary ultrasound beams. We use rapid fabrication to craft the holograms and achieve reconstruction degrees of freedom two orders of magnitude higher than commercial phased array sources. The technique is inexpensive, appropriate for both transmission and reflection elements, and scales well to higher information content, larger aperture size and higher power. The complex three-dimensional pressure and phase distributions produced by these acoustic holograms allow us to demonstrate new approaches to controlled ultrasonic manipulation of solids in water, and of liquids and solids in air. We expect that acoustic holograms will enable new capabilities in beam-steering and the contactless transfer of power, improve medical imaging, and drive new applications of ultrasound.

pf

Video - Holograms for Sound DOI Project Page [BibTex]



A loop-gap resonator for chirality-sensitive nuclear magneto-electric resonance (NMER)

Garbacz, P., Fischer, P., Kraemer, S.

J. Chem. Phys., 145(10):104201, September 2016 (article)

Abstract
Direct detection of molecular chirality is practically impossible by standard nuclear magnetic resonance (NMR) methods, which are based on interactions involving magnetic-dipole and magnetic-field operators. However, theoretical studies provide a possible direct probe of chirality by exploiting an enantiomer-selective additional coupling involving magnetic-dipole, magnetic-field, and electric-field operators. This offers a way for direct experimental detection of chirality by nuclear magneto-electric resonance (NMER). This method uses both resonant magnetic and electric radiofrequency (RF) fields. The weakness of the chiral interaction, though, requires a large electric RF field and a small transverse RF magnetic field over the sample volume, which is a non-trivial constraint. Here, we present a detailed study of the NMER concept and a possible experimental realization based on a loop-gap resonator. For this original device, the basic principle and numerical studies as well as fabrication and measurements of the frequency dependence of the scattering parameter are reported. By simulating the NMER spin dynamics for our device and taking the F-19 NMER signal of enantiomer-pure 1,1,1-trifluoropropan-2-ol, we predict a chirality-induced NMER signal that accounts for 1%-5% of the standard achiral NMR signal.

pf

DOI [BibTex]



Soft continuous microrobots with multiple intrinsic degrees of freedom

Palagi, S., Mark, A. G., Melde, K., Zeng, H., Parmeggiani, C., Martella, D., Wiersma, D. S., Fischer, P.

In 2016 International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS), pages: 1-5, July 2016 (inproceedings)

Abstract
One of the main challenges in the development of microrobots, i.e. robots at the sub-millimeter scale, is the difficulty of adopting traditional solutions for power, control and, especially, actuation. As a result, most current microrobots are directly manipulated by external fields, and possess only a few passive degrees of freedom (DOFs). We have reported a strategy that enables embodiment, remote powering and control of a large number of DOFs in mobile soft microrobots. These consist of photo-responsive materials, such that the actuation of their soft continuous body can be selectively and dynamically controlled by structured light fields. Here we use finite-element modelling to evaluate the effective number of DOFs that are addressable in our microrobots. We also demonstrate that by this flexible approach different actuation patterns can be obtained, and thus different locomotion performances can be achieved within the very same microrobot. The reported results confirm the versatility of the proposed approach, which allows for easy application-specific optimization and online reconfiguration of the microrobot's behavior. Such versatility will enable advanced applications of robotics and automation at the micro scale.

pf

DOI [BibTex]



Non-parametric Models for Structured Data and Applications to Human Bodies and Natural Scenes

Lehrmann, A.

ETH Zurich, July 2016 (phdthesis)

Abstract
The purpose of this thesis is the study of non-parametric models for structured data and their fields of application in computer vision. We aim at the development of context-sensitive architectures which are both expressive and efficient. Our focus is on directed graphical models, in particular Bayesian networks, where we combine the flexibility of non-parametric local distributions with the efficiency of a global topology with bounded treewidth. A bound on the treewidth is obtained by either constraining the maximum indegree of the underlying graph structure or by introducing determinism. The non-parametric distributions in the nodes of the graph are given by decision trees or kernel density estimators. The information flow implied by specific network topologies, especially the resultant (conditional) independencies, allows for a natural integration and control of contextual information. We distinguish between three different types of context: static, dynamic, and semantic. In four different approaches we propose models which exhibit varying combinations of these contextual properties and allow modeling of structured data in space, time, and hierarchies derived thereof. The generative character of the presented models enables a direct synthesis of plausible hypotheses. Extensive experiments validate the developed models in two application scenarios which are of particular interest in computer vision: human bodies and natural scenes. In the practical sections of this work we discuss both areas from different angles and show applications of our models to human pose, motion, and segmentation as well as object categorization and localization. Here, we benefit from the availability of modern datasets of unprecedented size and diversity. Comparisons to traditional approaches and state-of-the-art research on the basis of well-established evaluation criteria allow the objective assessment of our contributions.

ps

pdf [BibTex]


Active Nanorheology with Plasmonics

Jeong, H. H., Mark, A. G., Lee, T., Alarcon-Correa, M., Eslami, S., Qiu, T., Gibbs, J. G., Fischer, P.

Nano Letters, 16(8):4887-4894, July 2016 (article)

Abstract
Nanoplasmonic systems are valued for their strong optical response and their small size. Most plasmonic sensors and systems to date have been rigid and passive. However, rendering these structures dynamic opens new possibilities for applications. Here we demonstrate that dynamic plasmonic nanoparticles can be used as mechanical sensors to selectively probe the rheological properties of a fluid in situ at the nanoscale and in microscopic volumes. We fabricate chiral magneto-plasmonic nanocolloids that can be actuated by an external magnetic field, which in turn allows for the direct and fast modulation of their distinct optical response. The method is robust and allows nanorheological measurements with a mechanical sensitivity of ~0.1 cP, even in strongly absorbing fluids with an optical density of up to OD ~3 (~0.1% light transmittance) and in the presence of scatterers (e.g., 50% v/v red blood cells).

pf

DOI [BibTex]



Wireless actuator based on ultrasonic bubble streaming

Qiu, T., Palagi, S., Mark, A. G., Melde, K., Fischer, P.

In 2016 International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS), pages: 1-5, July 2016 (inproceedings)

Abstract
Miniaturized actuators are a key element for manipulation and automation at small scales. Here, we propose a new miniaturized actuator, which consists of an array of micro gas bubbles immersed in a fluid. Under ultrasonic excitation, the oscillation of micro gas bubbles results in acoustic streaming and provides a propulsive force that drives the actuator. The actuator was fabricated by lithography and fluidic streaming was observed under ultrasound excitation. Theoretical modelling and numerical simulations were carried out to show that lowering the surface tension results in a larger amplitude of the bubble oscillation, and thus leads to a higher propulsive force. Experimental results also demonstrate that the propulsive force increases 3.5 times when the surface tension is lowered by adding a surfactant. An actuator with a 4 × 4 mm² surface area provides a driving force of about 0.46 mN, suggesting that it could be used as a wireless actuator for small-scale robots and medical instruments.

pf

link (url) DOI [BibTex]



Body Talk: Crowdshaping Realistic 3D Avatars with Words

Streuber, S., Quiros-Ramirez, M. A., Hill, M. Q., Hahn, C. A., Zuffi, S., O’Toole, A., Black, M. J.

ACM Trans. Graph. (Proc. SIGGRAPH), 35(4):54:1-54:14, July 2016 (article)

Abstract
Realistic, metrically accurate, 3D human avatars are useful for games, shopping, virtual reality, and health applications. Such avatars are not in wide use because solutions for creating them from high-end scanners, low-cost range cameras, and tailoring measurements all have limitations. Here we propose a simple solution and show that it is surprisingly accurate. We use crowdsourcing to generate attribute ratings of 3D body shapes corresponding to standard linguistic descriptions of 3D shape. We then learn a linear function relating these ratings to 3D human shape parameters. Given an image of a new body, we again turn to the crowd for ratings of the body shape. The collection of linguistic ratings of a photograph provides remarkably strong constraints on the metric 3D shape. We call the process crowdshaping and show that our Body Talk system produces shapes that are perceptually indistinguishable from bodies created from high-resolution scans and that the metric accuracy is sufficient for many tasks. This makes body “scanning” practical without a scanner, opening up new applications including database search, visualization, and extracting avatars from books.
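The step of learning a linear function from attribute ratings to 3D shape parameters can be sketched as an ordinary least-squares fit. The function names, the bias column, and the array shapes below are assumptions for illustration; the abstract does not specify how the linear function is estimated.

```python
import numpy as np

def fit_ratings_to_shape(ratings, shapes):
    """Fit shapes ≈ [ratings, 1] @ W by least squares.

    ratings: (N, A) mean crowd ratings for N bodies and A linguistic attributes.
    shapes:  (N, S) shape parameters of the same N bodies.
    Returns W with shape (A + 1, S); the last row acts as a bias term.
    """
    X = np.hstack([ratings, np.ones((ratings.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, shapes, rcond=None)
    return W

def predict_shape(W, rating_vector):
    """Predict shape parameters for one new body from its attribute ratings."""
    x = np.append(np.asarray(rating_vector, dtype=float), 1.0)
    return x @ W
```

Given ratings collected for a photograph of a new body, `predict_shape` would return shape parameters that a body model could then turn into a mesh.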

ps

pdf web tool video talk (ppt) [BibTex]



DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation

Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P., Schiele, B.

In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 4929-4937, IEEE, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
This paper considers the task of articulated human pose estimation of multiple people in real-world images. We propose an approach that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity to each other. This joint formulation is in contrast to previous strategies that address the problem by first detecting people and subsequently estimating their body pose. We propose a partitioning and labeling formulation of a set of body-part hypotheses generated with CNN-based part detectors. Our formulation, an instance of an integer linear program, implicitly performs non-maximum suppression on the set of part candidates and groups them to form configurations of body parts respecting geometric and appearance constraints. Experiments on four different datasets demonstrate state-of-the-art results for both single person and multi person pose estimation.

ps

code pdf supplementary DOI Project Page [BibTex]



Video segmentation via object flow

Tsai, Y., Yang, M., Black, M. J.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
Video object segmentation is challenging due to fast moving objects, deforming shapes, and cluttered backgrounds. Optical flow can be used to propagate an object segmentation over time but, unfortunately, flow is often inaccurate, particularly around object boundaries. Such boundaries are precisely where we want our segmentation to be accurate. To obtain accurate segmentation across time, we propose an efficient algorithm that considers video segmentation and optical flow estimation simultaneously. For video segmentation, we formulate a principled, multiscale, spatio-temporal objective function that uses optical flow to propagate information between frames. For optical flow estimation, particularly at object boundaries, we compute the flow independently in the segmented regions and recompose the results. We call the process object flow and demonstrate the effectiveness of jointly optimizing optical flow and video segmentation using an iterative scheme. Experiments on the SegTrack v2 and Youtube-Objects datasets show that the proposed algorithm performs favorably against the other state-of-the-art methods.

ps

pdf [BibTex]



Patches, Planes and Probabilities: A Non-local Prior for Volumetric 3D Reconstruction

Ulusoy, A. O., Black, M. J., Geiger, A.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
In this paper, we propose a non-local structured prior for volumetric multi-view 3D reconstruction. Towards this goal, we present a novel Markov random field model based on ray potentials in which assumptions about large 3D surface patches such as planarity or Manhattan world constraints can be efficiently encoded as probabilistic priors. We further derive an inference algorithm that reasons jointly about voxels, pixels and image segments, and estimates marginal distributions of appearance, occupancy, depth, normals and planarity. Key to tractable inference is a novel hybrid representation that spans both voxel and pixel space and that integrates non-local information from 2D image segmentations in a principled way. We compare our non-local prior to commonly employed local smoothness assumptions and a variety of state-of-the-art volumetric reconstruction baselines on challenging outdoor scenes with textureless and reflective surfaces. Our experiments indicate that regularizing over larger distances has the potential to resolve ambiguities where local regularizers fail.

avg ps

YouTube pdf poster suppmat Project Page [BibTex]



Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

Tzionas, D., Ballan, L., Srikantha, A., Aponte, P., Pollefeys, M., Gall, J.

International Journal of Computer Vision (IJCV), 118(2):172-193, June 2016 (article)

Abstract
Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even the most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error, and with collision detection and physics simulation to achieve physically plausible estimates even in the case of occlusions and missing visual data. Since all components are unified in a single objective function which is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras. For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom.

ps

Website pdf link (url) DOI Project Page [BibTex]



Optical Flow with Semantic Segmentation and Localized Layers

Sevilla-Lara, L., Sun, D., Jampani, V., Black, M. J.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 3889-3898, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
Existing optical flow methods make generic, spatially homogeneous, assumptions about the spatial structure of the flow. In reality, optical flow varies across an image depending on object class. Simply put, different objects move differently. Here we exploit recent advances in static semantic scene segmentation to segment the image into objects of different types. We define different models of image motion in these regions depending on the type of object. For example, we model the motion on roads with homographies, vegetation with spatially smooth flow, and independently moving objects like cars and planes with affine motion plus deviations. We then pose the flow estimation problem using a novel formulation of localized layers, which addresses limitations of traditional layered models for dealing with complex scene motion. Our semantic flow method achieves the lowest error of any published monocular method in the KITTI-2015 flow benchmark and produces qualitatively better flow and segmentation than recent top methods on a wide range of natural videos.

ps

video Kitti Precomputed Data (1.6GB) pdf YouTube Sequences Code Project Page Project Page [BibTex]



Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks

Jampani, V., Kiefel, M., Gehler, P. V.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 4452-4461, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
Bilateral filters are in widespread use due to their edge-preserving properties. The common use case is to manually choose a parametric filter type, usually a Gaussian filter. In this paper, we generalize the parametrization and in particular derive a gradient descent algorithm so the filter parameters can be learned from data. This derivation allows learning high-dimensional linear filters that operate in sparsely populated feature spaces. We build on the permutohedral lattice construction for efficient filtering. The ability to learn more general forms of high-dimensional filters can be used in several diverse applications. First, we demonstrate the use in applications where single filter applications are desired for runtime reasons. Further, we show how this algorithm can be used to learn the pairwise potentials in densely connected conditional random fields and apply these to different image segmentation tasks. Finally, we introduce layers of bilateral filters in CNNs and propose bilateral neural networks for the use of high-dimensional sparse data. This view provides new ways to encode model structure into network architectures. A diverse set of experiments empirically validates the usage of general forms of filters.
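For readers unfamiliar with the hand-chosen Gaussian case that this work generalizes, a one-dimensional bilateral filter can be sketched in a few lines. This is a brute-force illustration with assumed parameter names and defaults; it does not use the permutohedral lattice and does not learn any filter parameters.

```python
import math

def bilateral_filter_1d(signal, sigma_space=1.0, sigma_value=0.1, radius=2):
    """Brute-force 1D bilateral filter with fixed Gaussian kernels.

    Each output sample is a weighted average of its neighbours, where the
    weight falls off with both spatial distance and difference in value,
    so samples across a sharp edge contribute almost nothing.
    """
    out = []
    n = len(signal)
    for i, center in enumerate(signal):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2)
                         - ((signal[j] - center) ** 2) / (2 * sigma_value ** 2))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out
```

On a step signal such as `[0, 0, 0, 1, 1, 1]` the step stays sharp, whereas a plain Gaussian blur of the same width would smear it.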

ps

project page code CVF open-access pdf supplementary poster Project Page Project Page [BibTex]



Occlusion boundary detection via deep exploration of context

Fu, H., Wang, C., Tao, D., Black, M. J.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
Occlusion boundaries contain rich perceptual information about the underlying scene structure. They also provide important cues in many visual perception tasks such as scene understanding, object recognition, and segmentation. In this paper, we improve occlusion boundary detection via enhanced exploration of contextual information (e.g., local structural boundary patterns, observations from surrounding regions, and temporal context), and in doing so develop a novel approach based on convolutional neural networks (CNNs) and conditional random fields (CRFs). Experimental results demonstrate that our detector significantly outperforms the state-of-the-art (e.g., improving the F-measure from 0.62 to 0.71 on the commonly used CMU benchmark). Last but not least, we empirically assess the roles of several important components of the proposed detector, so as to validate the rationale behind this approach.

ps

pdf [BibTex]



Semantic Instance Annotation of Street Scenes by 3D to 2D Label Transfer

Xie, J., Kiefel, M., Sun, M., Geiger, A.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
Semantic annotations are vital for training models for object recognition, semantic segmentation or scene understanding. Unfortunately, pixelwise annotation of images at very large scale is labor-intensive, and little labeled data is available, particularly at instance level and for street scenes. In this paper, we propose to tackle this problem by lifting the semantic instance labeling task from 2D into 3D. Given reconstructions from stereo or laser data, we annotate static 3D scene elements with rough bounding primitives and develop a probabilistic model which transfers this information into the image domain. We leverage our method to obtain 2D labels for a novel suburban video dataset which we have collected, resulting in 400k semantic and instance image annotations. A comparison of our method to state-of-the-art label transfer baselines reveals that 3D information enables more efficient annotation while at the same time resulting in improved accuracy and time-coherent labels.

avg ps

pdf suppmat Project Page Project Page [BibTex]



Auxetic Metamaterial Simplifies Soft Robot Design

Mark, A. G., Palagi, S., Qiu, T., Fischer, P.

In 2016 IEEE Int. Conf. on Robotics and Automation (ICRA), pages: 4951-4956, May 2016 (inproceedings)

Abstract
Soft materials are being adopted in robotics in order to facilitate biomedical applications and in order to achieve simpler and more capable robots. One route to simplification is to design the robot's body using 'smart materials' that carry the burden of control and actuation. Metamaterials enable just such rational design of the material properties. Here we present a soft robot that exploits mechanical metamaterials for the intrinsic synchronization of two passive clutches which contact its travel surface. Doing so allows it to move through an enclosed passage with an inchworm motion propelled by a single actuator. Our soft robot consists of two 3D-printed metamaterials that implement auxetic and normal elastic properties. The design, fabrication and characterization of the metamaterials are described. In addition, a working soft robot is presented. Since the synchronization mechanism is a feature of the robot's material body, we believe that the proposed design will enable compliant and robust implementations that scale well with miniaturization.

pf

link (url) DOI [BibTex]



Towards Photo-Induced Swimming: Actuation of Liquid Crystalline Elastomer in Water

Cerretti, G., Martella, D., Zeng, H., Parmeggiani, C., Palagi, S., Mark, A. G., Melde, K., Qiu, T., Fischer, P., Wiersma, D.

In Proc. of SPIE 9738, Laser 3D Manufacturing III, pages: 97380T, April 2016 (inproceedings)

Abstract
Liquid Crystalline Elastomers (LCEs) are very promising smart materials that can be made sensitive to different external stimuli, such as heat, pH, humidity and light, by changing their chemical composition. In this paper we report the implementation of a nematically aligned LCE actuator able to undergo large light-induced deformations. We prove that this property is still present even when the actuator is submerged in fresh water. Thanks to the presence of azo-dye moieties, capable of going through a reversible trans-cis photo-isomerization, and by applying light with two different wavelengths, we managed to control the bending of such an actuator in the liquid environment. The reported results represent the first step towards swimming microdevices powered by light.

pf

link (url) DOI [BibTex]



Dispersion and shape engineered plasmonic nanosensors

Jeong, H. H., Mark, A. G., Alarcon-Correa, M., Kim, I., Oswald, P., Lee, T. C., Fischer, P.

Nature Communications, 7, pages: 11331, March 2016 (article)

Abstract
Biosensors based on the localized surface plasmon resonance (LSPR) of individual metallic nanoparticles promise to deliver modular, low-cost sensing with high-detection thresholds. However, they continue to suffer from relatively low sensitivity and figures of merit (FOMs). Herein we introduce the idea of sensitivity enhancement of LSPR sensors through engineering of the material dispersion function. Employing dispersion and shape engineering of chiral nanoparticles leads to remarkable refractive index sensitivities (1,091 nm RIU⁻¹ at λ = 921 nm) and FOMs (>2,800 RIU⁻¹). A key feature is that the polarization-dependent extinction of the nanoparticles is now characterized by rich spectral features, including bipolar peaks and nulls, suitable for tracking refractive index changes. This sensing modality offers strong optical contrast even in the presence of highly absorbing media, an important consideration for use in complex biological media with limited transmission. The technique is sensitive to surface-specific binding events, which we demonstrate through biotin-avidin surface coupling.

pf

link (url) DOI [BibTex]