


Learning to Dress 3D People in Generative Clothing

Ma, Q., Yang, J., Ranjan, A., Pujades, S., Pons-Moll, G., Tang, S., Black, M. J.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
Three-dimensional human body models are widely used in the analysis of human pose and motion. Existing models, however, are learned from minimally-clothed 3D scans and thus do not generalize to the complexity of dressed people in common images and videos. Additionally, current models lack the expressive power needed to represent the complex non-linear geometry of pose-dependent clothing shape. To address this, we learn a generative 3D mesh model of clothed people from 3D scans with varying pose and clothing. Specifically, we train a conditional Mesh-VAE-GAN to learn the clothing deformation from the SMPL body model, making clothing an additional term on SMPL. Our model is conditioned on both pose and clothing type, giving the ability to draw samples of clothing to dress different body shapes in a variety of styles and poses. To preserve wrinkle detail, our Mesh-VAE-GAN extends patchwise discriminators to 3D meshes. Our model, named CAPE, represents global shape and fine local structure, effectively extending the SMPL body model to clothing. To our knowledge, this is the first generative model that directly dresses 3D human body meshes and generalizes to different poses.

ps

arxiv project page [BibTex]


Generating 3D People in Scenes without People

Zhang, Y., Hassan, M., Neumann, H., Black, M. J., Tang, S.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
We present a fully-automatic system that takes a 3D scene and generates plausible 3D human bodies that are posed naturally in that 3D scene. Given a 3D scene without people, humans can easily imagine how people could interact with the scene and the objects in it. However, this is a challenging task for a computer as solving it requires (1) the generated human bodies should be semantically plausible with the 3D environment, e.g. people sitting on the sofa or cooking near the stove; (2) the generated human-scene interaction should be physically feasible in the way that the human body and scene do not interpenetrate while, at the same time, body-scene contact supports physical interactions. To that end, we make use of the surface-based 3D human model SMPL-X. We first train a conditional variational autoencoder to predict semantically plausible 3D human pose conditioned on latent scene representations, then we further refine the generated 3D bodies using scene constraints to enforce feasible physical interaction. We show that our approach is able to synthesize realistic and expressive 3D human bodies that naturally interact with 3D environment. We perform extensive experiments demonstrating that our generative framework compares favorably with existing methods, both qualitatively and quantitatively. We believe that our scene-conditioned 3D human generation pipeline will be useful for numerous applications; e.g. to generate training data for human pose estimation, in video games and in VR/AR.

ps

PDF link (url) [BibTex]



Learning Physics-guided Face Relighting under Directional Light

Nestmeyer, T., Lalonde, J., Matthews, I., Lehrmann, A. M.

In Conference on Computer Vision and Pattern Recognition, IEEE/CVF, June 2020 (inproceedings) Accepted

Abstract
Relighting is an essential step in realistically transferring objects from a captured image into another environment. For example, authentic telepresence in Augmented Reality requires faces to be displayed and relit consistent with the observer's scene lighting. We investigate end-to-end deep learning architectures that both de-light and relight an image of a human face. Our model decomposes the input image into intrinsic components according to a diffuse physics-based image formation model. We enable non-diffuse effects including cast shadows and specular highlights by predicting a residual correction to the diffuse render. To train and evaluate our model, we collected a portrait database of 21 subjects with various expressions and poses. Each sample is captured in a controlled light stage setup with 32 individual light sources. Our method creates precise and believable relighting results and generalizes to complex illumination conditions and challenging poses, including when the subject is not looking straight at the camera.

ps

Paper [BibTex]



VIBE: Video Inference for Human Body Pose and Shape Estimation

Kocabas, M., Athanasiou, N., Black, M. J.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
Human motion is fundamental to understanding behavior. Despite progress on single-image 3D pose and shape estimation, existing video-based state-of-the-art methods fail to produce accurate and natural motion sequences due to a lack of ground-truth 3D motion data for training. To address this problem, we propose “Video Inference for Body Pose and Shape Estimation” (VIBE), which makes use of an existing large-scale motion capture dataset (AMASS) together with unpaired, in-the-wild, 2D keypoint annotations. Our key novelty is an adversarial learning framework that leverages AMASS to discriminate between real human motions and those produced by our temporal pose and shape regression networks. We define a temporal network architecture and show that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels. We perform extensive experimentation to analyze the importance of motion and demonstrate the effectiveness of VIBE on challenging 3D pose estimation datasets, achieving state-of-the-art performance. Code and pretrained models are available at https://github.com/mkocabas/VIBE

ps

arXiv code [BibTex]



From Variational to Deterministic Autoencoders

Ghosh*, P., Sajjadi*, M. S. M., Vergari, A., Black, M. J., Schölkopf, B.

8th International Conference on Learning Representations (ICLR), April 2020, *equal contribution (conference) Accepted

Abstract
Variational Autoencoders (VAEs) provide a theoretically-backed framework for deep generative models. However, they often produce “blurry” images, which is linked to their training objective. Sampling in the most popular implementation, the Gaussian VAE, can be interpreted as simply injecting noise to the input of a deterministic decoder. In practice, this simply enforces a smooth latent space structure. We challenge the adoption of the full VAE framework on this specific point in favor of a simpler, deterministic one. Specifically, we investigate how substituting stochasticity with other explicit and implicit regularization schemes can lead to a meaningful latent space without having to force it to conform to an arbitrarily chosen prior. To retrieve a generative mechanism for sampling new data points, we propose to employ an efficient ex-post density estimation step that can be readily adopted both for the proposed deterministic autoencoders as well as to improve sample quality of existing VAEs. We show in a rigorous empirical study that regularized deterministic autoencoding achieves state-of-the-art sample quality on the common MNIST, CIFAR-10 and CelebA datasets.

ei ps

arXiv [BibTex]



Gripping apparatus and method of producing a gripping apparatus

Song, S., Sitti, M., Drotlef, D., Majidi, C.

Google Patents, February 2020, US Patent App. 16/610,209 (patent)

Abstract
The present invention relates to a gripping apparatus comprising a membrane; a flexible housing; with said membrane being fixedly connected to a periphery of the housing. The invention further relates to a method of producing a gripping apparatus.

pi

[BibTex]



Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations

Rueegg, N., Lassner, C., Black, M. J., Schindler, K.

In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), February 2020 (inproceedings)

Abstract
The goal of many computer vision systems is to transform image pixels into 3D representations. Recent popular models use neural networks to regress directly from pixels to 3D object parameters. Such an approach works well when supervision is available, but in problems like human pose and shape estimation, it is difficult to obtain natural images with 3D ground truth. To go one step further, we propose a new architecture that facilitates unsupervised, or lightly supervised, learning. The idea is to break the problem into a series of transformations between increasingly abstract representations. Each step involves a cycle designed to be learnable without annotated training data, and the chain of cycles delivers the final solution. Specifically, we use 2D body part segments as an intermediate representation that contains enough information to be lifted to 3D, and at the same time is simple enough to be learned in an unsupervised way. We demonstrate the method by learning 3D human pose and shape from un-paired and un-annotated images. We also explore varying amounts of paired data and show that cycling greatly alleviates the need for paired data. While we present results for modeling humans, our formulation is general and can be applied to other vision problems.

ps

pdf [BibTex]



Learning Multi-Human Optical Flow

Ranjan, A., Hoffmann, D. T., Tzionas, D., Tang, S., Romero, J., Black, M. J.

International Journal of Computer Vision (IJCV), January 2020 (article)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data used by them does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body and motion capture data to synthesize realistic flow fields in both single- and multi-person images. We then train optical flow networks to estimate human flow fields from pairs of images. We demonstrate that our trained networks are more accurate than a wide range of top methods on held-out test data and that they can generalize well to real image sequences. The code, trained models and the dataset are available for research.

ps

Paper Publisher Version poster link (url) DOI [BibTex]


Method of actuating a shape changeable member, shape changeable member and actuating system

Hu, W., Lum, G. Z., Mastrangeli, M., Sitti, M.

Google Patents, January 2020, US Patent App. 16/477,593 (patent)

Abstract
The present invention relates to a method of actuating a shape changeable member of actuatable material. The invention further relates to a shape changeable member and to a system comprising such a shape changeable member and a magnetic field apparatus.

pi

[BibTex]


Thermal Effects on the Crystallization Kinetics, and Interfacial Adhesion of Single-Crystal Phase-Change Gallium

Yunusa, M., Lahlou, A., Sitti, M.

Advanced Materials, Wiley Online Library, 2020 (article)

Abstract
Although substrates play an important role upon crystallization of supercooled liquids, the influences of surface temperature and thermal property have remained elusive. Here, the crystallization of supercooled phase‐change gallium (Ga) on substrates with different thermal conductivity is studied. The effect of interfacial temperature on the crystallization kinetics, which dictates thermo‐mechanical stresses between the substrate and the crystallized Ga, is investigated. At an elevated surface temperature, close to the melting point of Ga, an extended single‐crystal growth of Ga on dielectric substrates due to layering effect and annealing is realized without the application of external fields. Adhesive strength at the interfaces depends on the thermal conductivity and initial surface temperature of the substrates. This insight can be applicable to other liquid metals for industrial applications, and sheds more light on phase‐change memory crystallization.

pi

[BibTex]


Nanoerythrosome-functionalized biohybrid microswimmers

Nicole, Oncay, Yunus, Birgul, Metin Sitti

2020 (article) Accepted

pi

[BibTex]



Injectable Nanoelectrodes Enable Wireless Deep Brain Stimulation of Native Tissue in Freely Moving Mice

Kozielski, K. L., Jahanshahi, A., Gilbert, H. B., Yu, Y., Erin, O., Francisco, D., Alosaimi, F., Temel, Y., Sitti, M.

bioRxiv, Cold Spring Harbor Laboratory, 2020 (article)

pi

[BibTex]



Statistical reprogramming of macroscopic self-assembly with dynamic boundaries

Utku, Massimo, Zoey, Sitti

2020 (article) Accepted

pi

[BibTex]



Controlling two-dimensional collective formation and cooperative behavior of magnetic microrobot swarms

Dong, X., Sitti, M.

The International Journal of Robotics Research, 2020 (article)

Abstract
Magnetically actuated mobile microrobots can access distant, enclosed, and small spaces, such as inside microfluidic channels and the human body, making them appealing for minimally invasive tasks. Despite their simplicity when scaling down, creating collective microrobots that can work closely and cooperatively, as well as reconfigure their formations for different tasks, would significantly enhance their capabilities such as manipulation of objects. However, a challenge of realizing such cooperative magnetic microrobots is to program and reconfigure their formations and collective motions with under-actuated control signals. This article presents a method of controlling 2D static and time-varying formations among collective self-repelling ferromagnetic microrobots (100 μm to 350 μm in diameter, up to 260 in number) by spatially and temporally programming an external magnetic potential energy distribution at the air–water interface or on solid surfaces. A general design method is introduced to program external magnetic potential energy using ferromagnets. A predictive model of the collective system is also presented to predict the formation and guide the design procedure. With the proposed method, versatile complex static formations are experimentally demonstrated and the programmability and scaling effects of formations are analyzed. We also demonstrate the collective mobility of these magnetic microrobots by controlling them to exhibit bio-inspired collective behaviors such as aggregation, directional motion with arbitrary swarm headings, and rotational swarming motion. Finally, the functions of the produced microrobotic swarm are demonstrated by controlling them to navigate through cluttered environments and complete reconfigurable cooperative manipulation tasks.

pi

DOI [BibTex]


Analytical classical density functionals from an equation learning network

Lin, S., Martius, G., Oettel, M.

The Journal of Chemical Physics, 152(2):021102, 2020, arXiv preprint https://arxiv.org/abs/1910.12752 (article)

al

Preprint_PDF DOI [BibTex]



Characterization and Thermal Management of a DC Motor-Driven Resonant Actuator for Miniature Mobile Robots with Oscillating Limbs

Colmenares, D., Kania, R., Liu, M., Sitti, M.

arXiv preprint arXiv:2002.00798, 2020 (article)

Abstract
In this paper, we characterize the performance of and develop thermal management solutions for a DC motor-driven resonant actuator developed for flapping wing micro air vehicles. The actuator, a DC micro-gearmotor connected in parallel with a torsional spring, drives reciprocal wing motion. Compared to the gearmotor alone, this design increased torque and power density by 161.1% and 666.8%, respectively, while decreasing the drawn current by 25.8%. Characterization of the actuator, isolated from nonlinear aerodynamic loading, results in standard metrics directly comparable to other actuators. The micro-motor, selected for low weight considerations, operates at high power for limited duration due to thermal effects. To predict system performance, a lumped parameter thermal circuit model was developed. Critical model parameters for this micro-motor, two orders of magnitude smaller than those previously characterized, were identified experimentally. This included the effects of variable winding resistance, bushing friction, speed-dependent forced convection, and the addition of a heatsink. The model was then used to determine a safe operation envelope for the vehicle and to design a weight-optimal heatsink. This actuator design and thermal modeling approach could be applied more generally to improve the performance of any miniature mobile robot or device with motor-driven oscillating limbs or loads.

pi

[BibTex]


Magnetic Resonance Imaging System–Driven Medical Robotics

Erin, O., Boyvat, M., Tiryaki, M. E., Phelan, M., Sitti, M.

Advanced Intelligent Systems, 2, Wiley Online Library, 2020 (article)

Abstract
Magnetic resonance imaging (MRI) system–driven medical robotics is an emerging field that aims to use clinical MRI systems not only for medical imaging but also for actuation, localization, and control of medical robots. Submillimeter scale resolution of MR images for soft tissues combined with the electromagnetic gradient coil–based magnetic actuation available inside MR scanners can enable theranostic applications of medical robots for precise image‐guided minimally invasive interventions. MRI‐driven robotics typically does not introduce new MRI instrumentation for actuation but instead focuses on converting already available instrumentation for robotic purposes. To use the advantages of this technology, various medical devices such as untethered mobile magnetic robots and tethered active catheters have been designed to be powered magnetically inside MRI systems. Herein, the state‐of‐the‐art progress, challenges, and future directions of MRI‐driven medical robotic systems are reviewed.

pi

[BibTex]



Pros and Cons: Magnetic versus Optical Microrobots

Sitti, M., Wiersma, D. S.

Advanced Materials, Wiley Online Library, 2020 (article)

Abstract
Mobile microrobotics has emerged as a new robotics field within the last decade to create untethered tiny robots that can access and operate in unprecedented, dangerous, or hard‐to‐reach small spaces noninvasively toward disruptive medical, biotechnology, desktop manufacturing, environmental remediation, and other potential applications. Magnetic and optical actuation methods are the most widely used actuation methods in mobile microrobotics currently, in addition to acoustic and biological (cell‐driven) actuation approaches. The pros and cons of these actuation methods are reported here, depending on the given context. They can both enable long‐range, fast, and precise actuation of single or a large number of microrobots in diverse environments. Magnetic actuation has unique potential for medical applications of microrobots inside nontransparent tissues at high penetration depths, while optical actuation is suitable for more biotechnology, lab‐/organ‐on‐a‐chip, and desktop manufacturing types of applications with much less surface penetration depth requirements or with transparent environments. Combining both methods in new robot designs can have a strong potential of combining the pros of both methods. There is still much progress needed in both actuation methods to realize the potential disruptive applications of mobile microrobots in real‐world conditions.

pi

[BibTex]



Selectively Controlled Magnetic Microrobots with Opposing Helices

Giltinan, J., Katsamba, P., Wang, W., Lauga, E., Sitti, M.

Applied Physics Letters, 116, AIP Publishing LLC, 2020 (article)

pi

[BibTex]



General Movement Assessment from videos of computed 3D infant body models is equally effective compared to conventional RGB Video rating

Schroeder, S., Hesse, N., Weinberger, R., Tacke, U., Gerstl, L., Hilgendorff, A., Heinen, F., Arens, M., Bodensteiner, C., Dijkstra, L. J., Pujades, S., Black, M., Hadders-Algra, M.

Early Human Development, 2020 (article)

Abstract
Background: General Movement Assessment (GMA) is a powerful tool to predict Cerebral Palsy (CP). Yet, GMA requires substantial training hampering its implementation in clinical routine. This inspired a world-wide quest for automated GMA. Aim: To test whether a low-cost, marker-less system for three-dimensional motion capture from RGB depth sequences using a whole body infant model may serve as the basis for automated GMA. Study design: Clinical case study at an academic neurodevelopmental outpatient clinic. Subjects: Twenty-nine high-risk infants were recruited and assessed at their clinical follow-up at 2-4 month corrected age (CA). Their neurodevelopmental outcome was assessed regularly up to 12-31 months CA. Outcome measures: GMA according to Hadders-Algra by a masked GMA-expert of conventional and computed 3D body model (“SMIL motion”) videos of the same GMs. Agreement between both GMAs was assessed, and sensitivity and specificity of both methods to predict CP at ≥12 months CA. Results: The agreement of the two GMA ratings was substantial, with κ=0.66 for the classification of definitely abnormal (DA) GMs and an ICC of 0.887 (95% CI 0.762;0.947) for a more detailed GM-scoring. Five children were diagnosed with CP (four bilateral, one unilateral CP). The GMs of the child with unilateral CP were twice rated as mildly abnormal. DA-ratings of both videos predicted bilateral CP well: sensitivity 75% and 100%, specificity 88% and 92% for conventional and SMIL motion videos, respectively. Conclusions: Our computed infant 3D full body model is an attractive starting point for automated GMA in infants at risk of CP.

ps

[BibTex]



Acoustically powered surface-slipping mobile microrobots

Aghakhani, A., Yasa, O., Wrede, P., Sitti, M.

Proceedings of the National Academy of Sciences, 117, National Acad Sciences, 2020 (article)

Abstract
Untethered synthetic microrobots have significant potential to revolutionize minimally invasive medical interventions in the future. However, their relatively slow speed and low controllability near surfaces typically are some of the barriers standing in the way of their medical applications. Here, we introduce acoustically powered microrobots with a fast, unidirectional surface-slipping locomotion on both flat and curved surfaces. The proposed three-dimensionally printed, bullet-shaped microrobot contains a spherical air bubble trapped inside its internal body cavity, where the bubble is resonated using acoustic waves. The net fluidic flow due to the bubble oscillation orients the microrobot's axisymmetric axis perpendicular to the wall and then propels it laterally at very high speeds (up to 90 body lengths per second with a body length of 25 µm) while inducing an attractive force toward the wall. To achieve unidirectional locomotion, a small fin is added to the microrobot’s cylindrical body surface, which biases the propulsion direction. For motion direction control, the microrobots are coated anisotropically with a soft magnetic nanofilm layer, allowing steering under a uniform magnetic field. Finally, surface locomotion capability of the microrobots is demonstrated inside a three-dimensional circular cross-sectional microchannel under acoustic actuation. Overall, the combination of acoustic powering and magnetic steering can be effectively utilized to actuate and navigate these microrobots in confined and hard-to-reach body location areas in a minimally invasive fashion.

pi

[BibTex]



Bio-inspired Flexible Twisting Wings Increase Lift and Efficiency of a Flapping Wing Micro Air Vehicle

Colmenares, D., Kania, R., Zhang, W., Sitti, M.

arXiv preprint arXiv:2001.11586, 2020 (article)

Abstract
We investigate the effect of wing twist flexibility on lift and efficiency of a flapping-wing micro air vehicle capable of liftoff. Wings used previously were chosen to be fully rigid due to modeling and fabrication constraints. However, biological wings are highly flexible and other micro air vehicles have successfully utilized flexible wing structures for specialized tasks. The goal of our study is to determine if dynamic twisting of flexible wings can increase overall aerodynamic lift and efficiency. A flexible twisting wing design was found to increase aerodynamic efficiency by 41.3%, translational lift production by 35.3%, and the effective lift coefficient by 63.7% compared to the rigid-wing design. These results exceed the predictions of quasi-steady blade element models, indicating the need for unsteady computational fluid dynamics simulations of twisted flapping wings.

pi

[BibTex]



Cohesive self-organization of mobile microrobotic swarms

Yigit, B., Alapan, Y., Sitti, M.

arXiv preprint arXiv:1907.05856, 2020 (article)

pi

[BibTex]



Multifunctional Surface Microrollers for Targeted Cargo Delivery in Physiological Blood Flow

Yunus, Ugur, Alp, Metin

2020 (article) Accepted

pi

[BibTex]


Bioinspired underwater locomotion of light-driven liquid crystal gels

Shahsavan, H., Aghakhani, A., Zeng, H., Guo, Y., Davidson, Z. S., Priimagi, A., Sitti, M.

Proceedings of the National Academy of Sciences, National Acad Sciences, 2020 (article)

Abstract
Untethered dynamic shape programming and control of soft materials have significant applications in technologies such as soft robots, medical devices, organ-on-a-chip, and optical devices. Here, we present a solution to remotely actuate and move soft materials underwater in a fast, efficient, and controlled manner using photoresponsive liquid crystal gels (LCGs). LCG constructs with engineered molecular alignment show a low and sharp phase-transition temperature and experience considerable density reduction by light exposure, thereby allowing rapid and reversible shape changes. We demonstrate different modes of underwater locomotion, such as crawling, walking, jumping, and swimming, by localized and time-varying illumination of LCGs. The diverse locomotion modes of smart LCGs can provide a new toolbox for designing efficient light-fueled soft robots in fluid-immersed media.

pi

[BibTex]



Differentiation of blackbox combinatorial solvers

Vlastelica, M., Paulus, A., Musil, V., Martius, G., Rolínek, M.

In International Conference on Learning Representations, ICLR’20, 2020 (incollection)

al

link (url) [BibTex]



Additive manufacturing of cellulose-based materials with continuous, multidirectional stiffness gradients

Giachini, P., Gupta, S., Wang, W., Wood, D., Yunusa, M., Baharlou, E., Sitti, M., Menges, A.

Science Advances, 6, American Association for the Advancement of Science, 2020 (article)

Abstract
Functionally graded materials (FGMs) enable applications in fields such as biomedicine and architecture, but their fabrication suffers from shortcomings in gradient continuity, interfacial bonding, and directional freedom. In addition, most commercial design software fail to incorporate property gradient data, hindering explorations of the design space of FGMs. Here, we leveraged a combined approach of materials engineering and digital processing to enable extrusion-based multimaterial additive manufacturing of cellulose-based tunable viscoelastic materials with continuous, high-contrast, and multidirectional stiffness gradients. A method to engineer sets of cellulose-based materials with similar compositions, yet distinct mechanical and rheological properties, was established. In parallel, a digital workflow was developed to embed gradient information into design models with integrated fabrication path planning. The payoff of integrating these physical and digital tools is the ability to achieve the same stiffness gradient in multiple ways, opening design possibilities previously limited by the rigid coupling of material and geometry.

pi

[BibTex]


2013


Branch&Rank for Efficient Object Detection

Lehmann, A., Gehler, P., VanGool, L.

International Journal of Computer Vision, Springer, December 2013 (article)

Abstract
Ranking hypothesis sets is a powerful concept for efficient object detection. In this work, we propose a branch&rank scheme that detects objects with often less than 100 ranking operations. This efficiency enables the use of strong and also costly classifiers like non-linear SVMs with RBF-χ² kernels. We thereby relieve an inherent limitation of branch&bound methods as bounds are often not tight enough to be effective in practice. Our approach features three key components: a ranking function that operates on sets of hypotheses and a grouping of these into different tasks. Detection efficiency results from adaptively sub-dividing the object search space into decreasingly smaller sets. This is inherited from branch&bound, while the ranking function supersedes a tight bound which is often unavailable (except for rather limited function classes). The grouping makes the system effective: it separates image classification from object recognition, yet combines them in a single formulation, phrased as a structured SVM problem. A novel aspect of branch&rank is that a better ranking function is expected to decrease the number of classifier calls during detection. We use the VOC’07 dataset to demonstrate the algorithmic properties of branch&rank.

ps

pdf link (url) [BibTex]



Strong Appearance and Expressive Spatial Models for Human Pose Estimation

Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.

In International Conference on Computer Vision (ICCV), pages: 3487-3494, IEEE, December 2013 (inproceedings)

Abstract
Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state-of-the-art in articulated pose estimation in two ways. First we explore various types of appearance representations aiming to substantially improve the body part hypotheses. And second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary; (2) we demonstrate that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation; and (3) we show that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the "Leeds Sports Poses" and "Parse" benchmarks.

ps

pdf DOI Project Page [BibTex]



Extracting Postural Synergies for Robotic Grasping

Romero, J., Feix, T., Ek, C., Kjellstrom, H., Kragic, D.

IEEE Transactions on Robotics, 29(6):1342-1352, December 2013 (article)

ps

[BibTex]



Understanding High-Level Semantics by Modeling Traffic Patterns

Zhang, H., Geiger, A., Urtasun, R.

In International Conference on Computer Vision, pages: 3056-3063, Sydney, Australia, December 2013 (inproceedings)

Abstract
In this paper, we are interested in understanding the semantics of outdoor scenes in the context of autonomous driving. Towards this goal, we propose a generative model of 3D urban scenes which is able to reason not only about the geometry and objects present in the scene, but also about the high-level semantics in the form of traffic patterns. We found that a small number of patterns is sufficient to model the vast majority of traffic scenes and show how these patterns can be learned. As evidenced by our experiments, this high-level reasoning significantly improves the overall scene estimation as well as the vehicle-to-lane association when compared to state-of-the-art approaches. All data and code will be made available upon publication.

avg ps

pdf [BibTex]



A Non-parametric Bayesian Network Prior of Human Pose

Lehrmann, A. M., Gehler, P., Nowozin, S.

In Proceedings IEEE International Conference on Computer Vision (ICCV), pages: 1281-1288, December 2013 (inproceedings)

Abstract
Having a sensible prior of human pose is a vital ingredient for many computer vision applications, including tracking and pose estimation. While the application of global non-parametric approaches and parametric models has led to some success, finding the right balance in terms of flexibility and tractability, as well as estimating model parameters from data has turned out to be challenging. In this work, we introduce a sparse Bayesian network model of human pose that is non-parametric with respect to the estimation of both its graph structure and its local distributions. We describe an efficient sampling scheme for our model and show its tractability for the computation of exact log-likelihoods. We empirically validate our approach on the Human 3.6M dataset and demonstrate superior performance to global models and parametric networks. We further illustrate our model's ability to represent and compose poses not present in the training set (compositionality) and describe a speed-accuracy trade-off that allows realtime scoring of poses.

ps

Project page pdf DOI Project Page [BibTex]



Towards understanding action recognition

Jhuang, H., Gall, J., Zuffi, S., Schmid, C., Black, M. J.

In IEEE International Conference on Computer Vision (ICCV), pages: 3192-3199, IEEE, Sydney, Australia, December 2013 (inproceedings)

Abstract
Although action recognition in videos is widely studied, current methods often fail on real-world datasets. Many recent approaches improve accuracy and robustness to cope with challenging video sequences, but it is often unclear what affects the results most. This paper attempts to provide insights based on a systematic performance evaluation using thoroughly-annotated data of human actions. We annotate human Joints for the HMDB dataset (J-HMDB). This annotation can be used to derive ground truth optical flow and segmentation. We evaluate current methods using this dataset and systematically replace the output of various algorithms with ground truth. This enables us to discover what is important – for example, should we work on improving flow algorithms, estimating human bounding boxes, or enabling pose estimation? In summary, we find that high-level pose features greatly outperform low/mid-level features; in particular, pose over time is critical, but current pose estimation algorithms are not yet reliable enough to provide this information. We also find that the accuracy of a top-performing action recognition framework can be greatly increased by refining the underlying low/mid-level features; this suggests it is important to improve optical flow and human detection algorithms. Our analysis and J-HMDB dataset should facilitate a deeper understanding of action recognition algorithms.

ps

Website Errata Poster Paper Slides DOI Project Page Project Page Project Page [BibTex]



Mixing Decoded Cursor Velocity and Position from an Offline Kalman Filter Improves Cursor Control in People with Tetraplegia

Homer, M., Harrison, M., Black, M. J., Perge, J., Cash, S., Friehs, G., Hochberg, L.

In 6th International IEEE EMBS Conference on Neural Engineering, pages: 715-718, San Diego, November 2013 (inproceedings)

Abstract
Kalman filtering is a common method to decode neural signals from the motor cortex. In clinical research investigating the use of intracortical brain computer interfaces (iBCIs), the technique enabled people with tetraplegia to control assistive devices such as a computer or robotic arm directly from their neural activity. For reaching movements, the Kalman filter typically estimates the instantaneous endpoint velocity of the control device. Here, we analyzed attempted arm/hand movements by people with tetraplegia to control a cursor on a computer screen to reach several circular targets. A standard velocity Kalman filter is enhanced to additionally decode for the cursor’s position. We then mix decoded velocity and position to generate cursor movement commands. We analyzed data, offline, from two participants across six sessions. Root mean squared error between the actual and estimated cursor trajectory improved by 12.2 ± 10.5% (pairwise t-test, p<0.05) as compared to a standard velocity Kalman filter. The findings suggest that simultaneously decoding for intended velocity and position and using them both to generate movement commands can improve the performance of iBCIs.
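
One way to picture "mixing decoded velocity and position" is a convex blend between the velocity-integrated cursor and the decoded position. The abstract does not state how the mix is computed, so the blend rule, the weight `alpha`, the time step `dt`, and the function name `mix_cursor_update` below are all assumptions, offered only as a sketch.

```python
from typing import Tuple

Vec = Tuple[float, float]


def mix_cursor_update(cursor: Vec, decoded_vel: Vec, decoded_pos: Vec,
                      dt: float = 0.02, alpha: float = 0.1) -> Vec:
    """Blend a velocity-driven update with the decoded position.

    alpha = 0 reproduces a pure velocity decoder; alpha = 1 jumps straight
    to the decoded position. Both alpha and dt are hypothetical parameters.
    """
    # Step 1: integrate the decoded velocity from the current cursor location.
    integrated = (cursor[0] + decoded_vel[0] * dt,
                  cursor[1] + decoded_vel[1] * dt)
    # Step 2: pull the integrated estimate toward the decoded position.
    return ((1.0 - alpha) * integrated[0] + alpha * decoded_pos[0],
            (1.0 - alpha) * integrated[1] + alpha * decoded_pos[1])


# Example: equal weighting of the two decoded quantities over a 1 s step.
new_cursor = mix_cursor_update((0.0, 0.0), (1.0, 0.0), (10.0, 10.0),
                               dt=1.0, alpha=0.5)
```

In this sketch the position term acts as a correction that limits the drift a velocity-only decoder would accumulate.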

ps

pdf Project Page [BibTex]



Markov Random Field Modeling, Inference & Learning in Computer Vision & Image Understanding: A Survey

Wang, C., Komodakis, N., Paragios, N.

Computer Vision and Image Understanding (CVIU), 117(11):1610-1627, November 2013 (article)

Abstract
In this paper, we present a comprehensive survey of Markov Random Fields (MRFs) in computer vision and image understanding, with respect to the modeling, the inference and the learning. While MRFs were introduced into the computer vision field about two decades ago, they started to become a ubiquitous tool for solving visual perception problems around the turn of the millennium following the emergence of efficient inference methods. During the past decade, a variety of MRF models as well as inference and learning methods have been developed for addressing numerous low, mid and high-level vision problems. While most of the literature concerns pairwise MRFs, in recent years we have also witnessed significant progress in higher-order MRFs, which substantially enhances the expressiveness of graph-based models and expands the domain of solvable problems. This survey provides a compact and informative summary of the major literature in this research topic.

ps

Publishers site pdf [BibTex]



Puppet Flow

Zuffi, S., Black, M. J.

(7), Max Planck Institute for Intelligent Systems, October 2013 (techreport)

Abstract
We introduce Puppet Flow (PF), a layered model describing the optical flow of a person in a video sequence. We consider video frames composed of two layers: a foreground layer corresponding to a person, and a background layer. We model the background as an affine flow field. The foreground layer, being a moving person, requires reasoning about the articulated nature of the human body. We thus represent the foreground layer with the Deformable Structures model (DS), a parametrized 2D part-based human body representation. We call the motion field defined through articulated motion and deformation of the DS model a Puppet Flow. By exploiting the DS representation, Puppet Flow is a parametrized optical flow field, where the parameters are the person's pose, gender and body shape.
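
For readers unfamiliar with the background term, an affine flow field assigns each pixel a displacement that is an affine function of its coordinates. The sketch below uses a common six-parameter form; the parameter ordering and the name `affine_flow` are assumptions, not taken from the report.

```python
from typing import Sequence, Tuple


def affine_flow(x: float, y: float, params: Sequence[float]) -> Tuple[float, float]:
    """Return the flow (u, v) at pixel (x, y) for a six-parameter affine field.

    Assumed parameterization:
        u = a1 + a2 * x + a3 * y
        v = a4 + a5 * x + a6 * y
    """
    a1, a2, a3, a4, a5, a6 = params
    u = a1 + a2 * x + a3 * y  # horizontal displacement
    v = a4 + a5 * x + a6 * y  # vertical displacement
    return u, v


# A pure translation: every pixel moves by the same (u, v).
translation = (2.0, 0.0, 0.0, -1.0, 0.0, 0.0)
flow_at_pixel = affine_flow(5.0, 7.0, translation)
```

Setting the linear coefficients to zero leaves a uniform translation; non-zero values add scaling, rotation, or shear to the background motion.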

ps

pdf Project Page Project Page [BibTex]



Dry adhesives and methods for making dry adhesives

Sitti, M., Kim, S.

September 2013, US Patent App. 14/016,651 (misc)

pi

[BibTex]



Dry adhesives and methods for making dry adhesives

Sitti, M., Kim, S.

September 2013, US Patent App. 14/016,683 (misc)

pi

[BibTex]



Vision meets Robotics: The KITTI Dataset

Geiger, A., Lenz, P., Stiller, C., Urtasun, R.

International Journal of Robotics Research, 32(11):1231-1237, Sage Publishing, September 2013 (article)

Abstract
We present a novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research. In total, we recorded 6 hours of traffic scenarios at 10-100 Hz using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner and a high-precision GPS/IMU inertial navigation system. The scenarios are diverse: they capture real-world traffic situations and range from freeways over rural areas to inner-city scenes with many static and dynamic objects. Our data is calibrated, synchronized and timestamped, and we provide the rectified and raw image sequences. Our dataset also contains object labels in the form of 3D tracklets and we provide online benchmarks for stereo, optical flow, object detection and other tasks. This paper describes our recording platform, the data format and the utilities that we provide.

avg ps

pdf DOI [BibTex]



Dry adhesives and methods for making dry adhesives

Sitti, M., Kim, S.

September 2013, US Patent 8,524,092 (misc)

pi

[BibTex]



Human Pose Calculation from Optical Flow Data

Black, M., Loper, M., Romero, J., Zuffi, S.

European Patent Application EP 2843621, August 2013 (patent)

ps

Google Patents [BibTex]



Statistics on Manifolds with Applications to Modeling Shape Deformations

Freifeld, O.

Brown University, August 2013 (phdthesis)

Abstract
Statistical models of non-rigid deformable shape have wide application in many fields, including computer vision, computer graphics, and biometry. We show that shape deformations are well represented through nonlinear manifolds that are also matrix Lie groups. These pattern-theoretic representations lead to several advantages over other alternatives, including a principled measure of shape dissimilarity and a natural way to compose deformations. Moreover, they enable building models using statistics on manifolds. Consequently, such models are superior to those based on Euclidean representations. We demonstrate this by modeling 2D and 3D human body shape. Shape deformations are only one example of manifold-valued data. More generally, in many computer-vision and machine-learning problems, nonlinear manifold representations arise naturally and provide a powerful alternative to Euclidean representations. Statistics is traditionally concerned with data in a Euclidean space, relying on the linear structure and the distances associated with such a space; this renders it inappropriate for nonlinear spaces. Statistics can, however, be generalized to nonlinear manifolds. Moreover, by respecting the underlying geometry, the statistical models result in not only more effective analysis but also consistent synthesis. We go beyond previous work on statistics on manifolds by showing how, even on these curved spaces, problems related to modeling a class from scarce data can be dealt with by leveraging information from related classes residing in different regions of the space. We show the usefulness of our approach with 3D shape deformations. To summarize our main contributions: 1) We define a new 2D articulated model -- more expressive than traditional ones -- of deformable human shape that factors body-shape, pose, and camera variations. Its high realism is obtained from training data generated from a detailed 3D model. 2) We define a new manifold-based representation of 3D shape deformations that yields statistical deformable-template models that are better than the current state-of-the-art. 3) We generalize a transfer learning idea from Euclidean spaces to Riemannian manifolds. This work demonstrates the value of modeling manifold-valued data and their statistics explicitly on the manifold. Specifically, the methods here provide new tools for shape analysis.

ps

pdf Project Page [BibTex]


Poselet conditioned pictorial structures

Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.

In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pages: 588-595, IEEE, Portland, OR, June 2013 (inproceedings)

ps

pdf DOI Project Page [BibTex]



Occlusion Patterns for Object Class Detection

Pepik, B., Stark, M., Gehler, P., Schiele, B.

In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Portland, OR, June 2013 (inproceedings)

Abstract
Despite the success of recent object class recognition systems, the long-standing problem of partial occlusion remains a major challenge, and a principled solution is yet to be found. In this paper we leave the beaten path of methods that treat occlusion as just another source of noise – instead, we include the occluder itself into the modelling, by mining distinctive, reoccurring occlusion patterns from annotated training data. These patterns are then used as training data for dedicated detectors of varying sophistication. In particular, we evaluate and compare models that range from standard object class detectors to hierarchical, part-based representations of occluder/occludee pairs. In an extensive evaluation we derive insights that can aid further developments in tackling the occlusion challenge.

ps

pdf Project Page [BibTex]



Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization

(CVPR13 Best Paper Runner-Up)

Brubaker, M. A., Geiger, A., Urtasun, R.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 2013), pages: 3057-3064, IEEE, Portland, OR, June 2013 (inproceedings)

Abstract
In this paper we propose an affordable solution to self-localization, which utilizes visual odometry and road maps as the only inputs. To this end, we present a probabilistic model as well as an efficient approximate inference algorithm, which is able to utilize distributed computation to meet the real-time requirements of autonomous systems. Because of the probabilistic nature of the model we are able to cope with uncertainty due to noisy visual odometry and inherent ambiguities in the map (e.g., in a Manhattan world). By exploiting freely available, community developed maps and visual odometry measurements, we are able to localize a vehicle up to 3m after only a few seconds of driving on maps which contain more than 2,150km of drivable roads.

avg ps

pdf supplementary project page [BibTex]



Human Pose Estimation using Body Parts Dependent Joint Regressors

Dantone, M., Gall, J., Leistner, C., van Gool, L.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 3041-3048, IEEE, Portland, OR, USA, June 2013 (inproceedings)

Abstract
In this work, we address the problem of estimating 2D human pose from still images. Recent methods that rely on discriminatively trained deformable parts organized in a tree model have been shown to be very successful in solving this task. Within such a pictorial structure framework, we address the problem of obtaining good part templates by proposing novel, non-linear joint regressors. In particular, we employ two-layered random forests as joint regressors. The first layer acts as a discriminative, independent body part classifier. The second layer takes the estimated class distributions of the first one into account and is thereby able to predict joint locations by modeling the interdependence and co-occurrence of the parts. This results in a pose estimation framework that already takes dependencies between body parts into account during joint localization and is thus able to circumvent typical ambiguities of tree structures, such as for legs and arms. In the experiments, we demonstrate that our body parts dependent joint regressors achieve a higher joint localization accuracy than tree-based state-of-the-art methods.

ps

pdf DOI Project Page [BibTex]



A fully-connected layered model of foreground and background flow

Sun, D., Wulff, J., Sudderth, E., Pfister, H., Black, M.

In IEEE Conf. on Computer Vision and Pattern Recognition, (CVPR 2013), pages: 2451-2458, Portland, OR, June 2013 (inproceedings)

Abstract
Layered models allow scene segmentation and motion estimation to be formulated together and to inform one another. Traditional layered motion methods, however, employ fairly weak models of scene structure, relying on locally connected Ising/Potts models which have limited ability to capture long-range correlations in natural scenes. To address this, we formulate a fully-connected layered model that enables global reasoning about the complicated segmentations of real objects. Optimization with fully-connected graphical models is challenging, and our inference algorithm leverages recent work on efficient mean field updates for fully-connected conditional random fields. These methods can be implemented efficiently using high-dimensional Gaussian filtering. We combine these ideas with a layered flow model, and find that the long-range connections greatly improve segmentation into figure-ground layers when compared with locally connected MRF models. Experiments on several benchmark datasets show that the method can recover fine structures and large occlusion regions, with good flow accuracy and much lower computational cost than previous locally-connected layered models.

ps

pdf Supplemental Material Project Page Project Page [BibTex]
