2020


Vision-based Force Estimation for a da Vinci Instrument Using Deep Neural Networks

Lee, Y., Husin, H. M., Forte, M. P., Lee, S., Kuchenbecker, K. J.

Extended abstract presented as an Emerging Technology ePoster at the Annual Meeting of the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES), Cleveland, Ohio, USA, August 2020 (misc) Accepted

hi

[BibTex]



Learning to Dress 3D People in Generative Clothing

Ma, Q., Yang, J., Ranjan, A., Pujades, S., Pons-Moll, G., Tang, S., Black, M. J.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
Three-dimensional human body models are widely used in the analysis of human pose and motion. Existing models, however, are learned from minimally-clothed 3D scans and thus do not generalize to the complexity of dressed people in common images and videos. Additionally, current models lack the expressive power needed to represent the complex non-linear geometry of pose-dependent clothing shape. To address this, we learn a generative 3D mesh model of clothed people from 3D scans with varying pose and clothing. Specifically, we train a conditional Mesh-VAE-GAN to learn the clothing deformation from the SMPL body model, making clothing an additional term on SMPL. Our model is conditioned on both pose and clothing type, giving the ability to draw samples of clothing to dress different body shapes in a variety of styles and poses. To preserve wrinkle detail, our Mesh-VAE-GAN extends patchwise discriminators to 3D meshes. Our model, named CAPE, represents global shape and fine local structure, effectively extending the SMPL body model to clothing. To our knowledge, this is the first generative model that directly dresses 3D human body meshes and generalizes to different poses.

ps

arxiv project page [BibTex]


Generating 3D People in Scenes without People

Zhang, Y., Hassan, M., Neumann, H., Black, M. J., Tang, S.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
We present a fully-automatic system that takes a 3D scene and generates plausible 3D human bodies that are posed naturally in that 3D scene. Given a 3D scene without people, humans can easily imagine how people could interact with the scene and the objects in it. However, this is a challenging task for a computer, as solving it requires that (1) the generated human bodies be semantically plausible within the 3D environment, e.g. people sitting on the sofa or cooking near the stove; and (2) the generated human-scene interaction be physically feasible, such that the human body and scene do not interpenetrate while, at the same time, body-scene contact supports physical interactions. To that end, we make use of the surface-based 3D human model SMPL-X. We first train a conditional variational autoencoder to predict semantically plausible 3D human pose conditioned on latent scene representations, then we further refine the generated 3D bodies using scene constraints to enforce feasible physical interaction. We show that our approach is able to synthesize realistic and expressive 3D human bodies that naturally interact with the 3D environment. We perform extensive experiments demonstrating that our generative framework compares favorably with existing methods, both qualitatively and quantitatively. We believe that our scene-conditioned 3D human generation pipeline will be useful for numerous applications, e.g. to generate training data for human pose estimation, in video games and in VR/AR.

ps

PDF link (url) [BibTex]



Learning Physics-guided Face Relighting under Directional Light

Nestmeyer, T., Lalonde, J., Matthews, I., Lehrmann, A. M.

In Conference on Computer Vision and Pattern Recognition, IEEE/CVF, June 2020 (inproceedings) Accepted

Abstract
Relighting is an essential step in realistically transferring objects from a captured image into another environment. For example, authentic telepresence in Augmented Reality requires faces to be displayed and relit consistent with the observer's scene lighting. We investigate end-to-end deep learning architectures that both de-light and relight an image of a human face. Our model decomposes the input image into intrinsic components according to a diffuse physics-based image formation model. We enable non-diffuse effects including cast shadows and specular highlights by predicting a residual correction to the diffuse render. To train and evaluate our model, we collected a portrait database of 21 subjects with various expressions and poses. Each sample is captured in a controlled light stage setup with 32 individual light sources. Our method creates precise and believable relighting results and generalizes to complex illumination conditions and challenging poses, including when the subject is not looking straight at the camera.

ps

Paper [BibTex]



VIBE: Video Inference for Human Body Pose and Shape Estimation

Kocabas, M., Athanasiou, N., Black, M. J.

In Computer Vision and Pattern Recognition (CVPR), June 2020 (inproceedings)

Abstract
Human motion is fundamental to understanding behavior. Despite progress on single-image 3D pose and shape estimation, existing video-based state-of-the-art methods fail to produce accurate and natural motion sequences due to a lack of ground-truth 3D motion data for training. To address this problem, we propose “Video Inference for Body Pose and Shape Estimation” (VIBE), which makes use of an existing large-scale motion capture dataset (AMASS) together with unpaired, in-the-wild, 2D keypoint annotations. Our key novelty is an adversarial learning framework that leverages AMASS to discriminate between real human motions and those produced by our temporal pose and shape regression networks. We define a temporal network architecture and show that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels. We perform extensive experimentation to analyze the importance of motion and demonstrate the effectiveness of VIBE on challenging 3D pose estimation datasets, achieving state-of-the-art performance. Code and pretrained models are available at https://github.com/mkocabas/VIBE

ps

arXiv code [BibTex]



Unifying lamination parameters with spectral-Tchebychev method for variable-stiffness composite plate design

Serhat, G., Bediz, B., Basdogan, I.

Composite Structures, 242(112183), June 2020 (article)

Abstract
This paper describes an efficient framework for the design and optimization of the variable-stiffness composite plates. Equations of motion are solved using a Tchebychev polynomials-based spectral modeling approach that is extended for the classical laminated plate theory. This approach provides highly significant analysis speed-ups with respect to the conventional finite element method. The proposed framework builds on a variable-stiffness laminate design methodology that utilizes lamination parameters for representing the stiffness properties compactly and master node variables for modeling the stiffness variation through distance-based interpolation. The current study improves the existing method by optimizing the locations of the master nodes in addition to their lamination parameter values. The optimization process is promoted by the computationally efficient spectral-Tchebychev solution method. Case studies are performed for maximizing the fundamental frequencies of the plates with different boundary conditions and aspect ratios. The results show that significant improvements can be rapidly achieved compared to optimal constant-stiffness designs by utilizing the developed framework. In addition, the optimization of master node locations resulted in additional improvements in the optimal response values highlighting the importance of including the node positions within the design variables.

hi

DOI [BibTex]



From Variational to Deterministic Autoencoders

Ghosh*, P., Sajjadi*, M. S. M., Vergari, A., Black, M. J., Schölkopf, B.

8th International Conference on Learning Representations (ICLR), April 2020, *equal contribution (conference) Accepted

Abstract
Variational Autoencoders (VAEs) provide a theoretically-backed framework for deep generative models. However, they often produce “blurry” images, which is linked to their training objective. Sampling in the most popular implementation, the Gaussian VAE, can be interpreted as simply injecting noise to the input of a deterministic decoder. In practice, this simply enforces a smooth latent space structure. We challenge the adoption of the full VAE framework on this specific point in favor of a simpler, deterministic one. Specifically, we investigate how substituting stochasticity with other explicit and implicit regularization schemes can lead to a meaningful latent space without having to force it to conform to an arbitrarily chosen prior. To retrieve a generative mechanism for sampling new data points, we propose to employ an efficient ex-post density estimation step that can be readily adopted both for the proposed deterministic autoencoders as well as to improve sample quality of existing VAEs. We show in a rigorous empirical study that regularized deterministic autoencoding achieves state-of-the-art sample quality on the common MNIST, CIFAR-10 and CelebA datasets.

ei ps

arXiv [BibTex]
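
The ex-post density estimation step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes latent codes have already been produced by a trained deterministic encoder; here they are replaced by synthetic 2D points. A diagonal Gaussian is used as a stand-in density model purely for illustration, since the abstract does not say which estimator the authors use. The names `fit_diagonal_gaussian` and `sample_latents` are hypothetical.

```python
import random
import statistics


def fit_diagonal_gaussian(latents):
    """Fit an independent Gaussian to each latent dimension (illustrative stand-in)."""
    dims = list(zip(*latents))  # transpose: one tuple of values per dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.stdev(d) for d in dims]
    return means, stds


def sample_latents(means, stds, n, rng):
    """Draw n new latent codes from the fitted density."""
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n)]


rng = random.Random(0)
# Hypothetical latent codes standing in for the outputs of a trained encoder.
latents = [[rng.gauss(2.0, 0.5), rng.gauss(-1.0, 1.5)] for _ in range(5000)]

means, stds = fit_diagonal_gaussian(latents)
new_codes = sample_latents(means, stds, 1000, rng)
# In a full pipeline, new_codes would be passed through the decoder to generate data.
```

The point of the sketch is only the order of operations: the autoencoder is trained first, and the density over its latent space is fitted afterwards, so no prior has to be imposed during training.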



A Fabric-Based Sensing System for Recognizing Social Touch

Burns, R. B., Lee, H., Seifi, H., Kuchenbecker, K. J.

Work-in-progress paper (3 pages) to be presented at the IEEE Haptics Symposium, Washington, DC, USA, March 2020 (misc) Accepted

Abstract
We present a fabric-based piezoresistive tactile sensor system designed to detect social touch gestures on a robot. The unique sensor design utilizes three layers of low-conductivity fabric sewn together on alternating edges to form an accordion pattern and secured between two outer high-conductivity layers. This five-layer design demonstrates a greater resistance range and better low-force sensitivity than previous designs that use one layer of low-conductivity fabric with or without a plastic mesh layer. An individual sensor from our system can presently identify six different communication gestures – squeezing, patting, scratching, poking, hand resting without movement, and no touch – with an average accuracy of 90%. A layer of foam can be added beneath the sensor to make a rigid robot more appealing for humans to touch without inhibiting the system’s ability to register social touch gestures.

hi

Project Page [BibTex]



Do Touch Gestures Affect How Electrovibration Feels?

Vardar, Y., Kuchenbecker, K. J.

Hands-on demonstration (1 page) presented at the IEEE Haptics Symposium, Washington, DC, USA, March 2020 (misc) Accepted

hi

[BibTex]



Learning to Predict Perceptual Distributions of Haptic Adjectives

Richardson, B. A., Kuchenbecker, K. J.

Frontiers in Neurorobotics, 13(116):1-16, February 2020 (article)

Abstract
When humans touch an object with their fingertips, they can immediately describe its tactile properties using haptic adjectives, such as hardness and roughness; however, human perception is subjective and noisy, with significant variation across individuals and interactions. Recent research has worked to provide robots with similar haptic intelligence but was focused on identifying binary haptic adjectives, ignoring both attribute intensity and perceptual variability. Combining ordinal haptic adjective labels gathered from human subjects for a set of 60 objects with features automatically extracted from raw multi-modal tactile data collected by a robot repeatedly touching the same objects, we designed a machine-learning method that incorporates partial knowledge of the distribution of object labels into training; then, from a single interaction, it predicts a probability distribution over the set of ordinal labels. In addition to analyzing the collected labels (10 basic haptic adjectives) and demonstrating the quality of our method's predictions, we hold out specific features to determine the influence of individual sensor modalities on the predictive performance for each adjective. Our results demonstrate the feasibility of modeling both the intensity and the variation of haptic perception, two crucial yet previously neglected components of human haptic perception.

hi

DOI [BibTex]



Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations

Rueegg, N., Lassner, C., Black, M. J., Schindler, K.

In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), February 2020 (inproceedings)

Abstract
The goal of many computer vision systems is to transform image pixels into 3D representations. Recent popular models use neural networks to regress directly from pixels to 3D object parameters. Such an approach works well when supervision is available, but in problems like human pose and shape estimation, it is difficult to obtain natural images with 3D ground truth. To go one step further, we propose a new architecture that facilitates unsupervised, or lightly supervised, learning. The idea is to break the problem into a series of transformations between increasingly abstract representations. Each step involves a cycle designed to be learnable without annotated training data, and the chain of cycles delivers the final solution. Specifically, we use 2D body part segments as an intermediate representation that contains enough information to be lifted to 3D, and at the same time is simple enough to be learned in an unsupervised way. We demonstrate the method by learning 3D human pose and shape from un-paired and un-annotated images. We also explore varying amounts of paired data and show that cycling greatly alleviates the need for paired data. While we present results for modeling humans, our formulation is general and can be applied to other vision problems.

ps

pdf [BibTex]



Exercising with Baxter: Preliminary Support for Assistive Social-Physical Human-Robot Interaction

Fitter, N. T., Mohan, M., Kuchenbecker, K. J., Johnson, M. J.

Journal of NeuroEngineering and Rehabilitation, 17(19), February 2020 (article)

Abstract
Background: The worldwide population of older adults will soon exceed the capacity of assisted living facilities. Accordingly, we aim to understand whether appropriately designed robots could help older adults stay active at home. Methods: Building on related literature as well as guidance from experts in game design, rehabilitation, and physical and occupational therapy, we developed eight human-robot exercise games for the Baxter Research Robot, six of which involve physical human-robot contact. After extensive iteration, these games were tested in an exploratory user study including 20 younger adult and 20 older adult users. Results: Only socially and physically interactive games fell in the highest ranges for pleasantness, enjoyment, engagement, cognitive challenge, and energy level. Our games successfully spanned three different physical, cognitive, and temporal challenge levels. User trust and confidence in Baxter increased significantly between pre- and post-study assessments. Older adults experienced higher exercise, energy, and engagement levels than younger adults, and women rated the robot more highly than men on several survey questions. Conclusions: The results indicate that social-physical exercise with a robot is more pleasant, enjoyable, engaging, cognitively challenging, and energetic than similar interactions that lack physical touch. In addition to this main finding, researchers working in similar areas can build on our design practices, our open-source resources, and the age-group and gender differences that we found.

hi

DOI [BibTex]



Learning Multi-Human Optical Flow

Ranjan, A., Hoffmann, D. T., Tzionas, D., Tang, S., Romero, J., Black, M. J.

International Journal of Computer Vision (IJCV), January 2020 (article)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data they use does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body and motion capture data to synthesize realistic flow fields in both single- and multi-person images. We then train optical flow networks to estimate human flow fields from pairs of images. We demonstrate that our trained networks are more accurate than a wide range of top methods on held-out test data and that they can generalize well to real image sequences. The code, trained models and the dataset are available for research.

ps

Paper Publisher Version poster link (url) DOI [BibTex]


General Movement Assessment from videos of computed 3D infant body models is equally effective compared to conventional RGB Video rating

Schroeder, S., Hesse, N., Weinberger, R., Tacke, U., Gerstl, L., Hilgendorff, A., Heinen, F., Arens, M., Bodensteiner, C., Dijkstra, L. J., Pujades, S., Black, M., Hadders-Algra, M.

Early Human Development, 2020 (article)

Abstract
Background: General Movement Assessment (GMA) is a powerful tool to predict Cerebral Palsy (CP). Yet, GMA requires substantial training hampering its implementation in clinical routine. This inspired a world-wide quest for automated GMA. Aim: To test whether a low-cost, marker-less system for three-dimensional motion capture from RGB depth sequences using a whole body infant model may serve as the basis for automated GMA. Study design: Clinical case study at an academic neurodevelopmental outpatient clinic. Subjects: Twenty-nine high-risk infants were recruited and assessed at their clinical follow-up at 2-4 month corrected age (CA). Their neurodevelopmental outcome was assessed regularly up to 12-31 months CA. Outcome measures: GMA according to Hadders-Algra by a masked GMA-expert of conventional and computed 3D body model (“SMIL motion”) videos of the same GMs. Agreement between both GMAs was assessed, and sensitivity and specificity of both methods to predict CP at ≥12 months CA. Results: The agreement of the two GMA ratings was substantial, with κ=0.66 for the classification of definitely abnormal (DA) GMs and an ICC of 0.887 (95% CI 0.762;0.947) for a more detailed GM-scoring. Five children were diagnosed with CP (four bilateral, one unilateral CP). The GMs of the child with unilateral CP were twice rated as mildly abnormal. DA-ratings of both videos predicted bilateral CP well: sensitivity 75% and 100%, specificity 88% and 92% for conventional and SMIL motion videos, respectively. Conclusions: Our computed infant 3D full body model is an attractive starting point for automated GMA in infants at risk of CP.

ps

[BibTex]
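
The agreement statistic κ reported in the abstract above is Cohen's kappa, which corrects the raw agreement between two sets of ratings for the agreement expected by chance. The sketch below shows the standard computation on made-up binary ratings. The data are hypothetical and are not taken from the study, and the function name `cohen_kappa` is chosen for illustration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items rated identically.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two marginal label frequencies, summed over labels.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (1 = definitely abnormal, 0 = otherwise) of the same four
# recordings under two viewing conditions; these values are invented.
conventional = [1, 1, 0, 0]
computed_model = [1, 0, 0, 0]
kappa = cohen_kappa(conventional, computed_model)  # 0.5 for this toy data
```

For the toy data, observed agreement is 0.75 and chance agreement is 0.5, which gives κ = 0.5.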



Physical Variables Underlying Tactile Stickiness during Fingerpad Detachment

Nam, S., Vardar, Y., Gueorguiev, D., Kuchenbecker, K. J.

Frontiers in Neuroscience, 2020 (article) Accepted

Abstract
One may notice a relatively wide range of tactile sensations even when touching the same hard, flat surface in similar ways. Little is known about the reasons for this variability, so we decided to investigate how the perceptual intensity of light stickiness relates to the physical interaction between the skin and the surface. We conducted a psychophysical experiment in which nine participants actively pressed their finger on a flat glass plate with a normal force close to 1.5 N and detached it after a few seconds. A custom-designed apparatus recorded the contact force vector and the finger contact area during each interaction as well as pre- and post-trial finger moisture. After detaching their finger, participants judged the stickiness of the glass using a nine-point scale. We explored how sixteen physical variables derived from the recorded data correlate with each other and with the stickiness judgments of each participant. These analyses indicate that stickiness perception mainly depends on the pre-detachment pressing duration, the time taken for the finger to detach, and the impulse in the normal direction after the normal force changes sign; finger-surface adhesion seems to build with pressing time, causing a larger normal impulse during detachment and thus a more intense stickiness sensation. We additionally found a strong between-subjects correlation between maximum real contact area and peak pull-off force, as well as between finger moisture and impulse.

hi

[BibTex]

2002


Inferring hand motion from multi-cell recordings in motor cortex using a Kalman filter

Wu, W., Black, M. J., Gao, Y., Bienenstock, E., Serruya, M., Donoghue, J. P.

In SAB’02-Workshop on Motor Control in Humans and Robots: On the Interplay of Real Brains and Artificial Devices, pages: 66-73, Edinburgh, Scotland (UK), August 2002 (inproceedings)

ps

pdf [BibTex]



Bayesian Inference of Visual Motion Boundaries

Fleet, D. J., Black, M. J., Nestares, O.

In Exploring Artificial Intelligence in the New Millennium, pages: 139-174, (Editors: Lakemeyer, G. and Nebel, B.), Morgan Kaufmann Pub., July 2002 (incollection)

Abstract
This chapter addresses an open problem in visual motion analysis, the estimation of image motion in the vicinity of occlusion boundaries. With a Bayesian formulation, local image motion is explained in terms of multiple, competing, nonlinear models, including models for smooth (translational) motion and for motion boundaries. The generative model for motion boundaries explicitly encodes the orientation of the boundary, the velocities on either side, the motion of the occluding edge over time, and the appearance/disappearance of pixels at the boundary. We formulate the posterior probability distribution over the models and model parameters, conditioned on the image sequence. Approximate inference is achieved with a combination of tools: A Bayesian filter provides for online computation; factored sampling allows us to represent multimodal non-Gaussian distributions and to propagate beliefs with nonlinear dynamics from one time to the next; and mixture models are used to simplify the computation of joint prediction distributions in the Bayesian filter. To efficiently represent such a high-dimensional space, we also initialize samples using the responses of a low-level motion-discontinuity detector. The basic formulation and computational model provide a general probabilistic framework for motion estimation with multiple, nonlinear models.

ps

pdf [BibTex]



Inferring hand motion from multi-cell recordings in motor cortex using a Kalman filter

Wu, W., Black, M., Gao, Y., Bienenstock, E., Serruya, M., Donoghue, J.

Program No. 357.5. 2002 Abstract Viewer/Itinerary Planner, Society for Neuroscience, Washington, DC, 2002, Online (conference)

ps

abstract [BibTex]



Probabilistic inference of hand motion from neural activity in motor cortex

Gao, Y., Black, M. J., Bienenstock, E., Shoham, S., Donoghue, J.

In Advances in Neural Information Processing Systems 14, pages: 221-228, MIT Press, 2002 (inproceedings)

Abstract
Statistical learning and probabilistic inference techniques are used to infer the hand position of a subject from multi-electrode recordings of neural activity in motor cortex. First, an array of electrodes provides training data of neural firing conditioned on hand kinematics. We learn a non-parametric representation of this firing activity using a Bayesian model and rigorously compare it with previous models using cross-validation. Second, we infer a posterior probability distribution over hand motion conditioned on a sequence of neural test data using Bayesian inference. The learned firing models of multiple cells are used to define a non-Gaussian likelihood term which is combined with a prior probability for the kinematics. A particle filtering method is used to represent, update, and propagate the posterior distribution over time. The approach is compared with traditional linear filtering methods; the results suggest that it may be appropriate for neural prosthetic applications.

ps

pdf [BibTex]
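
The represent, update, and propagate cycle of particle filtering described in the abstract above can be sketched with a generic bootstrap particle filter. This is not the authors' decoder: the state is a 1D random walk, the likelihood is a simple Gaussian standing in for the learned non-Gaussian firing model, and all parameter values and the function name `particle_filter` are assumptions made for illustration.

```python
import math
import random


def particle_filter(observations, n_particles, process_std, obs_std, rng):
    """Bootstrap particle filter for a 1D random-walk state (illustrative only)."""
    # Represent: the posterior is approximated by a set of samples (particles).
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate: move each particle through the assumed random-walk dynamics.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Update: weight each particle by the likelihood of the new observation.
        weights = [math.exp(-0.5 * ((y - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:  # all weights underflowed; fall back to uniform weights
            weights = [1.0] * n_particles
            total = float(n_particles)
        # Posterior mean estimate from the weighted particles.
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


rng = random.Random(0)
# Synthetic ground truth and noisy observations (hypothetical data).
truth, observations = [], []
x = 0.0
for _ in range(50):
    x += rng.gauss(0.0, 0.1)
    truth.append(x)
    observations.append(x + rng.gauss(0.0, 0.5))

estimates = particle_filter(observations, 500, 0.1, 0.5, rng)
```

Because the posterior is carried as a set of samples, swapping the Gaussian weight for any other likelihood function leaves the rest of the loop unchanged, which is what makes this family of methods usable with non-Gaussian likelihoods.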



Automatic detection and tracking of human motion with a view-based representation

Fablet, R., Black, M. J.

In European Conf. on Computer Vision, ECCV 2002, 1, pages: 476-491, LNCS 2353, (Editors: A. Heyden and G. Sparr and M. Nielsen and P. Johansen), Springer-Verlag, 2002 (inproceedings)

Abstract
This paper proposes a solution for the automatic detection and tracking of human motion in image sequences. Due to the complexity of the human body and its motion, automatic detection of 3D human motion remains an open, and important, problem. Existing approaches for automatic detection and tracking focus on 2D cues and typically exploit object appearance (color distribution, shape) or knowledge of a static background. In contrast, we exploit 2D optical flow information which provides rich descriptive cues, while being independent of object and background appearance. To represent the optical flow patterns of people from arbitrary viewpoints, we develop a novel representation of human motion using low-dimensional spatio-temporal models that are learned using motion capture data of human subjects. In addition to human motion (the foreground) we probabilistically model the motion of generic scenes (the background); these statistical models are defined as Gibbsian fields specified from the first-order derivatives of motion observations. Detection and tracking are posed in a principled Bayesian framework which involves the computation of a posterior probability distribution over the model parameters (i.e., the location and the type of the human motion) given a sequence of optical flow observations. Particle filtering is used to represent and predict this non-Gaussian posterior distribution over time. The model parameters of samples from this distribution are related to the pose parameters of a 3D articulated model (e.g. the approximate joint angles and movement direction). Thus the approach proves suitable for initializing more complex probabilistic models of human motion. As shown by experiments on real image sequences, our method is able to detect and track people under different viewpoints with complex backgrounds.

ps

pdf [BibTex]



A layered motion representation with occlusion and compact spatial support

Fleet, D. J., Jepson, A., Black, M. J.

In European Conf. on Computer Vision, ECCV 2002, 1, pages: 692-706, LNCS 2353, (Editors: A. Heyden and G. Sparr and M. Nielsen and P. Johansen), Springer-Verlag, 2002 (inproceedings)

Abstract
We describe a 2.5D layered representation for visual motion analysis. The representation provides a global interpretation of image motion in terms of several spatially localized foreground regions along with a background region. Each of these regions comprises a parametric shape model and a parametric motion model. The representation also contains depth ordering so visibility and occlusion are rightly included in the estimation of the model parameters. Finally, because the number of objects, their positions, shapes and sizes, and their relative depths are all unknown, initial models are drawn from a proposal distribution, and then compared using a penalized likelihood criterion. This allows us to automatically initialize new models, and to compare different depth orderings.

ps

pdf [BibTex]



Implicit probabilistic models of human motion for synthesis and tracking

Sidenbladh, H., Black, M. J., Sigal, L.

In European Conf. on Computer Vision, 1, pages: 784-800, 2002 (inproceedings)

Abstract
This paper addresses the problem of probabilistically modeling 3D human motion for synthesis and tracking. Given the high dimensional nature of human motion, learning an explicit probabilistic model from available training data is currently impractical. Instead we exploit methods from texture synthesis that treat images as representing an implicit empirical distribution. These methods replace the problem of representing the probability of a texture pattern with that of searching the training data for similar instances of that pattern. We extend this idea to temporal data representing 3D human motion with a large database of example motions. To make the method useful in practice, we must address the problem of efficient search in a large training set; efficiency is particularly important for tracking. Towards that end, we learn a low dimensional linear model of human motion that is used to structure the example motion database into a binary tree. An approximate probabilistic tree search method exploits the coefficients of this low-dimensional representation and runs in sub-linear time. This probabilistic tree search returns a particular sample human motion with probability approximating the true distribution of human motions in the database. This sampling method is suitable for use with particle filtering techniques and is applied to articulated 3D tracking of humans within a Bayesian framework. Successful tracking results are presented, along with examples of synthesizing human motion using the model.

ps

pdf [BibTex]



Robust parameterized component analysis: Theory and applications to 2D facial modeling

De la Torre, F., Black, M. J.

In European Conf. on Computer Vision, ECCV 2002, 4, pages: 653-669, LNCS 2353, Springer-Verlag, 2002 (inproceedings)

ps

pdf [BibTex]


2001


Dynamic coupled component analysis

De la Torre, F., Black, M. J.

In IEEE Proc. Computer Vision and Pattern Recognition, CVPR’01, 2, pages: 643-650, IEEE, Kauai, Hawaii, December 2001 (inproceedings)

ps

pdf [BibTex]



Robust principal component analysis for computer vision

De la Torre, F., Black, M. J.

In Int. Conf. on Computer Vision, ICCV-2001, II, pages: 362-369, Vancouver, BC, USA, 2001 (inproceedings)

ps

pdf Project Page [BibTex]



Learning image statistics for Bayesian tracking

Sidenbladh, H., Black, M. J.

In Int. Conf. on Computer Vision, ICCV-2001, II, pages: 709-716, Vancouver, BC, USA, 2001 (inproceedings)

ps

pdf [BibTex]


Encoding/decoding of arm kinematics from simultaneously recorded MI neurons

Gao, Y., Bienenstock, E., Black, M., Shoham, S., Serruya, M., Donoghue, J.

Society for Neuroscience Abst. Vol. 27, Program No. 572.14, 2001 (conference)

ps

abstract [BibTex]


Learning and tracking cyclic human motion

Ormoneit, D., Sidenbladh, H., Black, M. J., Hastie, T.

In Advances in Neural Information Processing Systems 13, NIPS, pages: 894-900, (Editors: Leen, Todd K. and Dietterich, Thomas G. and Tresp, Volker), The MIT Press, 2001 (inproceedings)

ps

pdf [BibTex]


Robust estimation of multiple surface shapes from occluded textures

Black, M. J., Rosenholtz, R.

In International Symposium on Computer Vision, pages: 485-490, Miami, FL, November 1995 (inproceedings)

ps

pdf [BibTex]


The PLAYBOT Project

Tsotsos, J. K., Dickinson, S., Jenkin, M., Milios, E., Jepson, A., Down, B., Amdur, E., Stevenson, S., Black, M., Metaxas, D., Cooperstock, J., Culhane, S., Nuflo, F., Verghese, G., Wai, W., Wilkes, D., Ye, Y.

In Proc. IJCAI Workshop on AI Applications for Disabled People, Montreal, August 1995 (inproceedings)

ps

abstract [BibTex]


Recognizing facial expressions under rigid and non-rigid facial motions using local parametric models of image motion

Black, M. J., Yacoob, Y.

In International Workshop on Automatic Face- and Gesture-Recognition, Zurich, July 1995 (inproceedings)

ps

video abstract [BibTex]


Image segmentation using robust mixture models

Black, M. J., Jepson, A. D.

US Pat. 5,802,203, June 1995 (patent)

ps

pdf on-line at USPTO [BibTex]


Tracking and recognizing rigid and non-rigid facial motions using local parametric models of image motion

Black, M. J., Yacoob, Y.

In Fifth International Conf. on Computer Vision, ICCV’95, pages: 374-381, Boston, MA, June 1995 (inproceedings)

Abstract
This paper explores the use of local parameterized models of image motion for recovering and recognizing the non-rigid and articulated motion of human faces. Parametric flow models (for example, affine) are popular for estimating motion in rigid scenes. We observe that within local regions in space and time, such models not only accurately model non-rigid facial motions but also provide a concise description of the motion in terms of a small number of parameters. These parameters are intuitively related to the motion of facial features during facial expressions, and we show how expressions such as anger, happiness, surprise, fear, disgust, and sadness can be recognized from the local parametric motions in the presence of significant head motion. The motion tracking and expression recognition approach performs with high accuracy in extensive laboratory experiments involving 40 subjects as well as in television and movie sequences.

ps

pdf video publisher site [BibTex]


A computational model for shape from texture for multiple textures

Black, M. J., Rosenholtz, R.

Investigative Ophthalmology and Visual Science Supplement, Vol. 36, No. 4, pages: 2202, March 1995 (conference)

ps

abstract [BibTex]

1991


Dynamic motion estimation and feature extraction over long image sequences

Black, M. J., Anandan, P.

In Proc. IJCAI Workshop on Dynamic Scene Understanding, Sydney, Australia, August 1991 (inproceedings)

ps

[BibTex]


Robust dynamic motion estimation over time

(IEEE Computer Society Outstanding Paper Award)

Black, M. J., Anandan, P.

In Proc. Computer Vision and Pattern Recognition, CVPR-91, pages: 296-302, Maui, Hawaii, June 1991 (inproceedings)

Abstract
This paper presents a novel approach to incrementally estimating visual motion over a sequence of images. We start by formulating constraints on image motion to account for the possibility of multiple motions. This is achieved by exploiting the notions of weak continuity and robust statistics in the formulation of the minimization problem. The resulting objective function is non-convex. Traditional stochastic relaxation techniques for minimizing such functions prove inappropriate for the task. We present a highly parallel incremental stochastic minimization algorithm which has a number of advantages over previous approaches. The incremental nature of the scheme makes it truly dynamic and permits the detection of occlusion and disocclusion boundaries.

ps

pdf video abstract [BibTex]
