Header logo is


2014


Thumb xl grassmann
Grassmann Averages for Scalable Robust PCA

Hauberg, S., Feragen, A., Black, M. J.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 3810 -3817, Columbus, Ohio, USA, IEEE International Conference on Computer Vision and Pattern Recognition, June 2014 (inproceedings)

Abstract
As the collection of large datasets becomes increasingly automated, the occurrence of outliers will increase – "big data" implies "big outliers". While principal component analysis (PCA) is often used to reduce the size of data, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA do not scale beyond small-to-medium sized datasets. To address this, we introduce the Grassmann Average (GA), which expresses dimensionality reduction as an average of the subspaces spanned by the data. Because averages can be efficiently computed, we immediately gain scalability. GA is inherently more robust than PCA, but we show that they coincide for Gaussian data. We exploit that averages can be made robust to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. Robustness can be with respect to vectors (subspaces) or elements of vectors; we focus on the latter and use a trimmed average. The resulting Trimmed Grassmann Average (TGA) is particularly appropriate for computer vision because it is robust to pixel outliers. The algorithm has low computational complexity and minimal memory requirements, making it scalable to "big noisy data." We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie.

ps

pdf code supplementary material tutorial video results video talk poster DOI Project Page [BibTex]

2014


pdf code supplementary material tutorial video results video talk poster DOI Project Page [BibTex]


Thumb xl 3basic posebits
Posebits for Monocular Human Pose Estimation

Pons-Moll, G., Fleet, D. J., Rosenhahn, B.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 2345-2352, Columbus, Ohio, USA, IEEE International Conference on Computer Vision and Pattern Recognition, June 2014 (inproceedings)

Abstract
We advocate the inference of qualitative information about 3D human pose, called posebits, from images. Posebits represent boolean geometric relationships between body parts (e.g., left-leg in front of right-leg or hands close to each other). The advantages of posebits as a mid-level representation are 1) for many tasks of interest, such qualitative pose information may be sufficient (e.g. , semantic image retrieval), 2) it is relatively easy to annotate large image corpora with posebits, as it simply requires answers to yes/no questions; and 3) they help resolve challenging pose ambiguities and therefore facilitate the difficult talk of image-based 3D pose estimation. We introduce posebits, a posebit database, a method for selecting useful posebits for pose estimation and a structural SVM model for posebit inference. Experiments show the use of posebits for semantic image retrieval and for improving 3D pose estimation.

ps

pdf Project Page Project Page [BibTex]

pdf Project Page Project Page [BibTex]


Thumb xl roser
Simultaneous Underwater Visibility Assessment, Enhancement and Improved Stereo

Roser, M., Dunbabin, M., Geiger, A.

IEEE International Conference on Robotics and Automation, pages: 3840 - 3847 , Hong Kong, China, IEEE International Conference on Robotics and Automation, June 2014 (conference)

Abstract
Vision-based underwater navigation and obstacle avoidance demands robust computer vision algorithms, particularly for operation in turbid water with reduced visibility. This paper describes a novel method for the simultaneous underwater image quality assessment, visibility enhancement and disparity computation to increase stereo range resolution under dynamic, natural lighting and turbid conditions. The technique estimates the visibility properties from a sparse 3D map of the original degraded image using a physical underwater light attenuation model. Firstly, an iterated distance-adaptive image contrast enhancement enables a dense disparity computation and visibility estimation. Secondly, using a light attenuation model for ocean water, a color corrected stereo underwater image is obtained along with a visibility distance estimate. Experimental results in shallow, naturally lit, high-turbidity coastal environments show the proposed technique improves range estimation over the original images as well as image quality and color for habitat classification. Furthermore, the recursiveness and robustness of the technique allows real-time implementation onboard an Autonomous Underwater Vehicles for improved navigation and obstacle avoidance performance.

avg ps

pdf DOI [BibTex]

pdf DOI [BibTex]


Thumb xl icmlteaser
Preserving Modes and Messages via Diverse Particle Selection

Pacheco, J., Zuffi, S., Black, M. J., Sudderth, E.

In Proceedings of the 31st International Conference on Machine Learning (ICML-14), 32(1):1152-1160, J. Machine Learning Research Workshop and Conf. and Proc., Beijing, China, International Conference on Machine Learning (ICML), June 2014 (inproceedings)

Abstract
In applications of graphical models arising in domains such as computer vision and signal processing, we often seek the most likely configurations of high-dimensional, continuous variables. We develop a particle-based max-product algorithm which maintains a diverse set of posterior mode hypotheses, and is robust to initialization. At each iteration, the set of hypotheses at each node is augmented via stochastic proposals, and then reduced via an efficient selection algorithm. The integer program underlying our optimization-based particle selection minimizes errors in subsequent max-product message updates. This objective automatically encourages diversity in the maintained hypotheses, without requiring tuning of application-specific distances among hypotheses. By avoiding the stochastic resampling steps underlying particle sum-product algorithms, we also avoid common degeneracies where particles collapse onto a single hypothesis. Our approach significantly outperforms previous particle-based algorithms in experiments focusing on the estimation of human pose from single images.

ps

pdf SupMat link (url) Project Page Project Page [BibTex]

pdf SupMat link (url) Project Page Project Page [BibTex]


Thumb xl schoenbein
Calibrating and Centering Quasi-Central Catadioptric Cameras

Schoenbein, M., Strauss, T., Geiger, A.

IEEE International Conference on Robotics and Automation, pages: 4443 - 4450, Hong Kong, China, IEEE International Conference on Robotics and Automation, June 2014 (conference)

Abstract
Non-central catadioptric models are able to cope with irregular camera setups and inaccuracies in the manufacturing process but are computationally demanding and thus not suitable for robotic applications. On the other hand, calibrating a quasi-central (almost central) system with a central model introduces errors due to a wrong relationship between the viewing ray orientations and the pixels on the image sensor. In this paper, we propose a central approximation to quasi-central catadioptric camera systems that is both accurate and efficient. We observe that the distance to points in 3D is typically large compared to deviations from the single viewpoint. Thus, we first calibrate the system using a state-of-the-art non-central camera model. Next, we show that by remapping the observations we are able to match the orientation of the viewing rays of a much simpler single viewpoint model with the true ray orientations. While our approximation is general and applicable to all quasi-central camera systems, we focus on one of the most common cases in practice: hypercatadioptric cameras. We compare our model to a variety of baselines in synthetic and real localization and motion estimation experiments. We show that by using the proposed model we are able to achieve near non-central accuracy while obtaining speed-ups of more than three orders of magnitude compared to state-of-the-art non-central models.

avg ps

pdf DOI [BibTex]

pdf DOI [BibTex]


Thumb xl pami
3D Traffic Scene Understanding from Movable Platforms

Geiger, A., Lauer, M., Wojek, C., Stiller, C., Urtasun, R.

IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 36(5):1012-1025, published, IEEE, Los Alamitos, CA, May 2014 (article)

Abstract
In this paper, we present a novel probabilistic generative model for multi-object traffic scene understanding from movable platforms which reasons jointly about the 3D scene layout as well as the location and orientation of objects in the scene. In particular, the scene topology, geometry and traffic activities are inferred from short video sequences. Inspired by the impressive driving capabilities of humans, our model does not rely on GPS, lidar or map knowledge. Instead, it takes advantage of a diverse set of visual cues in the form of vehicle tracklets, vanishing points, semantic scene labels, scene flow and occupancy grids. For each of these cues we propose likelihood functions that are integrated into a probabilistic generative model. We learn all model parameters from training data using contrastive divergence. Experiments conducted on videos of 113 representative intersections show that our approach successfully infers the correct layout in a variety of very challenging scenarios. To evaluate the importance of each feature cue, experiments using different feature combinations are conducted. Furthermore, we show how by employing context derived from the proposed method we are able to improve over the state-of-the-art in terms of object detection and object orientation estimation in challenging and cluttered urban environments.

avg ps

pdf link (url) [BibTex]

pdf link (url) [BibTex]


Thumb xl blueman cropped2
Modeling the Human Body in 3D: Data Registration and Human Shape Representation

Tsoli, A.

Brown University, Department of Computer Science, May 2014 (phdthesis)

ps

pdf [BibTex]

pdf [BibTex]


Thumb xl modeltransport
Model transport: towards scalable transfer learning on manifolds - supplemental material

Freifeld, O., Hauberg, S., Black, M. J.

(9), April 2014 (techreport)

Abstract
This technical report is complementary to "Model Transport: Towards Scalable Transfer Learning on Manifolds" and contains proofs, explanation of the attached video (visualization of bases from the body shape experiments), and high-resolution images of select results of individual reconstructions from the shape experiments. It is identical to the supplemental mate- rial submitted to the Conference on Computer Vision and Pattern Recognition (CVPR 2014) on November 2013.

ps

PDF [BibTex]


Thumb xl aistats2014
Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics

Hennig, P., Hauberg, S.

In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, 33, pages: 347-355, JMLR: Workshop and Conference Proceedings, (Editors: S Kaski and J Corander), Microtome Publishing, Brookline, MA, AISTATS, April 2014 (inproceedings)

Abstract
We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine learning algorithms.

ei ps pn

pdf Youtube Supplements Project page link (url) [BibTex]

pdf Youtube Supplements Project page link (url) [BibTex]


Thumb xl thumb
Multi-View Priors for Learning Detectors from Sparse Viewpoint Data

Pepik, B., Stark, M., Gehler, P., Schiele, B.

International Conference on Learning Representations, International Conference on Learning Representations (ICLR), April 2014 (conference)

Abstract
While the majority of today's object class models provide only 2D bounding boxes, far richer output hypotheses are desirable including viewpoint, fine-grained category, and 3D geometry estimate. However, models trained to provide richer output require larger amounts of training data, preferably well covering the relevant aspects such as viewpoint and fine-grained categories. In this paper, we address this issue from the perspective of transfer learning, and design an object class model that explicitly leverages correlations between visual features. Specifically, our model represents prior distributions over permissible multi-view detectors in a parametric way -- the priors are learned once from training data of a source object class, and can later be used to facilitate the learning of a detector for a target class. As we show in our experiments, this transfer is not only beneficial for detectors based on basic-level category representations, but also enables the robust learning of detectors that represent classes at finer levels of granularity, where training data is typically even scarcer and more unbalanced. As a result, we report largely improved performance in simultaneous 2D object localization and viewpoint estimation on a recent dataset of challenging street scenes.

ps

reviews pdf Project Page [BibTex]

reviews pdf Project Page [BibTex]


Thumb xl homerjournal
Adaptive Offset Correction for Intracortical Brain Computer Interfaces

Homer, M. L., Perge, J. A., Black, M. J., Harrison, M. T., Cash, S. S., Hochberg, L. R.

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 22(2):239-248, March 2014 (article)

Abstract
Intracortical brain computer interfaces (iBCIs) decode intended movement from neural activity for the control of external devices such as a robotic arm. Standard approaches include a calibration phase to estimate decoding parameters. During iBCI operation, the statistical properties of the neural activity can depart from those observed during calibration, sometimes hindering a user’s ability to control the iBCI. To address this problem, we adaptively correct the offset terms within a Kalman filter decoder via penalized maximum likelihood estimation. The approach can handle rapid shifts in neural signal behavior (on the order of seconds) and requires no knowledge of the intended movement. The algorithm, called MOCA, was tested using simulated neural activity and evaluated retrospectively using data collected from two people with tetraplegia operating an iBCI. In 19 clinical research test cases, where a nonadaptive Kalman filter yielded relatively high decoding errors, MOCA significantly reduced these errors (10.6 ± 10.1\%; p < 0.05, pairwise t-test). MOCA did not significantly change the error in the remaining 23 cases where a nonadaptive Kalman filter already performed well. These results suggest that MOCA provides more robust decoding than the standard Kalman filter for iBCIs.

ps

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl aggteaser
Model-based Anthropometry: Predicting Measurements from 3D Human Scans in Multiple Poses

Tsoli, A., Loper, M., Black, M. J.

In Proceedings Winter Conference on Applications of Computer Vision, pages: 83-90, IEEE , IEEE Winter Conference on Applications of Computer Vision (WACV) , March 2014 (inproceedings)

Abstract
Extracting anthropometric or tailoring measurements from 3D human body scans is important for applications such as virtual try-on, custom clothing, and online sizing. Existing commercial solutions identify anatomical landmarks on high-resolution 3D scans and then compute distances or circumferences on the scan. Landmark detection is sensitive to acquisition noise (e.g. holes) and these methods require subjects to adopt a specific pose. In contrast, we propose a solution we call model-based anthropometry. We fit a deformable 3D body model to scan data in one or more poses; this model-based fitting is robust to scan noise. This brings the scan into registration with a database of registered body scans. Then, we extract features from the registered model (rather than from the scan); these include, limb lengths, circumferences, and statistical features of global shape. Finally, we learn a mapping from these features to measurements using regularized linear regression. We perform an extensive evaluation using the CAESAR dataset and demonstrate that the accuracy of our method outperforms state-of-the-art methods.

ps

pdf DOI Project Page Project Page [BibTex]

pdf DOI Project Page Project Page [BibTex]


Thumb xl freelymoving2
A freely-moving monkey treadmill model

Foster, J., Nuyujukian, P., Freifeld, O., Gao, H., Walker, R., Ryu, S., Meng, T., Murmann, B., Black, M., Shenoy, K.

J. of Neural Engineering, 11(4):046020, 2014 (article)

Abstract
Objective: Motor neuroscience and brain-machine interface (BMI) design is based on examining how the brain controls voluntary movement, typically by recording neural activity and behavior from animal models. Recording technologies used with these animal models have traditionally limited the range of behaviors that can be studied, and thus the generality of science and engineering research. We aim to design a freely-moving animal model using neural and behavioral recording technologies that do not constrain movement. Approach: We have established a freely-moving rhesus monkey model employing technology that transmits neural activity from an intracortical array using a head-mounted device and records behavior through computer vision using markerless motion capture. We demonstrate the excitability and utility of this new monkey model, including the fi rst recordings from motor cortex while rhesus monkeys walk quadrupedally on a treadmill. Main results: Using this monkey model, we show that multi-unit threshold-crossing neural activity encodes the phase of walking and that the average ring rate of the threshold crossings covaries with the speed of individual steps. On a population level, we find that neural state-space trajectories of walking at diff erent speeds have similar rotational dynamics in some dimensions that evolve at the step rate of walking, yet robustly separate by speed in other state-space dimensions. Significance: Freely-moving animal models may allow neuroscientists to examine a wider range of behaviors and can provide a flexible experimental paradigm for examining the neural mechanisms that underlie movement generation across behaviors and environments. For BMIs, freely-moving animal models have the potential to aid prosthetic design by examining how neural encoding changes with posture, environment, and other real-world context changes. Understanding this new realm of behavior in more naturalistic settings is essential for overall progress of basic motor neuroscience and for the successful translation of BMIs to people with paralysis.

ps

pdf Supplementary DOI Project Page [BibTex]

pdf Supplementary DOI Project Page [BibTex]


Thumb xl simulated annealing
Simulated Annealing

Gall, J.

In Encyclopedia of Computer Vision, pages: 737-741, 0, (Editors: Ikeuchi, K. ), Springer Verlag, 2014, to appear (inbook)

ps

[BibTex]

[BibTex]


Thumb xl ijcvflow2
A Quantitative Analysis of Current Practices in Optical Flow Estimation and the Principles behind Them

Sun, D., Roth, S., Black, M. J.

International Journal of Computer Vision (IJCV), 106(2):115-137, 2014 (article)

Abstract
The accuracy of optical flow estimation algorithms has been improving steadily as evidenced by results on the Middlebury optical flow benchmark. The typical formulation, however, has changed little since the work of Horn and Schunck. We attempt to uncover what has made recent advances possible through a thorough analysis of how the objective function, the optimization method, and modern implementation practices influence accuracy. We discover that "classical'' flow formulations perform surprisingly well when combined with modern optimization and implementation techniques. One key implementation detail is the median filtering of intermediate flow fields during optimization. While this improves the robustness of classical methods it actually leads to higher energy solutions, meaning that these methods are not optimizing the original objective function. To understand the principles behind this phenomenon, we derive a new objective function that formalizes the median filtering heuristic. This objective function includes a non-local smoothness term that robustly integrates flow estimates over large spatial neighborhoods. By modifying this new term to include information about flow and image boundaries we develop a method that can better preserve motion details. To take advantage of the trend towards video in wide-screen format, we further introduce an asymmetric pyramid downsampling scheme that enables the estimation of longer range horizontal motions. The methods are evaluated on Middlebury, MPI Sintel, and KITTI datasets using the same parameter settings.

ps

pdf full text code [BibTex]

pdf full text code [BibTex]

2011


Thumb xl iccv2011homepageimage notext small
Home 3D body scans from noisy image and range data

Weiss, A., Hirshberg, D., Black, M.

In Int. Conf. on Computer Vision (ICCV), pages: 1951-1958, IEEE, Barcelona, November 2011 (inproceedings)

Abstract
The 3D shape of the human body is useful for applications in fitness, games and apparel. Accurate body scanners, however, are expensive, limiting the availability of 3D body models. We present a method for human shape reconstruction from noisy monocular image and range data using a single inexpensive commodity sensor. The approach combines low-resolution image silhouettes with coarse range data to estimate a parametric model of the body. Accurate 3D shape estimates are obtained by combining multiple monocular views of a person moving in front of the sensor. To cope with varying body pose, we use a SCAPE body model which factors 3D body shape and pose variations. This enables the estimation of a single consistent shape while allowing pose to vary. Additionally, we describe a novel method to minimize the distance between the projected 3D body contour and the image silhouette that uses analytic derivatives of the objective function. We propose a simple method to estimate standard body measurements from the recovered SCAPE model and show that the accuracy of our method is competitive with commercial body scanning systems costing orders of magnitude more.

ps

pdf YouTube poster Project Page Project Page [BibTex]

2011


pdf YouTube poster Project Page Project Page [BibTex]


Thumb xl lugano11small
Evaluating the Automated Alignment of 3D Human Body Scans

Hirshberg, D. A., Loper, M., Rachlin, E., Tsoli, A., Weiss, A., Corner, B., Black, M. J.

In 2nd International Conference on 3D Body Scanning Technologies, pages: 76-86, (Editors: D’Apuzzo, Nicola), Hometrica Consulting, Lugano, Switzerland, October 2011 (inproceedings)

Abstract
The statistical analysis of large corpora of human body scans requires that these scans be in alignment, either for a small set of key landmarks or densely for all the vertices in the scan. Existing techniques tend to rely on hand-placed landmarks or algorithms that extract landmarks from scans. The former is time consuming and subjective while the latter is error prone. Here we show that a model-based approach can align meshes automatically, producing alignment accuracy similar to that of previous methods that rely on many landmarks. Specifically, we align a low-resolution, artist-created template body mesh to many high-resolution laser scans. Our alignment procedure employs a robust iterative closest point method with a regularization that promotes smooth and locally rigid deformation of the template mesh. We evaluate our approach on 50 female body models from the CAESAR dataset that vary significantly in body shape. To make the method fully automatic, we define simple feature detectors for the head and ankles, which provide initial landmark locations. We find that, if body poses are fairly similar, as in CAESAR, the fully automated method provides dense alignments that enable statistical analysis and anthropometric measurement.

ps

pdf slides DOI Project Page [BibTex]

pdf slides DOI Project Page [BibTex]


Thumb xl sigalijcv11
Loose-limbed People: Estimating 3D Human Pose and Motion Using Non-parametric Belief Propagation

Sigal, L., Isard, M., Haussecker, H., Black, M. J.

International Journal of Computer Vision, 98(1):15-48, Springer Netherlands, May 2011 (article)

Abstract
We formulate the problem of 3D human pose estimation and tracking as one of inference in a graphical model. Unlike traditional kinematic tree representations, our model of the body is a collection of loosely-connected body-parts. In particular, we model the body using an undirected graphical model in which nodes correspond to parts and edges to kinematic, penetration, and temporal constraints imposed by the joints and the world. These constraints are encoded using pair-wise statistical distributions, that are learned from motion-capture training data. Human pose and motion estimation is formulated as inference in this graphical model and is solved using Particle Message Passing (PaMPas). PaMPas is a form of non-parametric belief propagation that uses a variation of particle filtering that can be applied over a general graphical model with loops. The loose-limbed model and decentralized graph structure allow us to incorporate information from "bottom-up" visual cues, such as limb and head detectors, into the inference process. These detectors enable automatic initialization and aid recovery from transient tracking failures. We illustrate the method by automatically tracking people in multi-view imagery using a set of calibrated cameras and present quantitative evaluation using the HumanEva dataset.

ps

pdf publisher's site link (url) Project Page Project Page [BibTex]

pdf publisher's site link (url) Project Page Project Page [BibTex]


Thumb xl pointclickimagewide
Point-and-Click Cursor Control With an Intracortical Neural Interface System by Humans With Tetraplegia

Kim, S., Simeral, J. D., Hochberg, L. R., Donoghue, J. P., Friehs, G. M., Black, M. J.

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 19(2):193-203, April 2011 (article)

Abstract
We present a point-and-click intracortical neural interface system (NIS) that enables humans with tetraplegia to volitionally move a 2D computer cursor in any desired direction on a computer screen, hold it still and click on the area of interest. This direct brain-computer interface extracts both discrete (click) and continuous (cursor velocity) signals from a single small population of neurons in human motor cortex. A key component of this system is a multi-state probabilistic decoding algorithm that simultaneously decodes neural spiking activity and outputs either a click signal or the velocity of the cursor. The algorithm combines a linear classifier, which determines whether the user is intending to click or move the cursor, with a Kalman filter that translates the neural population activity into cursor velocity. We present a paradigm for training the multi-state decoding algorithm using neural activity observed during imagined actions. Two human participants with tetraplegia (paralysis of the four limbs) performed a closed-loop radial target acquisition task using the point-and-click NIS over multiple sessions. We quantified point-and-click performance using various human-computer interaction measurements for pointing devices. We found that participants were able to control the cursor motion accurately and click on specified targets with a small error rate (< 3% in one participant). This study suggests that signals from a small ensemble of motor cortical neurons (~40) can be used for natural point-and-click 2D cursor control of a personal computer.

ps

pdf publishers's site pub med link (url) Project Page [BibTex]

pdf publishers's site pub med link (url) Project Page [BibTex]


Thumb xl middleburyimagesmall
A Database and Evaluation Methodology for Optical Flow

Baker, S., Scharstein, D., Lewis, J. P., Roth, S., Black, M. J., Szeliski, R.

International Journal of Computer Vision, 92(1):1-31, March 2011 (article)

Abstract
The quantitative evaluation of optical flow algorithms by Barron et al. (1994) led to significant advances in performance. The challenges for optical flow algorithms today go beyond the datasets and evaluation methods proposed in that paper. Instead, they center on problems associated with complex natural scenes, including nonrigid motion, real sensor noise, and motion discontinuities. We propose a new set of benchmarks and evaluation methods for the next generation of optical flow algorithms. To that end, we contribute four types of data to test different aspects of optical flow algorithms: (1) sequences with nonrigid motion where the ground-truth flow is determined by tracking hidden fluorescent texture, (2) realistic synthetic sequences, (3) high frame-rate video used to study interpolation error, and (4) modified stereo sequences of static scenes. In addition to the average angular error used by Barron et al., we compute the absolute flow endpoint error, measures for frame interpolation error, improved statistics, and results at motion discontinuities and in textureless regions. In October 2007, we published the performance of several well-known methods on a preliminary version of our data to establish the current state of the art. We also made the data freely available on the web at http://vision.middlebury.edu/flow/ . Subsequently a number of researchers have uploaded their results to our website and published papers using the data. A significant improvement in performance has already been achieved. In this paper we analyze the results obtained to date and draw a large number of conclusions from them.

ps

pdf pdf from publisher Middlebury Flow Evaluation Website [BibTex]

pdf pdf from publisher Middlebury Flow Evaluation Website [BibTex]


Thumb xl problem
Recovering Intrinsic Images with a Global Sparsity Prior on Reflectance

Gehler, P., Rother, C., Kiefel, M., Zhang, L., Schölkopf, B.

In Advances in Neural Information Processing Systems 24, pages: 765-773, (Editors: Shawe-Taylor, John and Zemel, Richard S. and Bartlett, Peter L. and Pereira, Fernando C. N. and Weinberger, Kilian Q.), Curran Associates, Inc., Red Hook, NY, USA, Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We address the challenging task of decoupling material properties from lighting properties given a single image. In the last two decades virtually all works have concentrated on exploiting edge information to address this problem. We take a different route by introducing a new prior on reflectance, that models reflectance values as being drawn from a sparse set of basis colors. This results in a Random Field model with global, latent variables (basis colors) and pixel-accurate output reflectance values. We show that without edge information high-quality results can be achieved, that are on par with methods exploiting this source of information. Finally, we are able to improve on state-of-the-art results by integrating edge information into our model. We believe that our new approach is an excellent starting point for future developments in this field.

ei ps

website + code pdf poster Project Page Project Page [BibTex]

website + code pdf poster Project Page Project Page [BibTex]


Thumb xl fosterembs2011
Combining wireless neural recording and video capture for the analysis of natural gait

Foster, J., Freifeld, O., Nuyujukian, P., Ryu, S., Black, M. J., Shenoy, K.

In Proc. 5th Int. IEEE EMBS Conf. on Neural Engineering, pages: 613-616, IEEE, 2011 (inproceedings)

ps

pdf Project Page [BibTex]

pdf Project Page [BibTex]


Thumb xl 1000dayimagesmall
Neural control of cursor trajectory and click by a human with tetraplegia 1000 days after implant of an intracortical microelectrode array

(J. Neural Engineering Highlights of 2011 Collection. JNE top 10 cited papers of 2010-2011.)

Simeral, J. D., Kim, S., Black, M. J., Donoghue, J. P., Hochberg, L. R.

J. of Neural Engineering, 8(2):025027, 2011 (article)

Abstract
The ongoing pilot clinical trial of the BrainGate neural interface system aims in part to assess the feasibility of using neural activity obtained from a small-scale, chronically implanted, intracortical microelectrode array to provide control signals for a neural prosthesis system. Critical questions include how long implanted microelectrodes will record useful neural signals, how reliably those signals can be acquired and decoded, and how effectively they can be used to control various assistive technologies such as computers and robotic assistive devices, or to enable functional electrical stimulation of paralyzed muscles. Here we examined these questions by assessing neural cursor control and BrainGate system characteristics on five consecutive days 1000 days after implant of a 4 × 4 mm array of 100 microelectrodes in the motor cortex of a human with longstanding tetraplegia subsequent to a brainstem stroke. On each of five prospectively-selected days we performed time-amplitude sorting of neuronal spiking activity, trained a population-based Kalman velocity decoding filter combined with a linear discriminant click state classifier, and then assessed closed-loop point-and-click cursor control. The participant performed both an eight-target center-out task and a random target Fitts metric task which was adapted from a human-computer interaction ISO standard used to quantify performance of computer input devices. The neural interface system was further characterized by daily measurement of electrode impedances, unit waveforms and local field potentials. Across the five days, spiking signals were obtained from 41 of 96 electrodes and were successfully decoded to provide neural cursor point-and-click control with a mean task performance of 91.3% ± 0.1% (mean ± s.d.) correct target acquisition. Results across five consecutive days demonstrate that a neural interface system based on an intracortical microelectrode array can provide repeatable, accurate point-and-click control of a computer interface to an individual with tetraplegia 1000 days after implantation of this sensor.

ps

pdf pdf from publisher link (url) Project Page [BibTex]


Thumb xl andriluka2011
Benchmark datasets for pose estimation and tracking

Andriluka, M., Sigal, L., Black, M. J.

In Visual Analysis of Humans: Looking at People, pages: 253-274, (Editors: Moesland and Hilton and Kr"uger and Sigal), Springer-Verlag, London, 2011 (incollection)

ps

publisher's site Project Page [BibTex]

publisher's site Project Page [BibTex]


Thumb xl dagm2011imagesmall
Shape and pose-invariant correspondences using probabilistic geodesic surface embedding

Tsoli, A., Black, M. J.

In 33rd Annual Symposium of the German Association for Pattern Recognition (DAGM), 6835, pages: 256-265, Lecture Notes in Computer Science, (Editors: Mester, Rudolf and Felsberg, Michael), Springer, 2011 (inproceedings)

Abstract
Correspondence between non-rigid deformable 3D objects provides a foundation for object matching and retrieval, recognition, and 3D alignment. Establishing 3D correspondence is challenging when there are non-rigid deformations or articulations between instances of a class. We present a method for automatically finding such correspondences that deals with significant variations in pose, shape and resolution between pairs of objects.We represent objects as triangular meshes and consider normalized geodesic distances as representing their intrinsic characteristics. Geodesic distances are invariant to pose variations and nearly invariant to shape variations when properly normalized. The proposed method registers two objects by optimizing a joint probabilistic model over a subset of vertex pairs between the objects. The model enforces preservation of geodesic distances between corresponding vertex pairs and inference is performed using loopy belief propagation in a hierarchical scheme. Additionally our method prefers solutions in which local shape information is consistent at matching vertices. We quantitatively evaluate our method and show that is is more accurate than a state of the art method.

ps

pdf talk Project Page [BibTex]

pdf talk Project Page [BibTex]


Thumb xl srf2011 2
Steerable random fields for image restoration and inpainting

Roth, S., Black, M. J.

In Markov Random Fields for Vision and Image Processing, pages: 377-387, (Editors: Blake, A. and Kohli, P. and Rother, C.), MIT Press, 2011 (incollection)

Abstract
This chapter introduces the concept of a Steerable Random Field (SRF). In contrast to traditional Markov random field (MRF) models in low-level vision, the random field potentials of a SRF are defined in terms of filter responses that are steered to the local image structure. This steering uses the structure tensor to obtain derivative responses that are either aligned with, or orthogonal to, the predominant local image structure. Analysis of the statistics of these steered filter responses in natural images leads to the model proposed here. Clique potentials are defined over steered filter responses using a Gaussian scale mixture model and are learned from training data. The SRF model connects random fields with anisotropic regularization and provides a statistical motivation for the latter. Steering the random field to the local image structure improves image denoising and inpainting performance compared with traditional pairwise MRFs.

ps

publisher site [BibTex]

publisher site [BibTex]