Perceiving Systems – Max Planck Institute for Intelligent Systems

We provide implementations of the Grassmann Average, the Trimmed Grassmann Average, and the Grassmann Median. The simplest is the Matlab implementation used in the CVPR 2014 paper, but we also provide a faster C++ implementation, which can be used either directly from C++ or through a Matlab wrapper interface. The code is available for download below. Any feedback is much appreciated and can be sent to Søren Hauberg.

C++ Implementation

A highly parallel C++ implementation of the Grassmann averages is available at the Tübingen MPI-IS GitHub. This includes both a C++ library and a Matlab interface. This is the recommended implementation of the algorithms.

Matlab Implementation

The simplest implementation is written in Matlab, with only the trimmed averages implemented in C++. This implementation was used for the CVPR 2014 paper. With the release of the pure C++ implementation, this code is mostly useful for understanding the basic algorithms.
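To give a feel for the basic algorithm, the following is a minimal sketch of a Grassmann Average iteration for a single component. It is not the released code: it assumes zero-mean data with observations stored as rows, weights each one-dimensional subspace by the norm of its observation, and uses illustrative variable names and toy data chosen only for this example.

```matlab
% Sketch of a Grassmann Average iteration for ONE component.
% Assumptions (not taken from the released code): X is N-by-D, zero-mean,
% one observation per row; all names and the toy data are illustrative.
X = randn(500, 2) * [3 0; 0 1];   % toy Gaussian data, dominant axis near [1 0]

q = randn(size(X, 2), 1);         % random initial direction
q = q / norm(q);
for iter = 1:100
    s = sign(X * q);              % orient each observation towards q
    s(s == 0) = 1;                % treat orthogonal observations as positive
    q_new = X.' * s;              % sum of the sign-aligned observations
    q_new = q_new / norm(q_new);  % renormalise to unit length
    converged = norm(q_new - q) < 1e-10;
    q = q_new;
    if converged, break; end
end

% For Gaussian data the estimate should be close to the leading
% principal component, up to sign.
[V, E] = eig(cov(X));
[~, idx] = max(diag(E));
alignment = abs(V(:, idx).' * q); % close to 1 when the directions agree
```

Because each step only needs a matrix-vector product and a sum, the cost per iteration grows linearly with the number of observations, which is the source of the scalability described in the abstracts below.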

Example Code

As a simple example, we will first generate samples from a zero-mean Gaussian distribution with a random covariance matrix, and then estimate its leading component.

  D = 2; % we consider a two-dimensional problem
  N = 100; % we will generate 100 observations

  %% Generate a random Covariance matrix
  tmp = randn(D);
  Sigma = tmp.' * tmp;

  %% Sample from the corresponding Gaussian (mvnrnd needs the Statistics and Machine Learning Toolbox)
  X = mvnrnd(zeros(D, 1), Sigma, N);

  %% Estimate the leading component
  comp = grassmann_average(X, 1); % the second input is the number of components to estimate

  %% Plot the results
  plot(X(:, 1), X(:, 2), 'ko', 'markerfacecolor', [255,153,51]./255);
  axis equal
  hold on
  plot(3*[-comp(1), comp(1)], 3*[-comp(2), comp(2)], 'k', 'linewidth', 2)
  hold off
  axis off

This produces a scatter plot of the samples, with the estimated leading component drawn as a line through the origin.

Download

The initial version of the Matlab code is now available: grassmann_averages-0.3.zip. For questions or comments please contact Søren Hauberg.

Members

Perceiving Systems

Søren Hauberg

Post doc. at the Section for Cognitive Systems at the Technical University of Denmark.

Perceiving Systems

Michael Black

Emeritus / Acting Director

Software Workshop

Raffi Enficiaud

Senior Research Engineer @ Software Workshop

Publications

Scalable Robust Principal Component Analysis using Grassmann Averages. Hauberg, S., Feragen, A., Enficiaud, R., Black, M. IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), December 2015. (Article; Perceiving Systems, Software Workshop)

Abstract

In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.

Grassmann Averages for Scalable Robust PCA. Hauberg, S., Feragen, A., Black, M. J. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 3810-3817, Columbus, Ohio, USA, June 2014. (Conference Paper; Perceiving Systems)

Abstract

As the collection of large datasets becomes increasingly automated, the occurrence of outliers will increase – "big data" implies "big outliers". While principal component analysis (PCA) is often used to reduce the size of data, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA do not scale beyond small-to-medium sized datasets. To address this, we introduce the Grassmann Average (GA), which expresses dimensionality reduction as an average of the subspaces spanned by the data. Because averages can be efficiently computed, we immediately gain scalability. GA is inherently more robust than PCA, but we show that they coincide for Gaussian data. We exploit that averages can be made robust to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. Robustness can be with respect to vectors (subspaces) or elements of vectors; we focus on the latter and use a trimmed average. The resulting Trimmed Grassmann Average (TGA) is particularly appropriate for computer vision because it is robust to pixel outliers. The algorithm has low computational complexity and minimal memory requirements, making it scalable to "big noisy data." We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie.
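The element-wise trimmed average mentioned in this abstract can be illustrated with a small sketch. It assumes that the same fraction of values is discarded at both ends of each coordinate; the trimming fraction, variable names, and data are hypothetical and chosen only for illustration, and this is not the released implementation.

```matlab
% Element-wise trimmed mean over the rows of a matrix.
% Assumption: the same fraction p is trimmed from both ends of each column.
Y = [1 10; 2 11; 3 12; 4 13; 100 -50];  % the last row contains outliers
p = 0.2;                                % fraction trimmed at each end
N = size(Y, 1);
k = floor(p * N);                       % number of values dropped per end
Ys = sort(Y, 1);                        % sort each column independently
trimmed = mean(Ys(k+1:N-k, :), 1);      % average of the remaining values
plain = mean(Y, 1);                     % ordinary mean, for comparison
```

Here the ordinary mean is pulled far away by the outlying row, while the trimmed mean stays at [3 11], the centre of the remaining values in each column.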

A framework for robust subspace learning. De la Torre, F., Black, M. J. International Journal of Computer Vision, 54(1-3):117-142, August 2003. (Article; Perceiving Systems)

Abstract

Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications. Methods for learning linear models can be seen as a special case of subspace fitting. One draw-back of previous learning methods is that they are based on least squares estimation techniques and hence fail to account for “outliers” which are common in realistic training sets. We review previous approaches for making linear learning methods robust to outliers and present a new method that uses an intra-sample outlier process to account for pixel outliers. We develop the theory of Robust Subspace Learning (RSL) for linear models within a continuous optimization framework based on robust M-estimation. The framework applies to a variety of linear learning problems in computer vision including eigen-analysis and structure from motion. Several synthetic and natural examples are used to develop and illustrate the theory and applications of robust subspace learning in computer vision.

Robust principal component analysis for computer vision. De la Torre, F., Black, M. J. In Int. Conf. on Computer Vision, ICCV-2001, II:362-369, Vancouver, BC, USA, 2001. (Conference Paper; Perceiving Systems)

Abstract

Principal Component Analysis (PCA) has been widely used for the representation of shape, appearance, and motion. One drawback of typical PCA methods is that they are least squares estimation techniques and hence fail to account for “outliers” which are common in realistic training sets. In computer vision applications, outliers typically occur within a sample (image) due to pixels that are corrupted by noise, alignment errors, or occlusion. We review previous approaches for making PCA robust to outliers and present a new method that uses an intra-sample outlier process to account for pixel outliers. We develop the theory of Robust Principal Component Analysis (RPCA) and describe a robust M-estimation algorithm for learning linear multivariate representations of high dimensional data such as images. Quantitative comparisons with traditional PCA and previous robust algorithms illustrate the benefits of RPCA when outliers are present. Details of the algorithm are described and a software implementation is being made publicly available.
