Perceiving Systems Software Workshop Article 2015

Scalable Robust Principal Component Analysis using Grassmann Averages

preprint pdf from publisher supplemental
Thumb ticker sm hauberg face2
Perceiving Systems
Post doc. at the Section for Cognitive Systems at the Technical University of Denmark.
Thumb ticker sm re  1
Software Workshop
Senior Research Engineer @ Software Workshop
Thumb ticker sm headshot2021
Perceiving Systems
Director
Grassmanteaser

In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.

Author(s): Hauberg, S. and Feragen, A. and Enficiaud, R. and Black, M.J.
Links:
Journal: IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI)
Year: 2015
Month: December
Project(s):
Bibtex Type: Article (article)
Electronic Archiving: grant_archive

BibTex

@article{Hauberg:PAMI:2015,
  title = {Scalable Robust Principal Component Analysis using {Grassmann} Averages},
  journal = {IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI)},
  abstract = {In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.},
  month = dec,
  year = {2015},
  slug = {hauberg-pami-2015},
  author = {Hauberg, S. and Feragen, A. and Enficiaud, R. and Black, M.J.},
  month_numeric = {12}
}