Manifold Denoising as Preprocessing for Finding Natural Representations of Data
A natural representation of data are the parameters which generated the data. If the parameter space is continuous we can regard it as a manifold. In practice we usually do not know this manifold but we just have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors this data is usually corrupted by noise which particularly in high-dimensional feature spaces makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We will demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover we will show that using the method as a preprocessing step one can significantly improve the results of a semi-supervised learning algorithm.
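The abstract names the ingredients (a graph built from the data and a diffusion process on it) but not the details. The following is a minimal sketch of that general idea, not the authors' algorithm. It assumes an unweighted symmetric k-nearest-neighbour graph, the random-walk graph Laplacian, and implicit Euler time steps, with the graph rebuilt after each step. The function name and the parameters `k`, `step`, and `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np

def denoise_by_graph_diffusion(X, k=10, step=0.5, n_iter=5):
    """Smooth the rows of X (n points, d features) by diffusion on a kNN graph.

    Assumptions of this sketch (not taken from the paper): unweighted
    symmetric kNN graph, random-walk Laplacian, implicit Euler steps.
    Requires k < n.
    """
    X = np.asarray(X, dtype=float).copy()
    n = X.shape[0]
    rows = np.repeat(np.arange(n), k)
    for _ in range(n_iter):
        # Pairwise squared Euclidean distances between current points.
        sq = np.sum(X ** 2, axis=1)
        D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
        np.fill_diagonal(D2, np.inf)  # a point is not its own neighbour

        # Indices of the k nearest neighbours of every point.
        idx = np.argsort(D2, axis=1)[:, :k]

        # Symmetric 0/1 adjacency: connect i and j if either is a kNN of the other.
        W = np.zeros((n, n))
        W[rows, idx.ravel()] = 1.0
        W = np.maximum(W, W.T)

        # Random-walk graph Laplacian L = I - D^{-1} W.
        deg = W.sum(axis=1)
        L = np.eye(n) - W / deg[:, None]

        # Implicit Euler step of dX/dt = -L X: solve (I + step * L) X_new = X.
        X = np.linalg.solve(np.eye(n) + step * L, X)
    return X
```

Each step replaces every point by a weighted average of the points in its graph neighbourhood, which damps noise that varies quickly between neighbours; the denoised array could then be passed to a downstream learner in place of the raw features.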
Author(s): | Hein, M. and Maier, M. |
Book Title: | AAAI-07 |
Journal: | Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07) |
Pages: | 1646-1649 |
Year: | 2007 |
Month: | July |
Publisher: | AAAI Press |
Bibtex Type: | Conference Paper (inproceedings) |
Address: | Menlo Park, CA, USA |
Event Name: | Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07) |
Event Place: | Vancouver, BC, Canada |
Institution: | Association for the Advancement of Artificial Intelligence |
Language: | en |
Organization: | Max-Planck-Gesellschaft |
School: | Biologische Kybernetik |
BibTeX
@inproceedings{4588,
  title = {Manifold Denoising as Preprocessing for Finding Natural Representations of Data},
  journal = {Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)},
  booktitle = {AAAI-07},
  abstract = {A natural representation of data are the parameters which generated the data. If the parameter space is continuous we can regard it as a manifold. In practice we usually do not know this manifold but we just have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors this data is usually corrupted by noise which particularly in high-dimensional feature spaces makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We will demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover we will show that using the method as a preprocessing step one can significantly improve the results of a semi-supervised learning algorithm.},
  pages = {1646-1649},
  publisher = {AAAI Press},
  organization = {Max-Planck-Gesellschaft},
  institution = {Association for the Advancement of Artificial Intelligence},
  school = {Biologische Kybernetik},
  address = {Menlo Park, CA, USA},
  month = jul,
  year = {2007},
  slug = {4588},
  author = {Hein, M. and Maier, M.},
  month_numeric = {7}
}