Empirische Inferenz · Conference Paper · 2000

ν-Arc: Ensemble Learning in the Presence of Outliers


AdaBoost and other ensemble methods have successfully been applied to a number of classification tasks, seemingly defying problems of overfitting. AdaBoost performs gradient descent in an error function with respect to the margin, asymptotically concentrating on the patterns which are hardest to learn. For very noisy problems, however, this can be disadvantageous. Indeed, theoretical analysis has shown that the margin distribution, as opposed to just the minimal margin, plays a crucial role in understanding this phenomenon. Loosely speaking, some outliers should be tolerated if this has the benefit of substantially increasing the margin on the remaining points. We propose a new boosting algorithm which allows for the possibility of a pre-specified fraction of points to lie in the margin area or even on the wrong side of the decision boundary.
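The soft-margin idea described in the abstract can be illustrated with a small sketch: an AdaBoost-style loop over decision stumps in which the fraction nu of training points with the smallest margins is granted slack, so those points stop attracting ever-larger weights. This is a hypothetical illustration of the idea only, not the ν-Arc algorithm from the paper; the function names (nu_boost, train_stump) and the particular slack rule are assumptions made for the example.

import numpy as np

def train_stump(X, y, w):
    # Weighted decision stump: pick the feature/threshold/sign with least weighted error.
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= t, 1, -1)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best  # (error, feature, threshold, sign)

def stump_predict(X, stump):
    _, j, t, sign = stump
    return sign * np.where(X[:, j] <= t, 1, -1)

def nu_boost(X, y, nu=0.1, rounds=50):
    # Soft-margin boosting sketch: the nu-quantile of the margins defines slack.
    n = len(y)
    w = np.full(n, 1.0 / n)
    stumps, alphas = [], []
    F = np.zeros(n)                      # unnormalised ensemble output
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)
        if err >= 0.5:
            break
        alpha = 0.5 * np.log((1 - err) / err)
        F += alpha * stump_predict(X, stump)
        stumps.append(stump)
        alphas.append(alpha)
        # Normalised margins of all points under the current ensemble.
        margins = y * F / np.sum(alphas)
        # Let the nu-fraction of points with the smallest margins lie in the
        # margin area: their margin is padded up to the nu-quantile, so they
        # receive no extra weight in later rounds (the "tolerated outliers").
        rho = np.quantile(margins, nu)
        slack = np.maximum(rho - margins, 0.0)
        w = np.exp(-(margins + slack) * np.sum(alphas))
        w /= w.sum()
    return stumps, alphas

def predict(X, stumps, alphas):
    F = sum(a * stump_predict(X, s) for s, a in zip(stumps, alphas))
    return np.sign(F)

With nu=0.1, roughly ten percent of the training points are permitted to remain inside the margin area (or on the wrong side of the boundary) without dominating the example weights, mirroring the trade-off between tolerating outliers and enlarging the margin on the remaining points that the abstract describes.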

Author(s): Rätsch, G. and Schölkopf, B. and Smola, AJ. and Müller, K-R. and Onoda, T. and Mika, S.
Book Title: Advances in Neural Information Processing Systems 12
Journal: Advances in Neural Information Processing Systems
Pages: 561-567
Year: 2000
Month: June
Editors: S. A. Solla, T. K. Leen and K.-R. Müller
Publisher: MIT Press
Bibtex Type: Conference Paper (inproceedings)
Address: Cambridge, MA, USA
Event Name: 13th Annual Neural Information Processing Systems Conference (NIPS 1999)
Event Place: Denver, CO, USA
ISBN: 0-262-11245-0
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@inproceedings{818,
  title = {$\nu$-Arc: Ensemble Learning in the Presence of Outliers},
  journal = {Advances in Neural Information Processing Systems},
  booktitle = {Advances in Neural Information Processing Systems 12},
  abstract = {AdaBoost and other ensemble methods have successfully been applied to a number of classification tasks, seemingly defying problems of overfitting. AdaBoost performs gradient descent in an error function with respect to the margin, asymptotically concentrating on the patterns which are hardest to learn. For very noisy problems, however, this can be disadvantageous. Indeed, theoretical analysis has shown that the margin distribution, as opposed to just the minimal margin, plays a crucial role in understanding this phenomenon. Loosely speaking, some outliers should be tolerated if this has the benefit of substantially increasing the margin on the remaining points. We propose a new boosting algorithm which allows for the possibility of a pre-specified fraction of points to lie in the margin area or even on the wrong side of the decision boundary. },
  pages = {561-567},
  editors = {SA Solla and TK Leen and K-R M{\"u}ller},
  publisher = {MIT Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Cambridge, MA, USA},
  month = jun,
  year = {2000},
  slug = {818},
  author = {R{\"a}tsch, G. and Sch{\"o}lkopf, B. and Smola, AJ. and M{\"u}ller, K-R. and Onoda, T. and Mika, S.},
  month_numeric = {6}
}