Distance-based classification with Lipschitz functions

Empirical Inference

Olivier Bousquet

Statistical Learning Theory

Ulrike von Luxburg

Professor, University of Tübingen
Max Planck Fellow

The goal of this article is to develop a framework for large margin classification in metric spaces. We want to find a generalization of linear decision functions for metric spaces and define a corresponding notion of margin such that the decision function separates the training points with a large margin. It turns out that when Lipschitz functions are used as decision functions, the inverse of the Lipschitz constant can be interpreted as the size of the margin. In order to construct a clean mathematical setup, we isometrically embed the given metric space into a Banach space and the space of Lipschitz functions into its dual space. Our approach leads to a general large margin algorithm for classification in metric spaces. To analyze this algorithm, we first prove a representer theorem. It states that there exists a solution which can be expressed as a linear combination of distances to sets of training points. Then we analyze the Rademacher complexity of some Lipschitz function classes. The generality of the Lipschitz approach can be seen from the fact that several well-known algorithms are special cases of the Lipschitz algorithm, among them the support vector machine, the linear programming machine, and the 1-nearest neighbor classifier.
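
To make the 1-nearest neighbor special case concrete, the following is a minimal sketch of a decision function written as a linear combination of distances to the two sets of training points, in the spirit of the representer theorem described above. It is an illustration and not code from the paper: the function names, the Euclidean metric, the toy data, and the normalization by the smallest inter-class distance (chosen so that every training point satisfies |f(x)| >= 1) are assumptions.

```python
import math


def dist_to_set(x, points, metric):
    """Distance from x to the closest point of a finite set."""
    return min(metric(x, p) for p in points)


def lipschitz_decision(x, positives, negatives, metric):
    """Decision value built from distances to the two training classes.

    The sign matches the 1-nearest neighbor label: it is positive exactly
    when x is closer to the positive set than to the negative set.
    """
    # Smallest distance between the two classes (assumed to be > 0).
    separation = min(metric(p, n) for p in positives for n in negatives)
    d_neg = dist_to_set(x, negatives, metric)
    d_pos = dist_to_set(x, positives, metric)
    return (d_neg - d_pos) / separation


def euclidean(a, b):
    # Any metric can be substituted here; Euclidean is only an example.
    return math.dist(a, b)


# Hypothetical toy data in the plane.
positives = [(0.0, 0.0), (1.0, 0.0)]
negatives = [(4.0, 0.0), (5.0, 1.0)]

for query in [(2.0, 0.0), (3.5, 0.0)]:
    value = lipschitz_decision(query, positives, negatives, euclidean)
    label = 1 if value > 0 else -1
    print(query, round(value, 3), label)
```

Only the metric is used here, so the same sketch applies to any space in which pairwise distances can be computed, such as strings under an edit distance.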

Author(s):	von Luxburg, U. and Bousquet, O.
Links:	PDF PostScript
Book Title:	Learning Theory and Kernel Machines, Proceedings of the 16th Annual Conference on Computational Learning Theory
Pages:	314-328
Year:	2003
Editors:	Schölkopf, B. and Warmuth, M. K.

BibTeX Type:	Conference Paper (inproceedings)

Event Name:	Learning Theory and Kernel Machines, Proceedings of the 16th Annual Conference on Computational Learning Theory

Organization:	Max-Planck-Gesellschaft
School:	Biologische Kybernetik

BibTeX

@inproceedings{2261,
  title = {Distance-based classification with Lipschitz functions},
  booktitle = {Learning Theory and Kernel Machines, Proceedings of the 16th Annual Conference on Computational Learning Theory},
  abstract = {The goal of this article is to develop a framework for large margin
  classification in metric spaces.  We want to find a generalization of
  linear decision functions for metric spaces and define a corresponding
  notion of margin such that the decision function separates the
  training points with a large margin. It turns out that when
  Lipschitz functions are used as decision functions, the inverse of the
  Lipschitz constant can be interpreted as the size of the margin. In order to
  construct a clean mathematical setup we isometrically embed the given
  metric space into a Banach space and the space of Lipschitz functions
  into its dual space.  Our approach leads to a general large margin
  algorithm for classification in metric spaces. To analyze this
  algorithm, we first prove a representer theorem. It states that there
  exists a solution which can be expressed as a linear combination of
  distances to sets of training points. Then we analyze the Rademacher
  complexity of some Lipschitz function classes. The generality of the
  Lipschitz approach can be seen from the fact that several well-known
  algorithms are special cases of the Lipschitz algorithm, among them
  the support vector machine, the linear programming machine, and
  the 1-nearest neighbor classifier.},
  pages = {314--328},
  editor = {Sch{\"o}lkopf, B. and Warmuth, M. K.},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  year = {2003},
  author = {von Luxburg, U. and Bousquet, O.}
}