An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models
We consider the task of tuning hyperparameters in SVM models based on minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be very efficiently done; often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.
| Author(s): | Keerthi, S. S. and Sindhwani, V. and Chapelle, O. |
| Book Title: | Advances in Neural Information Processing Systems 19 |
| Journal: | Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference |
| Pages: | 673-680 |
| Year: | 2007 |
| Month: | September |
| Editors: | Schölkopf, B., Platt, J., Hofmann, T. |
| Publisher: | MIT Press |
| BibTeX Type: | Conference Paper (inproceedings) |
| Address: | Cambridge, MA, USA |
| Event Name: | Twentieth Annual Conference on Neural Information Processing Systems (NIPS 2006) |
| Event Place: | Vancouver, BC, Canada |
| Digital: | 0 |
| Electronic Archiving: | grant_archive |
| ISBN: | 0-262-19568-2 |
| Language: | en |
| Organization: | Max-Planck-Gesellschaft |
| School: | Biologische Kybernetik |
BibTeX
@inproceedings{5371,
title = {An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models},
journal = {Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference},
booktitle = {Advances in Neural Information Processing Systems 19},
abstract = {We consider the task of tuning hyperparameters in SVM models based on minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error,
using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be very efficiently done; often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.},
pages = {673-680},
editor = {Sch{\"o}lkopf, B. and Platt, J. and Hofmann, T.},
publisher = {MIT Press},
organization = {Max-Planck-Gesellschaft},
school = {Biologische Kybernetik},
address = {Cambridge, MA, USA},
month = sep,
year = {2007},
author = {Keerthi, S. S. and Sindhwani, V. and Chapelle, O.},
month_numeric = {9}
}
