Machine Learning approaches to protein ranking: discriminative, semi-supervised, scalable algorithms
A key tool in protein function discovery is the ability to rank databases of proteins given a query amino acid sequence. The most successful method so far is a web-based tool called PSI-BLAST which uses heuristic alignment of a profile built using the large unlabeled database. It has been shown that such use of global information via unlabeled data improves over a local measure derived from a basic pairwise alignment such as performed by PSI-BLAST's predecessor, BLAST. In this article we look at ways of leveraging techniques from the field of machine learning for the problem of ranking. We show how clustering and semi-supervised learning techniques, which aim to capture global structure in data, can significantly improve over PSI-BLAST.
| Author(s): | Weston, J. and Leslie, C. and Elisseeff, A. and Noble, WS. |
| Number (issue): | 111 |
| Year: | 2003 |
| Month: | June |
| BibTeX Type: | Technical Report (techreport) |
| Electronic Archiving: | grant_archive |
| Institution: | Max Planck Institute for Biological Cybernetics, Tübingen, Germany |
BibTeX
@techreport{2300,
title = {Machine Learning approaches to protein ranking: discriminative, semi-supervised, scalable algorithms},
abstract = {A key tool in protein function discovery is the ability to rank databases of proteins given a query amino acid sequence. The most successful method so far is a web-based tool called PSI-BLAST which uses heuristic alignment of a profile built using the large unlabeled database. It has been shown that such use of global information via unlabeled data improves over a local measure derived from a basic pairwise alignment such as performed by PSI-BLAST's predecessor, BLAST. In this article we look at ways of leveraging techniques from the field of machine learning for the problem of ranking. We show how clustering and semi-supervised learning techniques, which aim to capture global structure in data, can significantly improve over PSI-BLAST.},
number = {111},
institution = {Max Planck Institute for Biological Cybernetics, T{\"u}bingen, Germany},
month = jun,
year = {2003},
author = {Weston, J. and Leslie, C. and Elisseeff, A. and Noble, WS.},
month_numeric = {6}
}
