Discriminative frequent subgraph mining with optimality guarantees
WebThe goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near-optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining.
| Author(s): | Thoma, M. and Cheng, H. and Gretton, A. and Han, J. and Kriegel, H-P. and Smola, AJ. and Song, L. and Yu, PS. and Yan, X. and Borgwardt, KM. |
| Links: | |
| Journal: | Journal of Statistical Analysis and Data Mining |
| Volume: | 3 |
| Number (issue): | 5 |
| Pages: | 302–318 |
| Year: | 2010 |
| Month: | October |
| Day: | 0 |
| BibTeX Type: | Article (article) |
| DOI: | 10.1002/sam.10084 |
| Electronic Archiving: | grant_archive |
BibTeX
@article{ThomaCGHKSSYYB2010,
title = {Discriminative frequent subgraph mining with optimality guarantees},
journal = {Journal of Statistical Analysis and Data Mining},
abstract = {The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near-optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining.},
volume = {3},
number = {5},
pages = {302–318},
month = oct,
year = {2010},
author = {Thoma, M. and Cheng, H. and Gretton, A. and Han, J. and Kriegel, H-P. and Smola, AJ. and Song, L. and Yu, PS. and Yan, X. and Borgwardt, KM.},
doi = {10.1002/sam.10084},
month_numeric = {10}
}
