Metropolis Algorithms for Representative Subgraph Sampling

Empirical Inference Conference Paper 2008

Karsten Borgwardt

While data mining in chemoinformatics studied graph data with dozens of nodes, systems biology and the Internet are now generating graph data with thousands and millions of nodes. Hence data mining faces the algorithmic challenge of coping with this significant increase in graph size: Classic algorithms for data analysis are often too expensive and too slow on large graphs. While one strategy to overcome this problem is to design novel efficient algorithms, the other is to 'reduce' the size of the large graph by sampling. This is the scope of this paper: We will present novel Metropolis algorithms for sampling a 'representative' small subgraph from the original large graph, with 'representative' describing the requirement that the sample shall preserve crucial graph properties of the original graph. In our experiments, we improve over the pioneering work of Leskovec and Faloutsos (KDD 2006), by producing representative subgraph samples that are both smaller and of higher quality than those produced by other methods from the literature.
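The abstract does not spell out the sampling procedure, so the following is only a minimal, generic sketch of how a Metropolis-style search for a representative subgraph might look. It is not the authors' algorithm. The function names, the choice of degree distribution as the graph property to preserve, the L1 distance used as the quality measure, and the `temperature` and `steps` parameters are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def degree_histogram(adj, nodes):
    """Normalized degree distribution of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    counts = Counter(sum(1 for v in adj[u] if v in nodes) for u in nodes)
    return {d: c / len(nodes) for d, c in counts.items()}


def distance(p, q):
    """L1 distance between two degree distributions (assumed quality measure)."""
    return sum(abs(p.get(d, 0.0) - q.get(d, 0.0)) for d in set(p) | set(q))


def metropolis_subgraph(adj, k, steps=2000, temperature=0.05, seed=0):
    """Search for k nodes whose induced subgraph resembles the full graph.

    `adj` maps each node to the set of its neighbours; requires k < len(adj).
    Returns the best node set found and its distance to the original graph.
    """
    rng = random.Random(seed)
    all_nodes = sorted(adj)
    target = degree_histogram(adj, all_nodes)

    current = set(rng.sample(all_nodes, k))
    cur_cost = distance(degree_histogram(adj, current), target)
    best, best_cost = set(current), cur_cost

    for _ in range(steps):
        # Symmetric proposal: swap one sampled node for one unsampled node.
        out_node = rng.choice(sorted(current))
        in_node = rng.choice([v for v in all_nodes if v not in current])
        proposal = (current - {out_node}) | {in_node}
        new_cost = distance(degree_histogram(adj, proposal), target)

        # Metropolis rule: always accept an improvement; accept a worse
        # proposal with probability exp(-(increase in cost) / temperature).
        if new_cost <= cur_cost or rng.random() < math.exp(
            (cur_cost - new_cost) / temperature
        ):
            current, cur_cost = proposal, new_cost
            if cur_cost < best_cost:
                best, best_cost = set(current), cur_cost

    return best, best_cost
```

A call such as `metropolis_subgraph(adj, k=10)` on an adjacency dictionary returns a set of 10 nodes together with its distance to the full graph. Comparing raw degree distributions between a small sample and a large graph is a simplification; a different property or a rescaled measure could be substituted in `distance` without changing the acceptance rule.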

Author(s):	Hübler, C. and Kriegel, H-P. and Borgwardt, K. and Ghahramani, Z.
Pages:	283-292
Year:	2008
Month:	December
Editors:	Giannotti, F.
Publisher:	IEEE

Bibtex Type:	Conference Paper (inproceedings)

Address:	Piscataway, NJ, USA
DOI:	10.1109/ICDM.2008.124
Event Name:	Eighth IEEE International Conference on Data Mining (ICDM ’08)
Event Place:	Pisa, Italy

ISBN:	978-0-7695-3502-9

BibTeX

@inproceedings{HublerKBG2008,
  title = {Metropolis Algorithms for Representative Subgraph Sampling},
  abstract = {While data mining in chemoinformatics studied graph data with dozens of nodes, systems biology and the Internet are now generating graph data with thousands and millions of nodes. Hence data mining faces the algorithmic challenge of coping with this significant increase in graph size: Classic algorithms for data analysis are often too expensive and too slow on large graphs. While one strategy to overcome this problem is to design novel efficient algorithms, the other is to 'reduce' the size of the large graph by sampling. This is the scope of this paper: We will present novel Metropolis algorithms for sampling a 'representative' small subgraph from the original large graph, with 'representative' describing the requirement that the sample shall preserve crucial graph properties of the original graph. In our experiments, we improve over the pioneering work of Leskovec and Faloutsos (KDD 2006), by producing representative subgraph samples that are both smaller and of higher quality than those produced by other methods from the literature.},
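  pages = {283--292},
  editor = {Giannotti, F.},
  booktitle = {Eighth IEEE International Conference on Data Mining (ICDM '08)},
  doi = {10.1109/ICDM.2008.124},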
  publisher = {IEEE},
  address = {Piscataway, NJ, USA},
  month = dec,
  year = {2008},
  slug = {hublerkbg2008},
  author = {H{\"u}bler, C. and Kriegel, H-P. and Borgwardt, K. and Ghahramani, Z.},
  month_numeric = {12}
}
