Physics for Inference and Optimization Article 2020

Sampling on networks: estimating spectral centrality measures and their impact in evaluating other relevant network measures

Caterina De Bacco, Max Planck Research Group Leader, Physics for Inference and Optimization

We perform an extensive analysis of how sampling impacts the estimation of several relevant network measures. In particular, we focus on how a sampling strategy optimized to recover a particular spectral centrality measure impacts other topological quantities. Our goal is, on the one hand, to extend the analysis of the behavior of TCEC [Ruggeri2019], a theoretically grounded sampling method for eigenvector centrality estimation, and, on the other hand, to demonstrate more broadly how sampling can impact the estimation of relevant network properties, such as centrality measures other than the one being optimized, community structure, and node attribute distribution. Finally, we adapt the theoretical framework behind TCEC to the case of PageRank centrality and propose a sampling algorithm aimed at optimizing its estimation. We show that, while the theoretical derivation can be suitably adapted to cover this case, the resulting algorithm suffers from a high computational complexity that requires further approximations compared to the eigenvector centrality case.

Author(s): Ruggeri, Nicolò and De Bacco, Caterina
Journal: Applied Network Science
Volume: 5:81
Year: 2020
Month: October
Bibtex Type: Article (article)
DOI: https://doi.org/10.1007/s41109-020-00324-9
State: Published
Electronic Archiving: grant_archive

BibTeX

@article{tcec_ext,
  title = {Sampling on networks: estimating spectral centrality measures and their impact in evaluating other relevant network measures},
  journal = {Applied Network Science},
  abstract = {We perform an extensive analysis of how sampling impacts the estimation of several relevant network measures. In particular, we focus on how a sampling strategy optimized to recover a particular spectral centrality measure impacts other topological quantities. Our goal is, on the one hand, to extend the analysis of the behavior of TCEC [Ruggeri2019], a theoretically grounded sampling method for eigenvector centrality estimation, and, on the other hand, to demonstrate more broadly how sampling can impact the estimation of relevant network properties, such as centrality measures other than the one being optimized, community structure, and node attribute distribution. Finally, we adapt the theoretical framework behind TCEC to the case of PageRank centrality and propose a sampling algorithm aimed at optimizing its estimation. We show that, while the theoretical derivation can be suitably adapted to cover this case, the resulting algorithm suffers from a high computational complexity that requires further approximations compared to the eigenvector centrality case.},
  volume = {5:81},
  month = oct,
  year = {2020},
  slug = {tcec_ext},
  author = {Ruggeri, Nicolò and De Bacco, Caterina},
  month_numeric = {10}
}