Graphs are used to model interactions in a variety of contexts, and there is a growing need to quickly assess the structure of a graph. Some of the most useful graph metrics, especially those measuring social cohesion, are based on triangles. Despite the importance of these triadic measures, associated algorithms can be extremely expensive. We discuss the method of wedge sampling. This versatile technique allows for the fast and accurate approximation of all current variants of clustering coefficients and enables rapid uniform sampling of the triangles of a graph. Our methods come with provable and practical time-approximation tradeoffs for all computations. We provide extensive results that show our methods are orders of magnitude faster than the state-of-the-art, while providing nearly the accuracy of full enumeration. Our results will enable more wide-scale adoption of triadic measures for analysis of extremely large graphs, as demonstrated on several real-world examples.
triangle counting, directed triangle counting, clustering coefficient, Hoeffding’s inequality
Winner of SDM13 Best Paper Prize!
@inproceedings{SePiKo13a,
author = {C. Seshadhri and Ali Pinar and Tamara G. Kolda},
title = {Triadic Measures on Graphs: The Power of Wedge Sampling},
booktitle = {SDM13: Proceedings of the 2013 SIAM International Conference on Data Mining},
venue = {Austin, TX},
eventdate = {2013-05-02/2013-05-04},
pages = {10--18},
year = {2013},
doi = {10.1137/1.9781611972832.2},
}