Seed-centric Approach to Detect Overlapping Communities in Networks
Abstract
Community Detection is a trivial task in Network Analysis and Data Mining. A widely accepted definition of a Community or a Cluster is the group of vertices that are densely connected within and sparsely connected to the other communities. Uncovering the community structure of the network provides useful insights about the structural and functional characteristics of the network of any kind, in consideration. Community Detection is a simpler and computationally less-challenged task, if communities are assumed to be disjoint and non-overlapping. On the contrary, most of the real-world networks exhibit the overlapping community structure rather than the assumed disjoint community structure. Hence the problem of Disjoint Community Detection is now being perceived as Overlapping-Community Detection.Several algorithms have been developed to detect Overlapping Communities in a network. The state-of-the-art is rich in the way it offers different classes of algorithms, yet there is a lot of scope for further exploration of the problem and devising different algorithms based on different metrics. This Thesis is inspired by this idea and we propose a new technique to determine the initial set of nodes, called the seed nodes, to detect the overlapping communities in a network. We further devise a multi-stage algorithm, which starts with detecting the seed nodes in the initial stage and converges with the detection of overlapping communities of the network, in the last stage.Our approach is unique in its choice of the graph/network metric, which is taken into consideration to detect the seed nodes of the network. Our contributions are two-fold. The first being, the ability to determine highly-important nodes of the network based on the Betweenness Centrality metric of node. The second being, detecting densely overlapping communities, thereby optimizing the density measure of a community. Experiments show that our algorithm is successful in scaling to large real-world networks and is efficient in detection of overlapping communities. The algorithm is compared to other state-of-the- art algorithms and its performance is evaluated.
Collections
- OSU Theses [15752]