December 5th, 2023
Community detection is one of most prominent tasks in the analysis of complex networks such as social networks, biological networks, and the world wide web. A community is loosely defined as a group of nodes that are more densely connected to each other than to the rest of the network. These could be, for instance, interest groups, or websites on similar topics. Community detection has been receiving massive attention in networks science. The most common approach to the problem is `modularity maximization’, proposed by Girvan and Newman in 2004. In this work we present a new framework that rephrases the task of community detection as a nearest neighbor search on a high-dimensional hypersphere. This yields a new class of community detection methods, which we call the `projection methods’. A projection method consists of two steps: 1) The networks is mapped to a query vector on a high-dimensional hypersphere. 2) The query vector is projected to the set of clusterings, where each clustering, too, is a vector on the hypersphere; this projection step is done by minimizing the distance between the query vector and the clustering, over the set of all clusterings. We prove that projection methods generalize many popular community detection methods, including modularity maximization. This has many practical implications. For example, modularity maximization and many other methods suffer from the same `granularity problem’: they have some parameter that controls granularity of the obtained clustering, but no method to predict the obtained granularity from the value of this parameter. In projection methods, the granularity of the obtained clustering is close to that of the query vector. With this insight, we provide a general heuristic for choosing a query vector to address the granularity problem. Another advantage of the projection methods is that a query vector may explicitly depend on various network characteristics (edges, wedges, triangles), thus seamlessly including these in community detection.
Bio: Nelly Litvak received her MSc from the Lobachevsky University of Nizhny Novgorod, Russia, in 1995 and her PhD from Eindhoven University of Technology in 2002. She joined the University of Twente as assistant professor in 2002, where she became full professor in 2018. From September 2023, she is a full professor at Eindhoven University of Technology. Nelly’s background is in applied probability and stochastic operations research. Currently her main research interest is in random graphs, complex networks, and algorithms for complex networks, such as social networks, and the world wide web. Applications include web ranking, community detection, prediction of changes in the world wide web, and prediction of spreading of infections such as COVID-19. The main goal of this research is to explain structures and processes in real-world networks, using the mathematical theory of stochastic processes, random graphs, randomized algorithms, and machine learning. She is a recipient of several grants and awards including Stieltjes Prize for the best PhD thesis in Mathematics, and the Google Faculty Research Award. Nelly serves or has served on the editorial boards of Internet Mathematics, Queueing Systems, Stochastic Processes and their Applications and Frontiers in Complex Systems. Besides her research, Nelly is passionate about novel education, and popularization of mathematics. She wrote several non-fiction books for general public about mathematics and education.