Scalable Propagation Algorithms for Node Labeling in Bipartite Graphs
Bipartite graphs are graphs whose vertices can be partitioned into two disjoint independent sets. Many interesting machine learning problems either present themselves naturally as node labeling problems in bipartite graphs (such as user modeling in social networks) or can be converted into this form (such as document classification). In both cases, the nodes of the graph represent different entities with specific labels. Some of the labels are known a priori, while others have to be learned. Label propagation (LPA) is an approach in which labeled nodes iteratively propagate their labels to their neighbors in order to assign labels to unlabeled nodes. While LPA is fast for node labeling, it does not distinguish between different kinds of nodes in the graph. Our proposed algorithm distinguishes between the two kinds of nodes in a bipartite graph and adapts itself automatically to the underlying characteristics of the network without prior knowledge. Because the datasets to which label propagation algorithms are applied are typically large, we pay attention to ensuring and demonstrating the scalability of the developed node labeling algorithms. To allow our proposed techniques to process graphs with millions of nodes and edges efficiently, we designed implementations in Grappa, a distributed shared memory system. We performed experiments to measure the scalability and the accuracy of our node labeling algorithms on Facebook datasets of different sizes.
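
For orientation, the baseline label propagation step described above can be sketched as follows. This is a minimal single-machine illustration of generic majority-vote LPA with clamped seed labels, not the algorithm proposed in this work and not its Grappa implementation. The function name `label_propagation`, the toy user/page graph, and the tie-breaking rule are assumptions made for the example.

```python
from collections import Counter


def label_propagation(adjacency, seed_labels, max_iters=10):
    """Generic synchronous label propagation (illustrative baseline only).

    adjacency:   dict mapping each node to a list of its neighbors
    seed_labels: dict mapping a priori labeled nodes to their labels
    """
    labels = dict(seed_labels)
    for _ in range(max_iters):
        updates = {}
        for node, neighbors in adjacency.items():
            if node in seed_labels:
                continue  # labels known a priori stay fixed
            # Count the labels currently held by this node's neighbors.
            votes = Counter(labels[n] for n in neighbors if n in labels)
            if votes:
                best = max(votes.values())
                # Assumption: break ties by taking the smallest label.
                updates[node] = min(l for l, c in votes.items() if c == best)
        # Stop once a full pass changes nothing.
        if all(labels.get(n) == l for n, l in updates.items()):
            break
        labels.update(updates)
    return labels


# Hypothetical bipartite graph: users (u*) on one side, pages (p*) on the other.
adjacency = {
    "u1": ["p1"], "u2": ["p1"], "u3": ["p2"], "u4": ["p2"],
    "p1": ["u1", "u2"], "p2": ["u3", "u4"],
}
seeds = {"u1": "A", "u3": "B"}
result = label_propagation(adjacency, seeds)
```

Note that this baseline treats user nodes and page nodes identically, which is the limitation of plain LPA that the abstract points out.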