Struct-MMSB: Mixed Membership Stochastic Blockmodels with Interpretable Structured Priors
Introduction
Modeling the intricate interactions within a community is an important network science problem that has gained attention in the last decade. Perhaps the most widely used model for network generation and community detection is the mixed membership stochastic blockmodel (MMSB), owing to its flexibility in modeling different kinds of networks and communities. MMSB models each node's membership in latent groups using a mixed-membership distribution, so that a node can belong to multiple latent groups. These memberships are then used to generate the network structure.
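For concreteness, the standard MMSB generative process (following Airoldi et al.) can be written as follows, where B denotes the group-group interaction matrix:

\pi_p \sim \mathrm{Dirichlet}(\alpha) \quad \text{(membership distribution for node } p\text{)}
z_{p \to q} \sim \mathrm{Multinomial}(\pi_p) \quad \text{(sender group for pair } (p,q)\text{)}
z_{p \gets q} \sim \mathrm{Multinomial}(\pi_q) \quad \text{(receiver group for pair } (p,q)\text{)}
Y(p,q) \sim \mathrm{Bernoulli}\left( z_{p \to q}^{\top} B \, z_{p \gets q} \right)

The Dirichlet prior on the memberships, and the Beta prior typically placed on the entries of B, are the pieces Struct-MMSB replaces with structured priors.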
In this article, we introduce Struct-MMSB, a scalable and general-purpose MMSB obtained by enhancing MMSB with a structured prior based on a recently developed graphical model, hinge-loss Markov random fields (HL-MRFs). Our approach is inspired by latent topic networks (LTN), a general-purpose latent Dirichlet allocation (LDA) model with structured priors. Table 1 compares Struct-MMSB with other popular MMSB variants. Using a probabilistic programming templating language, our model can encode: 1) relational dependencies among membership distributions, 2) additional node features, and 3) multi-relational information, together with their impact on the membership distributions, while remaining interpretable and making custom network relationships easy to specify. Further, our approach can encode meaningful latent variables, learned as a complex combination of observed features, membership distributions, and group-group interaction probabilities, thus catering to many latent-variable modeling scenarios in computational social science.
Next, we present algorithms for inference and learning in Struct-MMSB. We formulate an expectation-maximization (EM) algorithm to infer the expected values of the latent variables, the mixed-membership distributions of nodes over groups, and the group-group interaction probabilities. We then develop a scalable inference method using stochastic EM and show how to incorporate relational dependencies effectively while scaling to large networks. Finally, we present a way to learn the weights of the first-order logical rules by maximizing the likelihood, without requiring additional ground-truth data, thus allowing the model to learn the predictive ability of the logical rules from the data.
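To make the stochastic EM idea concrete, below is a minimal, illustrative Python sketch of a stochastic EM loop for a plain MMSB-style model. It omits the HL-MRF priors and is not the authors' exact algorithm; all function and variable names here are our own.

import numpy as np

def stochastic_em(Y, K=4, n_iters=200, batch_size=256, lr=0.05, seed=0):
    """Toy stochastic EM for a plain MMSB (no structured priors).

    Y: (N, N) binary adjacency matrix; K: number of latent groups.
    Returns membership matrix pi (N, K) and block matrix B (K, K).
    """
    rng = np.random.default_rng(seed)
    N = Y.shape[0]
    pi = rng.dirichlet(np.ones(K), size=N)        # mixed memberships
    B = np.full((K, K), 0.1) + 0.8 * np.eye(K)    # group-interaction probabilities

    for _ in range(n_iters):
        # Sample a minibatch of ordered node pairs (p, q), p != q.
        p = rng.integers(0, N, batch_size)
        q = rng.integers(0, N, batch_size)
        keep = p != q
        p, q = p[keep], q[keep]

        # E-step: responsibility over (sender group g, receiver group h)
        # for each sampled pair, given the current pi and B.
        y = Y[p, q][:, None, None]                   # (b, 1, 1)
        lik = np.where(y == 1, B, 1.0 - B)           # (b, K, K)
        joint = pi[p][:, :, None] * pi[q][:, None, :] * lik
        resp = joint / joint.sum(axis=(1, 2), keepdims=True)

        # M-step: noisy updates of pi (renormalized onto the simplex) and B.
        for i in range(len(p)):
            pi[p[i]] += lr * (resp[i].sum(axis=1) - pi[p[i]])
            pi[q[i]] += lr * (resp[i].sum(axis=0) - pi[q[i]])
        pi = np.clip(pi, 1e-6, None)
        pi /= pi.sum(axis=1, keepdims=True)

        B_hat = (resp * y).sum(axis=0) / (resp.sum(axis=0) + 1e-12)
        B = (1.0 - lr) * B + lr * np.clip(B_hat, 1e-6, 1.0 - 1e-6)

    return pi, B

The key scalability point this sketch illustrates is that each iteration touches only a minibatch of node pairs rather than all O(N^2) pairs.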
We demonstrate the versatility of our model across different network modeling scenarios on data from six real-world networks, and show that on average it achieves a 15% better log-likelihood and better prediction performance than the state-of-the-art MMSB variant Copula-MMSB, as well as IRM and MMSB and their multi-relational and online variants.
Struct-MMSB: Key Idea
Our goal in designing Struct-MMSB is to create a lucid, easy-to-specify, and expressive generative model.
To construct Struct-MMSB, we replace the Dirichlet and Beta priors in the MMSB generative process with HL-MRF potential functions ψ. Figure 1 shows the plate diagram of Struct-MMSB. The HL-MRF priors are indicated by ψπ, which captures dependencies between the membership distributions π of pairs of nodes in the graph, and ψB, which captures dependencies between groups in the group-interaction matrix B.
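As background, in the standard formulation (Bach et al.), an HL-MRF defines a density over continuous variables y ∈ [0, 1]^n given observations x using weighted hinge-loss potentials:

P(y \mid x) = \frac{1}{Z(w, x)} \exp\left( -\sum_{j=1}^{m} w_j \, \psi_j(y, x) \right), \qquad \psi_j(y, x) = \left( \max\{ \ell_j(y, x), \, 0 \} \right)^{p_j}

where each \ell_j is a linear function of y and x, p_j ∈ {1, 2}, and the weights w_j are the rule weights the model learns. Each first-order logical rule in the templating language expands into a set of such hinge-loss potentials.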
Below, we provide some example dependencies that can be encoded using Struct-MMSB.
Correlated Nodes
Latent Characteristics Modeled as Latent Variables
Inter-group and Intra-group Links
Experimental Evaluation
We conduct experiments to answer the questions:
1) How do our models perform in different network modeling scenarios?
2) How informative are the latent variables in our intuitive HL-MRF priors?
Case 1: Feature-based Similarity
In this modeling scenario, we specify HL-MRF priors that use feature commonality between nodes to model their community membership. The priors in Table 3 capture the intuition that if two nodes have many common features, then there is a greater chance of these nodes being grouped together. The first rule in Table 3 means that if node p and node q have the same features (via multiple instantiations of Rule 1), then we infer that node p and node q are similar. Here, similarity(p,q) is a latent variable that models the degree of similarity between two nodes. The second rule captures that if two nodes are similar, then we infer that they have similar membership distributions.
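Since Table 3 is not reproduced here, one plausible rendering of these two rules in PSL-style first-order logic (the predicate names are ours, not necessarily those in the paper) is:

w_1 : \mathrm{feature}(p, f) \wedge \mathrm{feature}(q, f) \rightarrow \mathrm{similarity}(p, q)
w_2 : \mathrm{similarity}(p, q) \wedge \mathrm{membership}(p, k) \rightarrow \mathrm{membership}(q, k)

Each grounding of a rule becomes one hinge-loss potential ψ_j in the HL-MRF density above, with the rule weight w_j controlling how strongly violations of the rule are penalized.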
We evaluate this model on two Facebook Ego datasets: 1) the Facebook Ego-414 dataset, containing 159 nodes and 3,386 links, and 2) the Facebook Ego-686 dataset, containing 170 nodes and 3,312 links. Table 4 shows that our model achieves better training and test log-likelihood and AUC on both datasets when compared with standard MMSB, Copula-MMSB, and IRM.
Our latent variables, apart from adding expressive power, also bring interpretability to the model. Figures 2(a) and 2(b) illustrate the correlation between the latent variable similarity, the number of common features, and the membership distributions. This conforms with our structured prior: commonality in features between a pair of nodes can indicate that they belong to the same communities. We also observe that the latent variable values act as a proxy for the relationship between feature similarity and membership distributions, and help us interpret them better. For example, in the Ego-414 dataset, the latent variable similarity for nodes 74 and 88 takes the value 0.714, and after training both nodes have similar membership distributions, with a high value for the same community.