Node based analysis with network statistics

The function sem.net fits an SEM with network data by using node-level statistics as variables. The user-specified network statistics are computed for each node and enter the SEM in place of the networks themselves.

The following choices of network statistics can be used:

degree: Degree is a centrality measure that counts the actors/nodes to which a specific node is connected.
betweenness: Betweenness is a centrality measure based on the shortest paths that pass through an actor: each pair of nodes contributes the fraction of its shortest paths that cross the actor, which is the chance that a randomly chosen shortest path does. It measures how much an individual controls the spread of information.
closeness: Closeness is a measure of how efficiently a node spreads information and can be calculated as the average inverse distance from a node to all other nodes.
evcent: Eigenvector centrality is a measure of the transitive influence of each node, meaning that a node with high eigenvector centrality tends to be connected to other nodes with high eigenvector centrality (Ruhnau, 2000).
stresscent: Stress centrality is similar to betweenness centrality in that it also measures control over the spread of information. However, while betweenness centrality uses the fraction of shortest paths passing through a node, stress centrality counts all shortest paths passing through it (Szczepanski et al., 2012).
infocent: Information centrality is defined as the reduction in network efficiency when a target node is removed. It is a measure of a node's effectiveness in spreading information (Latora and Marchiori, 2007).
ivi: Integrated value of influence is a measure that combines different centrality measures (Salavaty et al., 2020a).
hubeness.score: Hubeness score is a component of IVI and measures a node’s influence in its surrounding environment.
spreading.score: Spreading score is another component of IVI and measures a node’s spreading potential.
clusterRank: Cluster rank is a measure of clustering that takes into account a node, its neighbors, and their clustering coefficients.
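
To make two of these definitions concrete, the sketch below computes degree and closeness (as the average inverse distance) by hand for a small undirected path graph 1 - 2 - 3 - 4. It uses base R only. The toy graph and variable names are illustrative assumptions, and sem.net may compute and scale these statistics differently internally.

```r
# Toy undirected path graph 1 - 2 - 3 - 4 (illustrative, not from the package)
n <- 4
adj <- matrix(0, n, n)
adj[cbind(1:3, 2:4)] <- 1
adj <- adj + t(adj)            # symmetrize: undirected ties

# Degree: number of nodes each node is directly connected to
degree <- rowSums(adj)

# Shortest-path distances via Floyd-Warshall
d <- ifelse(adj == 1, 1, Inf)
diag(d) <- 0
for (k in seq_len(n)) {
  d <- pmin(d, outer(d[, k], d[k, ], "+"))
}

# Closeness as the average inverse distance to all other nodes
inv <- 1 / d
diag(inv) <- 0                 # exclude the node itself
closeness <- rowSums(inv) / (n - 1)

degree                # 1 2 2 1
round(closeness, 3)   # 0.611 0.833 0.833 0.611
```

The two interior nodes have both the highest degree and the highest closeness, as expected for a path graph.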

Simulated Data Example

To begin, a randomly simulated dataset is used to demonstrate the node-based network statistics approach. The code below generates a simulated network net together with four non-network covariates x1 - x4, which load on two latent variables lv1 and lv2.

set.seed(100) 
nsamp = 100 # sample size
net <- ifelse(matrix(rnorm(nsamp^2), nsamp, nsamp) > 1, 1, 0) # simulate network
mean(net) # density of simulated network

# simulate non-network variables
lv1 <- rnorm(nsamp)
lv2 <- rnorm(nsamp)
nonnet <- data.frame(x1 = lv1*0.5 + rnorm(nsamp),
                     x2 = lv1*0.8 + rnorm(nsamp),
                     x3 = lv2*0.5 + rnorm(nsamp),
                     x4 = lv2*0.8 + rnorm(nsamp))
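
As a quick sanity check on the simulated network, the sketch below compares the observed density with its theoretical value. Each entry of net equals 1 when a standard normal draw exceeds 1, so the expected density is 1 - pnorm(1), about 0.159. These checks are an illustrative addition rather than part of the package workflow. They also show that the simulated matrix is directed (not symmetric) and can contain self-ties on the diagonal.

```r
set.seed(100)
nsamp <- 100
net <- ifelse(matrix(rnorm(nsamp^2), nsamp, nsamp) > 1, 1, 0)

expected_density <- 1 - pnorm(1)   # P(Z > 1) for a standard normal, about 0.159
observed_density <- mean(net)

c(expected = expected_density, observed = observed_density)
isSymmetric(net)   # expected FALSE: ties i -> j and j -> i are drawn independently
sum(diag(net))     # number of self-ties left on the diagonal
```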

With the simulated data, we can define a model string in lavaan syntax that specifies the measurement model as well as the relationships between the network and the non-network variables. In this case, net is used as a mediator between the two latent variables. Because the network and the latent variables are generated independently, the estimated effects should be small overall.

model <-'
  lv1 =~ x1 + x2
  lv2 =~ x3 + x4
  net ~ lv2
  lv1 ~ net + lv2
'

The arguments passed to the sem.net function include the model, the dataset, and the network statistics of interest. Note that data must be a list with two elements: network, a named list of all network variables, and nonnetwork, the data frame containing the non-network variables. The summary function can be used to inspect the output, and the function path.networksem can be used to examine mediation effects.
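
The structure of the data list can be checked before fitting. The sketch below is a hypothetical helper, check_networksem_data, which is not part of the package. It assumes that each network is a square matrix with one row and column per case in the non-network data frame, which matches the simulated example but is an assumption about what sem.net requires.

```r
# Hypothetical helper (not part of networksem): basic shape checks on the data list
check_networksem_data <- function(data) {
  stopifnot(is.list(data), all(c("network", "nonnetwork") %in% names(data)))
  stopifnot(is.data.frame(data$nonnetwork))
  stopifnot(is.list(data$network), !is.null(names(data$network)))
  n <- nrow(data$nonnetwork)
  for (nm in names(data$network)) {
    m <- data$network[[nm]]
    # assumed requirement: a square matrix with one row/column per case
    if (!is.matrix(m) || nrow(m) != ncol(m) || nrow(m) != n) {
      stop("network '", nm, "' is not a ", n, " x ", n, " matrix")
    }
  }
  invisible(TRUE)
}

# Small self-contained example mirroring the list structure described above
set.seed(1)
toy <- list(network = list(net = matrix(rbinom(25, 1, 0.3), 5, 5)),
            nonnetwork = data.frame(x1 = rnorm(5), x2 = rnorm(5)))
check_networksem_data(toy)
```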

data = list(network = list(net = net), nonnetwork = nonnet)
set.seed(100)
res <- sem.net(model = model, data = data, netstats = c('degree'))
summary(res)
path.networksem(res, "lv2", c("net.degree"), "lv1")

The output should look like the following.

> summary(res) 
The SEM output:
lavaan 0.6.15 ended normally after 54 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        12

  Number of observations                           100

Model Test User Model:
                                                      
  Test statistic                                 1.230
  Degrees of freedom                                 3
  P-value (Chi-square)                           0.746

Model Test Baseline Model:

  Test statistic                                24.987
  Degrees of freedom                                10
  P-value                                        0.005

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.394

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)               -913.294
  Loglikelihood unrestricted model (H1)       -912.679
                                                      
  Akaike (AIC)                                1850.588
  Bayesian (BIC)                              1881.850
  Sample-size adjusted Bayesian (SABIC)       1843.951

Root Mean Square Error of Approximation:

  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.118
  P-value H_0: RMSEA <= 0.050                    0.810
  P-value H_0: RMSEA >= 0.080                    0.120

Standardized Root Mean Square Residual:

  SRMR                                           0.026

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  lv2 =~                                              
    x4                1.000                           
    x3                2.035    2.162    0.941    0.347
  lv1 =~                                              
    x2                1.000                           
    x1                1.056    0.789    1.338    0.181

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  lv1 ~                                               
    lv2              -0.441    0.300   -1.470    0.142
  net.degree ~                                        
    lv2              -0.934    1.163   -0.804    0.422
  lv1 ~                                               
    net.degree       -0.011    0.020   -0.569    0.569

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .x4                1.350    0.293    4.603    0.000
   .x3                0.215    0.923    0.233    0.816
   .x2                1.002    0.299    3.357    0.001
   .x1                1.047    0.328    3.190    0.001
   .net.degree       22.292    3.164    7.046    0.000
    lv2               0.214    0.249    0.860    0.390
   .lv1               0.302    0.264    1.142    0.253

> path.networksem(res, "lv2", c("net.degree"), "lv1")
  predictor   mediator outcome     apath       bpath   indirect indirect_se  indirect_z
1       lv2 net.degree     lv1 -0.934393 -0.01126621 0.01052707    1.086552 0.009688509
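
The columns of this table are related in a simple way, which the sketch below reproduces from the printed values: the indirect effect is the product of apath and bpath, and indirect_z is the indirect effect divided by indirect_se. The numbers are copied from the output above. How path.networksem obtains indirect_se is not shown here, so that value is taken as given.

```r
# Values copied from the path.networksem() output above
apath       <- -0.934393
bpath       <- -0.01126621
indirect_se <- 1.086552

indirect   <- apath * bpath            # product-of-coefficients indirect effect
indirect_z <- indirect / indirect_se   # z statistic reported in the last column

round(indirect, 6)     # 0.010527
round(indirect_z, 5)   # 0.00969
```

A z value this close to zero is consistent with the expectation that effects are small in randomly generated data.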

Empirical Data Example

Using the friendship network data, the code below fits a model of the effects of five personality traits and two networks on happiness. Degree, betweenness, and closeness are used as the network statistics.

# load data
load("data/cf_data_book.RData")  ## load the list cf_data 

## data - non-network variables
non_network <- as.data.frame(cf_data$cf_nodal_cov)
dim(non_network)

## network - network variables (friends network and wechat network)
## note that the names of the networks are used in model specification
network <- list()
network$friends <- cf_data$cf_friend_network
network$wechat <- cf_data$cf_wetchat_network

model <-'
  Extroversion =~ personality1 + personality6
                + personality11 + personality16
  Conscientiousness =~ personality2 + personality7
                + personality12 + personality17
  Neuroticism  =~ personality3 + personality8
                + personality13 + personality18
  Openness =~ personality4 + personality9
                + personality14 + personality19
  Agreeableness =~ personality5 + personality10 +
                personality15 + personality20
  Happiness =~ happy1 + happy2 + happy3 + happy4
  friends ~ Extroversion + Conscientiousness + Neuroticism + 
  Openness + Agreeableness
  Happiness ~ friends + wechat 
'

## run sem.net
data = list(
  nonnetwork = non_network,
  network = network
)

set.seed(100)
res <- sem.net(model=model, data=data, 
               netstats=c("degree", "betweenness", "closeness"),
               netstats.rescale = TRUE,
               netstats.options=list("degree"=list("cmode"="freeman"))) 

## results
summary(res)

The output of the analysis is given below:


lavaan 0.6-18 ended normally after 453 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        82

  Number of observations                           165

Model Test User Model:
                                                      
  Test statistic                               844.769
  Degrees of freedom                               377
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                              1795.826
  Degrees of freedom                               432
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.657
  Tucker-Lewis Index (TLI)                       0.607

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -6286.542
  Loglikelihood unrestricted model (H1)      -5864.157
                                                      
  Akaike (AIC)                               12737.084
  Bayesian (BIC)                             12991.771
  Sample-size adjusted Bayesian (SABIC)      12732.159

Root Mean Square Error of Approximation:

  RMSEA                                          0.087
  90 Percent confidence interval - lower         0.079
  90 Percent confidence interval - upper         0.095
  P-value H_0: RMSEA <= 0.050                    0.000
  P-value H_0: RMSEA >= 0.080                    0.922

Standardized Root Mean Square Residual:

  SRMR                                           0.116

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                       Estimate  Std.Err  z-value  P(>|z|)
  Happiness =~                                            
    happy4                1.000                           
    happy3               -4.283    3.684   -1.162    0.245
    happy2               -6.682    5.698   -1.173    0.241
    happy1               -6.955    5.932   -1.172    0.241
  Agreeableness =~                                        
    personality20         1.000                           
    personality15        -1.200    0.905   -1.326    0.185
    personality10        -4.293    2.506   -1.713    0.087
    personality5         -4.462    2.606   -1.712    0.087
  Openness =~                                             
    personality19         1.000                           
    personality14         0.784    0.165    4.748    0.000
    personality9         -0.224    0.106   -2.110    0.035
    personality4         -0.097    0.108   -0.898    0.369
  Neuroticism =~                                          
    personality18         1.000                           
    personality13        -0.532    0.148   -3.603    0.000
    personality8         -0.808    0.176   -4.602    0.000
    personality3         -0.378    0.136   -2.778    0.005
  Conscientiousness =~                                    
    personality17         1.000                           
    personality12        -0.693    0.214   -3.235    0.001
    personality7         -0.508    0.219   -2.319    0.020
    personality2          1.108    0.265    4.187    0.000
  Extroversion =~                                         
    personality16         1.000                           
    personality11         0.609    0.136    4.493    0.000
    personality6         -0.508    0.123   -4.116    0.000
    personality1         -0.521    0.119   -4.377    0.000

Regressions:
                        Estimate  Std.Err  z-value  P(>|z|)
  friends.degree ~                                         
    Extroversion           2.355    1.126    2.091    0.037
  friends.betweenness ~                                    
    Extroversion           2.119    1.048    2.023    0.043
  friends.closeness ~                                      
    Extroversion           2.175    1.026    2.119    0.034
  friends.degree ~                                         
    Conscientisnss        -8.447    5.060   -1.670    0.095
  friends.betweenness ~                                    
    Conscientisnss        -7.827    4.706   -1.663    0.096
  friends.closeness ~                                      
    Conscientisnss        -7.720    4.609   -1.675    0.094
  friends.degree ~                                         
    Neuroticism           -1.282    1.364   -0.940    0.347
  friends.betweenness ~                                    
    Neuroticism           -1.252    1.272   -0.985    0.325
  friends.closeness ~                                      
    Neuroticism           -1.324    1.248   -1.061    0.289
  friends.degree ~                                         
    Openness              -1.355    1.483   -0.914    0.361
  friends.betweenness ~                                    
    Openness              -1.204    1.377   -0.875    0.382
  friends.closeness ~                                      
    Openness              -1.162    1.348   -0.862    0.389
  friends.degree ~                                         
    Agreeableness        -16.541   15.253   -1.084    0.278
  friends.betweenness ~                                    
    Agreeableness        -15.697   14.299   -1.098    0.272
  friends.closeness ~                                      
    Agreeableness        -14.400   13.668   -1.054    0.292
  Happiness ~                                              
    friends.degree        -0.047    0.051   -0.931    0.352
    frinds.btwnnss         0.007    0.025    0.292    0.771
    friends.clsnss         0.062    0.059    1.045    0.296
    wechat.degree          0.013    0.037    0.351    0.725
    wechat.btwnnss         0.050    0.049    1.027    0.305
    wechat.closnss        -0.064    0.060   -1.063    0.288

Covariances:
                       Estimate  Std.Err  z-value  P(>|z|)
  Agreeableness ~~                                        
    Openness              0.015    0.018    0.866    0.386
    Neuroticism           0.043    0.029    1.479    0.139
    Conscientisnss       -0.072    0.044   -1.643    0.100
    Extroversion         -0.011    0.020   -0.554    0.579
  Openness ~~                                             
    Neuroticism           0.330    0.074    4.446    0.000
    Conscientisnss       -0.166    0.059   -2.806    0.005
    Extroversion          0.089    0.080    1.111    0.266
  Neuroticism ~~                                          
    Conscientisnss       -0.153    0.058   -2.648    0.008
    Extroversion          0.212    0.082    2.588    0.010
  Conscientiousness ~~                                    
    Extroversion          0.174    0.070    2.490    0.013

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .happy4            2.702    0.298    9.066    0.000
   .happy3            1.226    0.147    8.353    0.000
   .happy2            0.577    0.139    4.146    0.000
   .happy1            0.507    0.145    3.496    0.000
   .personality20     1.107    0.123    8.979    0.000
   .personality15     1.195    0.134    8.945    0.000
   .personality10     0.617    0.115    5.359    0.000
   .personality5      0.742    0.130    5.705    0.000
   .personality19     0.244    0.125    1.948    0.051
   .personality14     0.680    0.107    6.372    0.000
   .personality9      0.854    0.095    8.982    0.000
   .personality4      0.963    0.106    9.067    0.000
   .personality18     0.498    0.104    4.790    0.000
   .personality13     0.920    0.109    8.469    0.000
   .personality8      0.965    0.125    7.694    0.000
   .personality3      0.893    0.102    8.768    0.000
   .personality17     0.707    0.088    8.051    0.000
   .personality12     1.042    0.119    8.753    0.000
   .personality7      1.286    0.144    8.940    0.000
   .personality2      1.193    0.143    8.337    0.000
   .personality16     0.595    0.152    3.917    0.000
   .personality11     1.125    0.140    8.023    0.000
   .personality6      1.043    0.126    8.305    0.000
   .personality1      0.902    0.111    8.122    0.000
   .friends.degree    0.074    0.026    2.872    0.004
   .frinds.btwnnss    0.236    0.034    6.912    0.000
   .friends.clsnss    0.170    0.029    5.849    0.000
   .Happiness         0.024    0.040    0.587    0.557
    Agreeableness     0.030    0.034    0.874    0.382
    Openness          0.652    0.155    4.209    0.000
    Neuroticism       0.495    0.129    3.822    0.000
    Conscientisnss    0.248    0.082    3.038    0.002
    Extroversion      0.843    0.199    4.240    0.000

The multiple mediation effects from Agreeableness through the friendship network statistics to Happiness can be computed with the following code.

> path.networksem(res, 'Agreeableness', 
                        c('friends.degree', 'friends.betweenness', 'friends.closeness'), 
                        'Happiness')

      predictor            mediator   outcome     apath        bpath   indirect indirect_se    indirect_z
1 Agreeableness      friends.degree Happiness -16.54130 -0.047133471  0.7796491    252.3110  0.0030900323
2 Agreeableness friends.betweenness Happiness -15.69767  0.007403778 -0.1162220    224.4727 -0.0005177557
3 Agreeableness   friends.closeness Happiness -14.40081  0.061957757 -0.8922416    196.8378 -0.0045328765
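
With several mediators in parallel, the specific indirect effects can be added to give a total indirect effect. The sketch below does this with the values printed above. Summing the products is the usual convention for parallel mediators and is an addition here, not something reported by path.networksem in this output.

```r
# Values copied from the path.networksem() output above
paths <- data.frame(
  mediator = c("friends.degree", "friends.betweenness", "friends.closeness"),
  apath    = c(-16.54130, -15.69767, -14.40081),
  bpath    = c(-0.047133471, 0.007403778, 0.061957757)
)

# Specific indirect effect through each mediator: a * b
paths$indirect <- paths$apath * paths$bpath

# Total indirect effect through the three parallel mediators
total_indirect <- sum(paths$indirect)

paths
round(total_indirect, 4)   # -0.2288
```

Given the very large standard errors in the output, neither the specific effects nor their sum should be interpreted as distinguishable from zero.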

The model used here is shown in the diagram below. The model has the following features:

We use two networks - friendship and WeChat networks.
Three network statistics are used - degree, closeness, and betweenness.
The friendship network statistics are used as mediators between the personality traits and happiness.
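
In the fitted model, each network name in the model string appears as one variable per requested statistic, for example friends.degree or wechat.closeness. The sketch below builds these names under the assumed convention "<network>.<statistic>", which is inferred from the output above and not taken from the package source.

```r
# Assumed naming convention, inferred from the output: "<network>.<statistic>"
network_names <- c("friends", "wechat")
netstats <- c("degree", "betweenness", "closeness")

# One variable name per combination of network and statistic
stat_vars <- as.vector(outer(network_names, netstats, paste, sep = "."))
stat_vars
```

These are the names to pass as mediators to path.networksem, as in the call above.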
