SEM with networks - background

Network data can be integrated into the SEM framework in different ways. We focus on two main approaches here. The first approach extracts information from a network for each participant and then uses that information as variable(s) in an SEM model. In this method, each participant (node) in the network is the basic unit of analysis. The second approach extracts information from a network for each relationship present. In this method, each pair of participants (nodes) is the basic unit of analysis.
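To make the difference between the two units of analysis concrete, the following minimal sketch builds one record per node and one record per pair of nodes from a small adjacency matrix. The matrix, the function names `node_level_rows` and `dyad_level_rows`, and the choice to treat the network as undirected (so that each unordered pair appears once) are assumptions made for illustration only; they are not part of the software described here.

```python
from itertools import combinations

# Hypothetical undirected friendship network among 4 participants:
# adj[i][j] == 1 means participants i and j are connected.
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

def node_level_rows(adj):
    """One record per participant (node); the unit of the first approach."""
    return [{"node": i} for i in range(len(adj))]

def dyad_level_rows(adj):
    """One record per unordered pair of participants; the unit of the second approach."""
    return [
        {"i": i, "j": j, "tie": adj[i][j]}
        for i, j in combinations(range(len(adj)), 2)
    ]

nodes = node_level_rows(adj)
dyads = dyad_level_rows(adj)
print(len(nodes))  # number of node-level units
print(len(dyads))  # number of pair-level units
```

With $n$ participants, the node-level data has $n$ rows, while the pair-level data in this undirected sketch has $n(n-1)/2$ rows. A directed network would instead have up to $n(n-1)$ ordered pairs.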

In our software, we propose and implement four types of models. 

We denote a network by a square adjacency matrix $\mathbf{M}=[m_{ij}]$, where each $m_{ij}$ denotes the connection between subject $i$ and subject $j$. Based on the adjacency matrix, many node-based network statistics can be defined \citep{wasserman1994social}. For example, the degree statistic is a centrality measure that counts how many other subjects a subject is connected to in the network. The betweenness statistic measures the extent to which a subject lies on the paths between other subjects. Subjects with high betweenness influence how information flows in the network. Both degree and betweenness quantify the importance of a subject in a network. For example, in our friendship network, a student with a larger degree is more popular in the network. In this proposal, we use $\mathbf{t}_{i}(\mathbf{M})$ to represent a vector of network statistics for subject $i$.
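As an illustration of how such node-based statistics can be obtained from an adjacency matrix, here is a minimal sketch in plain Python. It assumes a binary, undirected network and uses unnormalized shortest-path betweenness computed with Brandes' algorithm; the example matrix and the function names `degree`, `betweenness`, and `node_statistics` are hypothetical and do not describe how the software itself computes these quantities.

```python
from collections import deque

def degree(adj):
    """Count how many other subjects each subject is connected to."""
    n = len(adj)
    return [sum(1 for j in range(n) if j != i and adj[i][j]) for i in range(n)]

def betweenness(adj):
    """Unnormalized shortest-path betweenness (Brandes' algorithm, undirected)."""
    n = len(adj)
    score = [0.0] * n
    for s in range(n):
        # Breadth-first search from s, counting shortest paths to every node.
        order = []
        preds = [[] for _ in range(n)]
        sigma = [0] * n
        sigma[s] = 1
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in range(n):
                if w == v or not adj[v][w]:
                    continue
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, starting from the farthest nodes.
        delta = [0.0] * n
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return [b / 2 for b in score]

def node_statistics(adj):
    """Return t_i(M) = (degree_i, betweenness_i) for every subject i."""
    return list(zip(degree(adj), betweenness(adj)))

# Hypothetical friendship network: student 0 is friends with students 1, 2, 3.
M = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
t = node_statistics(M)
print(t[0])  # (3, 3.0): student 0 has degree 3 and lies on all 3 paths between the others
```

Each row of `t` corresponds to one subject, so the result has the same number of rows as any other subject-level data set.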

When network statistics are used in the SEM framework, $\mathbf{n} = \mathbf{t}_{i}(\mathbf{M})$, as shown in Equation \ref{eq:bw-1}. Because the network statistics are node-based, the dimension of the resulting network statistics data matches that of the non-network data, so the two can be combined and used in SEM. Although the notation we have been using already refers to individuals, from this point on we add the subscript $i$ to distinguish the node-based model (with subscript $i$) from the edge-based model (with subscript $ij$).

\begin{equation}
\left(\begin{array}{c}
\boldsymbol{\eta}_{i}\\
\mathbf{t}_{i}^{+}
\end{array}\right)=\boldsymbol{\beta}\left(\begin{array}{c}
\boldsymbol{\eta}_{i}\\
\mathbf{t}_{i}^{+}
\end{array}\right)+\boldsymbol{\gamma}\left(\begin{array}{c}
\boldsymbol{\xi}_{i}\\
\mathbf{t}_{i}^{-}
\end{array}\right)\label{eq:bw-1}
\end{equation}

Network nodes as analysis units

In this method, each participant is treated as the basic unit of analysis. Therefore, the sample size is equal to the number of participants $n$. We use two approaches here: (1) we extract information as network statistics from a network, and (2) we extract information through a latent space model.

Use network statistics

Use latent space model