library(ergm)

# Simulating a Bernoulli network
Y <- matrix(rbinom(100^2, 1, 0.1), 100, 100)
diag(Y) <- NA

cbind(
  logit = glm(as.vector(Y) ~ 1, family = binomial) |> coef(),
  ergm  = ergm(Y ~ edges) |> coef(),
  avg   = mean(Y, na.rm = TRUE) |> qlogis()
)
##                 logit      ergm       avg
## (Intercept) -2.230205 -2.230205 -2.230205
Introduction to ERGMs
Have you ever wondered whether the friend of your friend is also your friend? Whether the people you consider friends feel the same about you? Or whether age is related to whom you seek advice from? All of these questions (and many more creative ones) can be answered using Exponential-Family Random Graph Models.
1 ERGMs in a nutshell
Exponential-family Random Graph Models (ERG Models or ERGMs) generalize many network models. They come in various flavors, including directed, undirected, bipartite, and multi-layer networks; there are also dynamic ERGMs that model the evolution of networks over time. Overall, the main goal of ERG models is to make statistical inferences about the prevalence of specific structures in observed graphs. Although we can use them to predict individual ties, other models based on machine-learning algorithms may be more appropriate for that task. Here are a few example types of questions and tasks ERG models can be used to address:
First and foremost, estimating network models to test hypotheses such as “Are individuals of the same gender more likely to befriend each other?”, “Are individuals of gender xyz more likely to send more ties than others?”, or “Does age play a role in triadic closure?”.
Conditional sampling from existing networks. In permutation tests, network scientists often generate null distributions by sampling while conditioning on a fixed network statistic, for example, the degree sequence or density. With ERGMs, we can go beyond that and obtain a sampling distribution that conditions on multiple features simultaneously.
Simulating networks for ABMs. Using existing ERGM fits, we can generate more realistic networks for Agent-Based Model studies.
2 Introduction
Exponential-Family Random Graph Models, known as ERGMs, ERG Models, or p^* models (Holland and Leinhardt (1981), Frank and Strauss (1986), Wasserman and Pattison (1996), Snijders et al. (2006), Robins et al. (2007), and many others) are a large family of statistical distributions for random graphs. In ERGMs, the focus is on the processes that give origin to the network rather than the individual ties.1
The most common representation of ERGMs is the following:
P_{\mathcal{Y}, \bm{\theta}}(\bm{Y}=\bm{y}) = \exp\left(\bm{\theta}^{\mathbf{t}} s(\bm{y})\right)\kappa(\bm{\theta})^{-1}
where \bm{Y} is a random graph, \bm{y}\in \mathcal{Y} is a particular realization of \bm{Y}, \bm{\theta} is a vector of parameters, s(\bm{y}) is a vector of sufficient statistics, and \kappa(\bm{\theta}) is a normalizing constant. The normalizing constant is defined as:
\kappa(\bm{\theta}) = \sum_{\bm{y} \in \mathcal{Y}} \exp\left(\bm{\theta}^{\mathbf{t}} s(\bm{y})\right)
From the statistical point of view, the normalizing constant is what makes this model attractive, but it is also what makes it hard to work with: computing \kappa(\bm{\theta}) directly is only feasible in cases where \mathcal{Y} is small enough to be enumerated (G. G. Vega Yon, Slaughter, and Haye 2021). For that reason, estimating ERGMs is a challenging task.
In simpler terms, ERG models resemble Logistic regressions. The term \bm{\theta}^{\mathbf{t}} s(\bm{y}) is the sum of the products between the parameters and the statistics (like you would see in a Logistic regression,) and its key component is the composition of the vector of sufficient statistics, s(\bm{y}). The latter is what defines the model.
The vector of sufficient statistics dictates the network’s data-generating process. Like the mean and variance characterize the normal distribution, sufficient statistics (and corresponding parameters) characterize the ERG distribution. Figure 1 shows some examples of sufficient statistics with their corresponding parametrizations.
The ergm package has many terms we can use to fit these models. You can explore the available terms in the ergm.terms manual included in the ergm package. Take a moment to explore the manual and see the different terms available.
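For instance, from the R console you can open the manual directly, or search terms by keyword (the search.ergmTerms function ships with the ergm package, although its exact interface may vary slightly across versions):

# Opening the terms manual
?ergm.terms

# Searching for terms related to triangles
search.ergmTerms("triangle")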
3 Dyadic independence (p1)
The simplest ERG model we can fit is a Bernoulli (Erdős-Rényi) model. Here, the only statistic is the edge count. Now, if we can write the function computing the sufficient statistics as a sum over all edges, i.e., s(\bm{y}) = \sum_{ij}s'(y_{ij}), the model reduces to a Logistic regression.
When this holds, we say that dyads are independent of each other; these are also called p1 models (Holland and Leinhardt 1981). As a second example, imagine that we are interested in assessing whether gender homophily (the tendency of individuals of the same gender to connect) is present in a network. ERG models are the right tool for this task. Moreover, if we assume that gender homophily is the only mechanism governing the network, the problem reduces to a Logistic regression:
P_{\mathcal{Y}, \bm{\theta}}(y_{ij} = 1) = \text{Logit}^{-1}\left(\theta_{edges} + \theta_{homophily}\mathbb{1}\left(X_i = X_j\right)\right)
where \mathbb{1}\left(\cdot\right) is the indicator function, and X_i equals one if the ith individual is female and zero otherwise. To see this, let’s simulate some data using the simulate_formula function from the ergm package. We only need a network, a model, and some arbitrary coefficients. We start with an indicator vector for gender (1: female, 0: otherwise):
# Simulating data
set.seed(7731)
X <- rbinom(100, 1, 0.5)
head(X)
[1] 0 1 0 0 1 0
Next, we use functions from the ergm and network packages to simulate a network with homophily:
# Simulating the network with homophily
library(ergm)

# A network with 100 vertices
Y <- network(100)

# Adding the vertex attribute with the %v% operator
Y %v% "X" <- X

# Simulating the network
Y <- ergm::simulate_formula(
  # The model
  Y ~ edges + nodematch("X"),
  # The coefficients
  coef = c(-3, 2)
)
In this case, the network tends to be sparse (negative edges coefficient) and to feature homophilic ties (positive nodematch coefficient). Using the sna package (also from the statnet suite), we can plot the network:
library(sna)
gplot(Y, vertex.col = c("green", "red")[Y %v% "X" + 1])
Or we can also use the netplot package (G. Vega Yon and Bischoff 2024):
library(netplot)
nplot(Y, vertex.color = ~ X)
Using the ergm package, we can fit the model with code very similar to the one used to simulate the network:
fit_homophily <- ergm(Y ~ edges + nodematch("X"))
We will check the results later, comparing them against the corresponding Logit model. The following code chunk implements the logistic regression for this particular model by direct maximum likelihood. This example serves to show that ERGMs are a generalization of Logistic regression.
n      <- 100
sstats <- summary_formula(Y ~ edges + nodematch("X"))
Y_mat  <- as.matrix(Y)
diag(Y_mat) <- NA

# To speed computations, we pre-compute this value
X_vec <- outer(X, X, "==") |> as.numeric()

# Function to compute the negative loglikelihood
obj <- \(theta) {

  # Compute the probability according to the value of Y
  p <- ifelse(
    Y_mat == 1,
    # If Y = 1
    plogis(theta[1] + X_vec * theta[2]),
    # If Y = 0
    1 - plogis(theta[1] + X_vec * theta[2])
  )

  # The -loglikelihood
  -sum(log(p[!is.na(p)]))
}
And, using the optim function, we can fit the model:
# Fitting the model
fit_homophily_logit <- optim(c(0, 0), obj, hessian = TRUE)
Now that we have the values, let’s compare them:
# The coefficients
cbind(
theta_ergm = coef(fit_homophily),
theta_logit = fit_homophily_logit$par,
sd_ergm = vcov(fit_homophily) |> diag() |> sqrt(),
sd_logit = sqrt(diag(solve(fit_homophily_logit$hessian)))
)
theta_ergm theta_logit sd_ergm sd_logit
edges -2.973331 -2.973829 0.06659648 0.06661195
nodematch.X 1.926039 1.926380 0.07395580 0.07397026
As you can see, both approaches yielded the same estimates because they are the same model! Before continuing, let’s review a couple of essential results in ERGMs.
4 The most important results
If we had to single out the two most important results in ERGMs, they would be the following: (a) conditioning on the rest of the graph, the probability of a tie distributes Logistic (Bernoulli), and (b) the ratio between two loglikelihoods can be approximated through simulation.
4.1 The logistic distribution
Let’s start by stating the result: conditioning on the rest of the graph, \bm{y}_{-ij}, the probability of a tie Y_{ij} is distributed Logistic; formally:
P_{\mathcal{Y}, \bm{\theta}}(Y_{ij}=1 | \bm{y}_{-ij}) = \frac{1}{1 + \exp \left(-\bm{\theta}^{\mathbf{t}}\delta_{ij}(\bm{y})\right)},
where \delta_{ij}(\bm{y}) \equiv s_{ij}^+(\bm{y}) - s_{ij}^-(\bm{y}) is the change statistic, and s_{ij}^+(\bm{y}) and s_{ij}^-(\bm{y}) are the statistics of the graph with and without the tie Y_{ij}, respectively.
The importance of this result is two-fold: (a) we can use this equation to interpret fitted models in the context of a single graph (like using odds,) and (b) we can use this equation to simulate from the model, without touching the normalizing constant.
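As a quick illustration with the homophily model fitted above: both of its statistics are dyad-level counts, so the change statistic of adding the tie (i, j) is simply (1, \mathbb{1}(X_i = X_j)), and the tie probabilities follow directly from the coefficients:

# Probability of a tie between two individuals with the same value of X
theta <- coef(fit_homophily)
plogis(theta[1] + theta[2])

# ... and between individuals with different values of X
plogis(theta[1])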
4.2 The ratio of loglikelihoods
The second significant result is that the ratio of loglikelihoods can be approximated through simulation. It is based on the following observation by Geyer and Thompson (1992):
\frac{\kappa(\bm{\theta})}{\kappa(\bm{\theta}_0)} = \mathbb{E}_{\mathcal{Y}, \bm{\theta}_0}\left(\exp\left\{(\bm{\theta} - \bm{\theta}_0)^{\mathbf{t}}s(\bm{y})\right\}\right),
Using the latter, we can approximate the following loglikelihood ratio:
\begin{align*} l(\bm{\theta}) - l(\bm{\theta}_0) = & (\bm{\theta} - \bm{\theta}_0)^{\mathbf{t}}s(\bm{y}) - \log\left[\frac{\kappa(\bm{\theta})}{\kappa(\bm{\theta}_0)}\right]\\ \approx & (\bm{\theta} - \bm{\theta}_0)^{\mathbf{t}}s(\bm{y}) - \log\left[M^{-1}\sum_{m=1}^{M} \exp\left\{(\bm{\theta} - \bm{\theta}_0)^{\mathbf{t}}s(\bm{y}^{(m)})\right\}\right] \end{align*}
Where \bm{\theta}_0 is an arbitrary vector of parameters, and \bm{y}^{(m)} are sampled from the distribution P_{\mathcal{Y}, \bm{\theta}_0}. In the words of Geyer and Thompson (1992), “[…] it is possible to approximate \bm{\theta} by using simulations from one distribution P_{\mathcal{Y}, \bm{\theta}_0} no matter which \bm{\theta}_0 in the parameter space is.” This result has been key for the MC-MLE algorithm (see Hunter, Krivitsky, and Schweinberger (2012)), which has been front and center in the ergm R package.2
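To make this concrete, here is a minimal sketch of the approximation using the homophily network and fit from the previous section. The reference point theta0, the simulation size, and the helper names (stats0, s_obs, loglik_ratio) are arbitrary choices for illustration:

# An arbitrary reference point in the parameter space
theta0 <- c(-3, 2)

# Sampling M graphs (we only need their statistics) from P_{Y, theta0}
stats0 <- ergm::simulate_formula(
  Y ~ edges + nodematch("X"),
  coef   = theta0,
  nsim   = 1000,
  output = "stats"
) |> as.matrix()

# Observed sufficient statistics
s_obs <- summary_formula(Y ~ edges + nodematch("X"))

# Approximating l(theta) - l(theta0)
loglik_ratio <- function(theta) {
  sum((theta - theta0) * s_obs) -
    log(mean(exp(stats0 %*% (theta - theta0))))
}

# Evaluated near the MLE, the ratio should be non-negative,
# since the MLE maximizes the loglikelihood
loglik_ratio(coef(fit_homophily))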
5 Start to finish example
There is no silver bullet to fit ERGMs. However, the following steps are a good starting point:
Inspect the data: Ensure the network you work with is processed correctly. Some common mistakes include isolates being excluded, vertex attributes not being adequately aligned (mismatch), etc.
Start with endogenous effects first: Before jumping into vertex/edge-covariates, try fitting models that only include structural terms such as edgecount, triangles (or their equivalent,) stars, reciprocity, etc.
After structure is controlled: You can add vertex/edge-covariates. The most common ones are homophily, nodal-activity, etc.
Evaluate your results: Once you have a model you are happy with, the last couple of steps are (a) assessing convergence (which is usually done automagically by the ergm package,) and (b) assessing goodness-of-fit, which in this context means checking how well our model captures properties of the network that were not controlled for.
Although we could go ahead and use an existing dataset to play with, instead, we will simulate a directed graph with the following properties:
- 100 nodes.
- Homophily on music taste.
- Gender heterophily.
- Reciprocity.
The following code chunk illustrates how to do this in R. Notice that, like the previous example, we need to create a network with the vertex attributes needed to simulate homophily.
# Simulating the covariates (vertex attribute)
set.seed(1235)

# Simulating the data
Y <- network(100, directed = TRUE)
Y %v% "fav_music" <- sample(c("rock", "jazz", "pop"), 100, replace = TRUE)
Y %v% "female"    <- rbinom(100, 1, 0.5)
Now that we have a starting point for the simulation, we can use the simulate_formula function to get our network:
Y <- ergm::simulate_formula(
  Y ~
    edges +
    nodematch("fav_music") +
    nodematch("female") +
    mutual,
  coef = c(-4, 1, -1, 2)
)
# And visualize it!
gplot(Y, vertex.col = c("green", "red")[Y %v% "female" + 1])
This figure is precisely why we need ERGMs (and why many Sunbelt talks don’t include a network visualization!). We know the graph has structure (it’s not random), but visually, it is hard to see.
5.1 Inspect the data
For the sake of time, we will not investigate our network properly here. However, you should always do so. Make sure you compute descriptive statistics (density, centrality, modularity, etc.), check for missing values and isolates (disconnected nodes), and inspect your data both in its raw form (“notepad”) and through visualizations before jumping into your ERG model.
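As a minimal starting point, a quick descriptive pass could look like the following (a sketch using functions from the network and sna packages loaded earlier):

# Size, edge count, and density
network.size(Y)
network.edgecount(Y)
gden(Y)

# Reciprocity and in-degree distribution
grecip(Y, measure = "edgewise")
summary(degree(Y, cmode = "indegree"))

# Vertex attribute tables (these also reveal missing values)
table(Y %v% "fav_music", useNA = "ifany")
table(Y %v% "female", useNA = "ifany")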
5.2 Start with endogenous effects first
The first step is to check whether we can fit an ERGM at all. We can do so with the Bernoulli graph:
model_1 <- ergm(Y ~ edges)
summary(model_1)
Call:
ergm(formula = Y ~ edges)
Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
edges -3.78885 0.06833 0 -55.45 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 13724 on 9900 degrees of freedom
Residual Deviance: 2102 on 9899 degrees of freedom
AIC: 2104 BIC: 2112 (Smaller is better. MC Std. Err. = 0)
It is rare to see a model in which the edgecount is not significant. The next term we will add is reciprocity (mutual in the ergm package):
model_2 <- ergm(Y ~ edges + mutual)
## Warning: 'glpk' selected as the solver, but package 'Rglpk' is not available;
## falling back to 'lpSolveAPI'. This should be fine unless the sample size and/or
## the number of parameters is very big.
summary(model_2)
## Call:
## ergm(formula = Y ~ edges + mutual)
##
## Monte Carlo Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.93230 0.07596 0 -51.769 <1e-04 ***
## mutual 2.18283 0.28278 0 7.719 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 13724 on 9900 degrees of freedom
## Residual Deviance: 2065 on 9898 degrees of freedom
##
## AIC: 2069 BIC: 2083 (Smaller is better. MC Std. Err. = 0.5686)
As expected, reciprocity is significant (we made it so!). Notwithstanding, there is a difference between this model and the previous one: this model was not fitted using MLE. Since the reciprocity term involves more than one tie, the model cannot be reduced to a Logistic regression, so it needs to be estimated using one of the other estimation methods available in the ergm package; notice that the summary header changed from “Maximum Likelihood Results” to “Monte Carlo Maximum Likelihood Results.”
The model starts gaining complexity as we add higher-order terms involving more ties. An infamous example is the number of triangles. Although highly important for the social sciences, including triangle terms in your ERGMs usually results in a degenerate model (where the MCMC chain jumps between empty and fully connected graphs). One exception is when you deal with small networks. To address this, Snijders et al. (2006) and Hunter (2007) introduced various new terms that significantly reduce the risk of degeneracy. Here, we will illustrate the use of the “geometrically weighted dyad-wise shared partner” term (gwdsp,) which Prof. David Hunter proposed. The gwdsp term is akin to triadic closure but reduces the chances of degeneracy.
# Fitting two more models (output message suppressed)
model_3 <- ergm(Y ~ edges + mutual + gwdsp(.5, fixed = TRUE))
# model_4 <- ergm(Y ~ edges + triangles) # bad idea
Right after fitting a model, we want to inspect the results. An excellent tool for this is the R package texreg (Leifeld 2013):
library(texreg)
knitreg(list(model_1, model_2, model_3))
|                     | Model 1  | Model 2  | Model 3  |
|---------------------|----------|----------|----------|
| edges               | -3.79*** | -3.93*** | -3.82*** |
|                     | (0.07)   | (0.08)   | (0.22)   |
| mutual              |          | 2.18***  | 2.16***  |
|                     |          | (0.28)   | (0.30)   |
| gwdsp.OTP.fixed.0.5 |          |          | -0.02    |
|                     |          |          | (0.05)   |
| AIC                 | 2104.43  | 2068.95  | 2069.52  |
| BIC                 | 2111.63  | 2083.35  | 2091.12  |
| Log Likelihood      | -1051.22 | -1032.48 | -1031.76 |

***p < 0.001; **p < 0.01; *p < 0.05
So far, model_2 is winning. We will continue with this model.
5.3 Let’s add a little bit of structure
Now that we have a model featuring only endogenous terms, we can add vertex/edge-covariates. Starting with "fav_music", there are a couple of different ways to use this node feature:
Directly through homophily (assortative mixing): Using the nodematch term, we can control for the propensity of individuals to connect based on shared music taste.
Homophily (v2): We could activate the option diff = TRUE in the same term. By doing this, the homophily term is operationalized differently, adding as many terms as there are options in the vertex attribute.
Mixing: We can use the term nodemix to capture individuals’ tendency to mix between musical tastes.
model_4 <- ergm(Y ~ edges + mutual + nodematch("fav_music"))
model_5 <- ergm(Y ~ edges + mutual + nodematch("fav_music", diff = TRUE))
model_6 <- ergm(Y ~ edges + mutual + nodemix("fav_music"))
Now, let’s inspect what we have so far:
knitreg(list(`2` = model_2, `4` = model_4, `5` = model_5, `6` = model_6))
|                          | 2        | 4        | 5        | 6        |
|--------------------------|----------|----------|----------|----------|
| edges                    | -3.93*** | -4.29*** | -4.28*** | -3.55*** |
|                          | (0.08)   | (0.10)   | (0.10)   | (0.23)   |
| mutual                   | 2.18***  | 1.98***  | 1.96***  | 1.98***  |
|                          | (0.28)   | (0.30)   | (0.30)   | (0.30)   |
| nodematch.fav_music      |          | 0.85***  |          |          |
|                          |          | (0.14)   |          |          |
| nodematch.fav_music.jazz |          |          | 0.73**   |          |
|                          |          |          | (0.25)   |          |
| nodematch.fav_music.pop  |          |          | 0.83***  |          |
|                          |          |          | (0.17)   |          |
| nodematch.fav_music.rock |          |          | 0.88***  |          |
|                          |          |          | (0.16)   |          |
| mix.fav_music.pop.jazz   |          |          |          | -0.54    |
|                          |          |          |          | (0.35)   |
| mix.fav_music.rock.jazz  |          |          |          | -0.76*   |
|                          |          |          |          | (0.36)   |
| mix.fav_music.jazz.pop   |          |          |          | -0.61    |
|                          |          |          |          | (0.34)   |
| mix.fav_music.pop.pop    |          |          |          | 0.09     |
|                          |          |          |          | (0.26)   |
| mix.fav_music.rock.pop   |          |          |          | -0.53    |
|                          |          |          |          | (0.30)   |
| mix.fav_music.jazz.rock  |          |          |          | -1.41**  |
|                          |          |          |          | (0.44)   |
| mix.fav_music.pop.rock   |          |          |          | -0.87**  |
|                          |          |          |          | (0.32)   |
| mix.fav_music.rock.rock  |          |          |          | 0.14     |
|                          |          |          |          | (0.26)   |
| AIC                      | 2068.95  | 2031.20  | 2033.35  | 2037.32  |
| BIC                      | 2083.35  | 2052.80  | 2069.35  | 2109.33  |
| Log Likelihood           | -1032.48 | -1012.60 | -1011.67 | -1008.66 |

***p < 0.001; **p < 0.01; *p < 0.05
Although model 5 has a higher loglikelihood, both the AIC and the BIC suggest model 4 is a better candidate. For the sake of time, we will jump ahead and add nodematch("female") as the last term of our model. The next step is to assess (a) convergence and (b) goodness-of-fit.
model_final <- ergm(Y ~ edges + mutual + nodematch("fav_music") + nodematch("female"))
# Printing the pretty table
knitreg(list(`2` = model_2, `4` = model_4, `Final` = model_final))
|                     | 2        | 4        | Final    |
|---------------------|----------|----------|----------|
| edges               | -3.93*** | -4.29*** | -3.95*** |
|                     | (0.08)   | (0.10)   | (0.11)   |
| mutual              | 2.18***  | 1.98***  | 1.86***  |
|                     | (0.28)   | (0.30)   | (0.32)   |
| nodematch.fav_music |          | 0.85***  | 0.82***  |
|                     |          | (0.14)   | (0.14)   |
| nodematch.female    |          |          | -0.74*** |
|                     |          |          | (0.14)   |
| AIC                 | 2068.95  | 2031.20  | 2002.64  |
| BIC                 | 2083.35  | 2052.80  | 2031.44  |
| Log Likelihood      | -1032.48 | -1012.60 | -997.32  |

***p < 0.001; **p < 0.01; *p < 0.05
5.4 Evaluate your results
5.4.1 Convergence
As our model was fitted using MCMC, we must ensure the chains converged. We can use the mcmc.diagnostics function from the ergm package to check model convergence. This function looks at the last set of simulations of the MCMC model and generates various diagnostics for the user.
Under the hood, for each newly proposed set of model parameters, the fitting algorithm generates a stream of networks based on those parameters. The last stream of networks is thus simulated using the final state of the model. The mcmc.diagnostics function takes that stream of networks and plots the sequence of the sufficient statistics included in the model. A converged model should show a stationary sequence of statistics, moving around a fixed value without (a) becoming stuck at any point or (b) changing its trend. Our model shows no sign of either problem:
mcmc.diagnostics(model_final, which = c("plots"))
Note: MCMC diagnostics shown here are from the last round of
simulation, prior to computation of final parameter estimates.
Because the final estimates are refinements of those used for this
simulation run, these diagnostics may understate model performance.
To directly assess the performance of the final model on in-model
statistics, please use the GOF command: gof(ergmFitObject,
GOF=~model).
Now that we know our model was good enough to represent the observed statistics (sample them, actually,) let’s see how good it is at capturing other features of the network that were not included in the model.
5.4.2 Goodness-of-fit
This would be the last step in the sequence of steps to fit an ERGM. As we mentioned before, the idea of Goodness-of-fit [GOF] in ERG models is to see how well our model captures properties of the graph that were not included in the model. By default, the gof function from the ergm package computes GOF for:
The model statistics.
The outdegree distribution.
The indegree distribution.
The distribution of edge-wise shared partners.
The distribution of the geodesic distances (shortest path).
The process of evaluating GOF is relatively straightforward. Using networks sampled from the fitted model, we check whether the observed statistics from above are covered by (fall within the CI of) our model. If they do, we say that the model has a good fit. Otherwise, if we observe significant anomalies, we return to the bench and try to improve our model.
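The set of statistics to check is customizable through the GOF formula (the same interface shown in the note above as gof(ergmFitObject, GOF=~model)). For instance, a minimal sketch that also checks the triad census; the available terms may vary by ergm version, see ?gof:

# Computing GOF on a custom set of statistics
gof(model_final, GOF = ~ idegree + odegree + espartners + distance + triadcensus)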
As with all simulated data, our gof() call shows that our selected model was an excellent choice for the observed graph:
gof_final <- gof(model_final)
print(gof_final)
##
## Goodness-of-fit for in-degree
##
## obs min mean max MC p-value
## idegree0 11 5 11.25 20 1.00
## idegree1 22 14 24.88 38 0.58
## idegree2 23 17 27.33 39 0.34
## idegree3 30 9 19.66 29 0.00
## idegree4 10 3 10.28 18 1.00
## idegree5 3 0 4.45 12 0.66
## idegree6 1 0 1.48 4 1.00
## idegree7 0 0 0.47 3 1.00
## idegree8 0 0 0.15 2 1.00
## idegree9 0 0 0.03 1 1.00
## idegree10 0 0 0.02 1 1.00
##
## Goodness-of-fit for out-degree
##
## obs min mean max MC p-value
## odegree0 10 4 11.59 22 0.70
## odegree1 30 15 24.58 35 0.32
## odegree2 23 15 27.17 36 0.32
## odegree3 17 12 19.49 32 0.60
## odegree4 10 4 10.58 18 1.00
## odegree5 8 0 4.37 10 0.20
## odegree6 2 0 1.54 5 0.96
## odegree7 0 0 0.53 3 1.00
## odegree8 0 0 0.13 2 1.00
## odegree9 0 0 0.02 1 1.00
##
## Goodness-of-fit for edgewise shared partner
##
## obs min mean max MC p-value
## esp.OTP0 212 173 205.28 251 0.64
## esp.OTP1 7 1 10.20 23 0.58
## esp.OTP2 0 0 0.24 3 1.00
## esp.OTP3 0 0 0.01 1 1.00
##
## Goodness-of-fit for minimum geodesic distance
##
## obs min mean max MC p-value
## 1 219 184 215.73 262 0.88
## 2 449 323 443.96 641 0.94
## 3 790 504 815.51 1382 0.98
## 4 1151 686 1215.23 2172 0.86
## 5 1245 778 1370.14 2133 0.60
## 6 1043 757 1164.47 1519 0.38
## 7 745 515 791.92 1153 0.80
## 8 434 152 467.19 866 0.82
## 9 198 28 250.29 577 0.64
## 10 80 0 123.53 413 0.68
## 11 10 0 57.66 290 0.36
## 12 1 0 26.01 186 0.44
## 13 0 0 10.68 101 0.82
## 14 0 0 3.70 42 1.00
## 15 0 0 1.30 23 1.00
## 16 0 0 0.49 14 1.00
## 17 0 0 0.15 7 1.00
## 18 0 0 0.08 5 1.00
## 19 0 0 0.01 1 1.00
## Inf 3535 1061 2941.95 4713 0.40
##
## Goodness-of-fit for model statistics
##
## obs min mean max MC p-value
## edges 219 184 215.73 262 0.88
## mutual 16 7 15.81 24 1.00
## nodematch.fav_music 123 86 119.89 143 0.88
## nodematch.female 72 47 71.48 91 0.96
It is easier to see the results using the plot function:
# Plotting the result (5 figures)
op <- par(mfrow = c(3, 2))
plot(gof_final)
par(op)
6 Conclusion
We have shown here just a glimpse of what ERG models are. A large, active, collaborative community of social network scientists is working on new extensions and advances. If you want to learn more about ERGMs, I recommend the following resources:
7 Group activity
The ergm package comes with a handful of vignettes (extended R examples) that you can use to learn more about the package. The vignette with the same name as the package shows an example of fitting a model to the network faux.mesa.high. Address the following questions:
What type of ERG model is this?
Looking at the plot in the vignette, what other term(s) could you consider worth adding to the model? Why?
The vignette did not include a goodness-of-fit analysis. Can you add one? What do you find?
Challenge: without using ergm to fit the model, estimate the following p1 model: ~ edges + nodefactor("Race"). Compare your results with what you get using ergm.