Programming in R
The R programming language (R Core Team 2023) is the de facto language for social network analysis. Furthermore, R is home to the most comprehensive collection of packages implementing the methods we will cover here. Let’s start with the fundamentals.
Getting help
Compared with many other languages, R’s documentation is highly reliable. The Comprehensive R Archive Network [CRAN] is the official repository of R packages. All packages posted on CRAN must pass a series of checks to ensure the quality of the code, including the documentation.
To get help on a function, we can use the help() function. For example, to get help on the mean() function, we would run:
help("mean")
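Beyond help(), base R ships a few related helpers that are handy when exploring an unfamiliar function. The following is a minimal sketch using only base R functions; the object name mean_functions is made up for illustration.

```r
# `?` is shorthand for help(): this opens the same page as help("mean")
?mean

# apropos() lists objects on the search path whose names match a pattern
mean_functions <- apropos("^mean")
print(mean_functions)

# args() shows a function's arguments without opening the help page
args(mean)

# example() runs the examples section of a function's help page
example("mean")
```

In an interactive session, the help page opens in the viewer or pager; in a script, it is printed as plain text.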
Naming conventions
R has a set of naming conventions that we should follow to avoid confusion. The most important ones are:
- Use lowercase letters (optional)
- Use underscores to separate words (optional)
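- Do not start with a number or an underscore
- Do not use special characters (only letters, numbers, periods, and underscores are allowed)
- Do not use reserved words, such as if, for, or TRUE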
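One way to check a candidate name against these rules is to compare it with the output of base R's make.names(), which rewrites any string into a syntactically valid name. The sketch below assumes that approach; the helper is_valid_name() and the example names are hypothetical and not part of R.

```r
# Names that follow the conventions above
n_nodes <- 10
edge_list_2 <- c(1, 2)

# Hypothetical helper: a string is a valid name if make.names() leaves it
# unchanged (make.names() repairs invalid names, so a change means "invalid")
is_valid_name <- function(name) {
  name == make.names(name)
}

is_valid_name("n_nodes")    # TRUE
is_valid_name("2nd_graph")  # FALSE: starts with a number
is_valid_name("n-nodes")    # FALSE: contains a special character
is_valid_name("if")         # FALSE: reserved word
```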
Assignment
In R, there are two common ways of assigning values to objects: the <- and = binary operators. (Two more, the right-assignment operator -> and the assign() function, are rarely needed.) For top-level assignments, both behave the same, but <- is preferred because = is also used to name function arguments, which can cause confusion.
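The confusion with function arguments can be seen in a short sketch. The helper total() and the object names values and kept are made up for illustration; the checks use exists() with inherits = FALSE so that only the current environment is searched.

```r
# Hypothetical helper used only for this illustration
total <- function(values) sum(values)

# Inside a call, `=` names the argument; no object called `values` is created
total(values = c(1, 2, 3))
values_created <- exists("values", inherits = FALSE)
values_created  # FALSE

# Inside a call, `<-` creates `kept` in the calling environment and then
# passes its value to the function
total(kept <- c(1, 2, 3))
kept_created <- exists("kept", inherits = FALSE)
kept_created    # TRUE
```

At the top level, outside of any function call, the two operators do the same thing: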
x <- 1
x = 1
Using functions and piping
In R, we use functions to perform operations on objects. Functions are called as function_name(argument_1, argument_2, ...). For example, the mean() function takes a vector of numbers and returns the mean of the values:
x <- c(1, 2, 3) # The c() function creates a vector
mean(x)
## [1] 2
Furthermore, we can use the native pipe operator (|>), available since R 4.1.0, to improve readability. The pipe takes the value of the left-hand side expression and passes it as the first argument of the function call on the right-hand side. Our previous example could be rewritten as:
c(1, 2, 3) |> mean()
## [1] 2
Data structures
Atomic types are the minimal building blocks of R. They are logical, integer, double, character, complex, and raw:
x_logical <- TRUE
x_integer <- 1L
x_double <- 1.0
x_character <- "a"
x_complex <- 1i
x_raw <- charToRaw("a")
Unlike statically typed languages, R does not require us to declare the data type before creating an object; it infers the type from the value.
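To see which type R inferred, we can ask with typeof(). The sketch below reuses the same six values; the names examples and types are made up for illustration.

```r
# One example value per atomic type (same values as above)
examples <- list(TRUE, 1L, 1.0, "a", 1i, charToRaw("a"))

# typeof() reports the type R inferred for each value
types <- vapply(examples, typeof, character(1))
print(types)

# A bare number without the L suffix is stored as a double, not an integer
typeof(1)   # "double"
typeof(1L)  # "integer"
```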
The next type is the vector. A vector is a collection of elements of the same type. The most common way to create a vector is with the c() function:
x_integer <- c(1L, 2L, 3L) # The L suffix makes the values integers
x_double <- c(1.0, 2.0, 3.0)
x_logical <- c(TRUE, FALSE, TRUE)
## etc.
When a vector mixes types, R will coerce the elements to the most general type. For example, if we mix integers and doubles, R will coerce the integers into doubles. The coercion order is logical < integer < double < complex < character.
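The coercion order can be checked directly with c() and typeof(). This is a small sketch; the object names are made up for illustration.

```r
# logical + integer: TRUE becomes 1L
mixed_int <- c(TRUE, 2L)
typeof(mixed_int)  # "integer"

# integer + double: the integer becomes a double
mixed_num <- c(1L, 2.5)
typeof(mixed_num)  # "double"

# Anything mixed with a character string becomes character
mixed_chr <- c(TRUE, 1L, 2.5, "a")
mixed_chr          # "TRUE" "1" "2.5" "a"

# Coercion also happens in arithmetic: TRUE counts as 1 and FALSE as 0
n_true <- sum(c(TRUE, FALSE, TRUE))
n_true             # 2
```

Coercing logicals to numbers is what makes idioms such as sum(x > 0) work for counting the elements that satisfy a condition.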
The next data structure is the list. A list is a collection of elements of any type. We can create a list with the list() function:
x_list <- list(1, 2.0, TRUE, "a")
x_list_named <- list(a = 1, b = 2.0, c = TRUE, d = "a")
To access elements in a list, we have two options: by position or by name, the latter only if the elements are named:
x_list[[1]]
## [1] 1
x_list_named[["a"]]
## [1] 1
x_list_named$a
## [1] 1
After lists, we have matrices. A matrix is a collection of elements of the same type arranged in a two-dimensional grid. We can create a matrix with the matrix() function, which fills the grid column by column by default:
x_matrix <- matrix(1:9, nrow = 3, ncol = 3)
x_matrix
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## We can access elements in a matrix by row and column, or by position:
x_matrix[1, 2]
## [1] 4
x_matrix[cbind(1, 2)]
## [1] 4
x_matrix[4]
## [1] 4
The last two data structures are arrays and data frames. An array is a collection of elements of the same type arranged in a multi-dimensional grid. We can create an array with the array() function:
x_array <- array(1:27, dim = c(3, 3, 3))
## We can access elements in an array with one index per dimension (here
## row, column, and layer), or by position:
x_array[1, 2, 3]
## [1] 22
x_array[cbind(1, 2, 3)]
## [1] 22
x_array[22]
## [1] 22
Data frames are the most common data structure for tabular data in R. In principle, these objects are lists of vectors of the same length, each vector representing a column. Columns can be of different types, but all elements within a column must be of the same type. We can create a data frame with the data.frame() function:
x_data_frame <- data.frame(
  a = 1:3,
  b = c("a", "b", "c"),
  c = c(TRUE, FALSE, TRUE)
)
## We can access elements in a data frame by row and column, by name, or
## by position:
x_data_frame[1, 2]
## [1] "a"
x_data_frame[cbind(1, 2)]
## [1] "a"
x_data_frame$b[1] # Like a list
## [1] "a"
x_data_frame[[2]][1] # Like a list too
## [1] "a"
Functions
Functions are the most important building blocks of R. A function is a set of instructions that takes zero or more inputs and returns a single object (to return several values, we can wrap them in a list). We can create a function with the function keyword:
## This function has two arguments (y is optional and defaults to 1)
f <- function(x, y = 1) {
  x + y
}
f(1)
## [1] 2
Starting with R 4.1.0, we can use the shorthand lambda syntax to create functions:
f <- \(x, y = 1) x + y
f(1)
## [1] 2
Control flow
Control flow statements allow us to control the execution of the code. The most common ones are if (optionally followed by else), for, while, and repeat. These are language keywords rather than ordinary functions: if, for, and while take a condition or a sequence in parentheses, whereas repeat takes only a body:
## if
if (TRUE) {
  "a"
} else {
  "b"
}
## [1] "a"
## for
for (i in 1:3) {
  cat("This is the number ", i, "\n")
}
## This is the number 1
## This is the number 2
## This is the number 3
## while
i <- 1
while (i <= 3) {
  cat("This is the number ", i, "\n")
  i <- i + 1
}
## This is the number 1
## This is the number 2
## This is the number 3
## repeat
i <- 1
repeat {
  cat("This is the number ", i, "\n")
  i <- i + 1
  if (i > 3) {
    break
  }
}
## This is the number 1
## This is the number 2
## This is the number 3
R packages
R owes much of its power to its extensions, which are called packages. Packages are collections of functions, data, and documentation that provide additional functionality to R. Although anyone can create and distribute R packages to other users, the Comprehensive R Archive Network [CRAN] is the official repository of R packages. All packages posted on CRAN must pass a series of checks before they are accepted, so their quality is generally high.
To install R packages, we use the install.packages() function; to load them, we use the library() function. For example, the following code chunk installs the ergm package and loads it:
install.packages("ergm")
library(ergm)
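Since a package only needs to be installed once, a common pattern is to check whether it is available before installing. The sketch below assumes that pattern; the helper is_installed() is hypothetical, and the example only reports a missing package instead of installing it, so it runs without an internet connection.

```r
# Hypothetical helper: TRUE if the package can be loaded (without attaching it)
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

is_installed("stats")  # TRUE: stats ships with every R installation

# Report whether ergm still needs to be installed
if (!is_installed("ergm")) {
  message("ergm is not installed; run install.packages(\"ergm\") first")
}

# The :: operator calls a function without attaching its package
stats::median(c(1, 2, 3))  # 2
```

Using package::function() also makes it explicit where a function comes from, which helps when two loaded packages export functions with the same name.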