Applied Network Science with R

Author

George G. Vega Yon, Ph.D.

Published

May 7, 2024

1 Preface

An AI image generated with Bing: Draw an image of a social network. Include a person examining the network and holding a laptop in one hand. The laptop should have the logo of the R programming language.

Statistical methods for networked systems are present in most disciplines. Despite language differences between areas, many methods developed to study specific problems can be helpful outside their original context; this is the premise of this book. Applied Network Science with R provides examples using the R programming language to study networked systems. Although most cases deal with social network analysis, the methods presented here can be applied to contexts such as biological networks, transportation networks, and many others.

The entire book was written using Quarto, a literate programming system that allows mixing text and code, so all the code presented is fully executable and, thus, reproducible. The source code is available on GitHub at https://github.com/gvegayon/appliedsnar. Readers are encouraged to download the code and execute it on their machines using either RStudio or VS Code.

Besides the R programming language, we will use RStudio. For data management, we will use dplyr and data.table. For network data management and visualization, we will use igraph, netdiffuseR, the statnet suite, and netplot.

1.1 About the project

This project began as part of a workshop that took place at USC’s Center for Applied Network Analysis. Today, I use it to gather and study statistical methods for analyzing networks, emphasizing social and biological systems. Statistical computing methods are also a core component of the book.

1.2 About the Author

I am a Research Assistant Professor at the University of Utah’s Division of Epidemiology, where I work on studying Complex Systems using Statistical Computing. I was born and raised in Chile. I have over ten years of experience developing scientific software focusing on high-performance computing, data visualization, and social network analysis. My training is in Public Policy (M.A. UAI, 2011), Economics (M.Sc. Caltech, 2015), and Biostatistics (Ph.D. USC, 2020).

I obtained my Ph.D. in Biostatistics under the supervision of Prof. Paul Marjoram and Prof. Kayla de la Haye, with my dissertation titled “Essays on Bioinformatics and Social Network Analysis: Statistical and Computational Methods for Complex Systems.”

If you’d like to learn more about me, please visit my website at https://ggvy.cl.