September 19, 2016
“The term reproducible research refers to the idea that the ultimate product of academic research is the paper along with the full computational environment used to produce the results in the paper such as the code, data, etc. that can be used to reproduce the results and create new work based on the research”
What reproducible research looks like. Credit: Joe Sutliff/www.cdad.com/joe
Original study effect size versus replication effect size (correlation coefficients). Fig. 3 in Nosek et al. (Science, 2015)
Literate programming [published in 1983] is an approach to programming introduced by Donald Knuth in which a program is given as an explanation of the program logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which a compilable source code can be generated
TeX is a typesetting system (or “formatting system”) designed and mostly written by Donald Knuth and released in 1978 [MS Word didn’t show up until 1983] […] TeX was designed with two main goals in mind: to allow anybody to produce high-quality books using minimal effort, and to provide a system that would give exactly the same results on all computers, at any point in time
A couple of tips
- R: ?grDevices::Devices
- Stata: graph export command (. h graph export) with pdf/eps formats.
- Stata: outreg2 command (ssc install outreg2)

    ## stata: usage: stata [-h -q -s -b] ["stata command"]
    ##   where:
    ##     -h  show this display
    ##     -q  suppress logo, initialization messages
    ##     -s  "batch" mode creating .smcl log
    ##     -b  "batch" mode creating .log file
    ##
    ## Notes:
    ##   xstata is the command to launch the GUI version of Stata
    ##   stata is the command to launch the console version of Stata
    ##
    ##   -b is better than "stata < filename > filename".
We can read it in R!
    read.delim("mystatatab.txt", sep = "\t", header = FALSE)

                                   V1        V2        V3       V4
    1                                       (1)       (2)      (3)
    2                       VARIABLES     price     price    price
    3
    4                           rep78     432.8    667.0*    76.29
    5                                   (394.7)   (342.4)  (449.3)
    6                       1.foreign     1,023             -205.6
    7                                   (866.1)            (959.5)
    8                             mpg -292.4*** -271.6***
    9                                   (60.23)   (57.77)
    10                       Constant 10,586***  9,658*** 5,949***
    11                                  (1,556)   (1,347)  (1,423)
    12
    13                   Observations        69        69       69
    14                      R-squared     0.267     0.251    0.001
    15 Standard errors in parentheses
    16 *** p<0.01, ** p<0.05, * p<0.1
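Once the table is in R, everything is still text (stars, commas, parentheses). A minimal sketch of turning the coefficient rows into numbers could look like this. It inlines a few rows of the table so it runs on its own; the helper `to_number()` and the object names are made up for illustration, not part of outreg2 or of the original slides.

```r
# A few rows of the outreg2 table, inlined so the sketch is self-contained.
# With the real file you would call read.delim("mystatatab.txt", ...) instead.
raw <- "\t(1)\t(2)\t(3)
VARIABLES\tprice\tprice\tprice
rep78\t432.8\t667.0*\t76.29
\t(394.7)\t(342.4)\t(449.3)
Constant\t10,586***\t9,658***\t5,949***
\t(1,556)\t(1,347)\t(1,423)"

tab <- read.delim(text = raw, sep = "\t", header = FALSE,
                  stringsAsFactors = FALSE)

# Hypothetical helper: drop stars, commas and parentheses, keep digits, dot and sign
to_number <- function(x) as.numeric(gsub("[^0-9.-]", "", x))

# Keep the rows that carry a variable name (estimates, not standard errors)
keep <- !(tab$V1 %in% c("", "VARIABLES", NA))
coef_rows <- tab[keep, ]

# One numeric column per model, one row per term
est <- sapply(coef_rows[, -1], to_number)
rownames(est) <- coef_rows$V1
print(est)
```

The standard errors sit in the rows with an empty first column, so the same `to_number()` call on `tab[!keep, ]` (minus the header rows) would recover them.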
Creating a graph and exporting it as EPS (Encapsulated PostScript). High res image that can be used in LaTeX and Word =).
A neat Stata plot
    auto <- foreign::read.dta("auto.dta")
    ans1 <- lm(price~rep78+factor(foreign)+mpg, auto)
    ans2 <- lm(price~rep78+mpg, auto)
    ans3 <- lm(price~rep78+factor(foreign), auto)

    # texreg::texreg(list(ans1, ans2, ans3), table=FALSE) # if you want to use LaTeX
    texreg::htmlreg(list(ans1, ans2, ans3), table=FALSE)
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 10586.48*** | 9657.75*** | 5948.78*** |
| | (1555.74) | (1346.54) | (1422.63) |
| rep78 | 432.80 | 666.96 | 76.29 |
| | (394.71) | (342.36) | (449.27) |
| factor(foreign)Foreign | 1023.21 | | -205.61 |
| | (866.09) | | (959.55) |
| mpg | -292.43*** | -271.64*** | |
| | (60.23) | (57.77) | |
| R2 | 0.27 | 0.25 | 0.00 |
| Adj. R2 | 0.23 | 0.23 | -0.03 |
| Num. obs. | 69 | 69 | 69 |
| RMSE | 2550.90 | 2558.54 | 2955.15 |

***p < 0.001, **p < 0.01, *p < 0.05
    cols <- auto$rep78
    cols[is.na(cols)] <- 0
    vran <- range(cols, na.rm = TRUE)
    cols <- (cols-vran[1])/(vran[2] - vran[1])
    cols <- rgb(colorRamp(blues9)(cols), maxColorValue = 255)
    plot(price~mpg, auto, pch=19, col=cols)
A neat plot in R
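The R side of the "export as EPS/PDF" tip can be sketched with the base graphics devices listed in ?grDevices::Devices. This is only an illustration: it uses the built-in mtcars data instead of auto.dta so it runs anywhere, and the file names and the `draw_plot()` helper are placeholders, not something from the original slides.

```r
# Write the same scatterplot to an EPS and a PDF file (base R only).
out_dir  <- tempdir()                         # assumption: any writable folder works
eps_file <- file.path(out_dir, "neat-plot.eps")
pdf_file <- file.path(out_dir, "neat-plot.pdf")

# Hypothetical helper so both devices draw exactly the same figure
draw_plot <- function() {
  plot(mpg ~ wt, data = mtcars, pch = 19, col = "steelblue")
}

# EPS: setEPS() sets the postscript() defaults required for a valid EPS file
setEPS()
postscript(eps_file, width = 6, height = 4)
draw_plot()
dev.off()

# PDF
pdf(pdf_file, width = 6, height = 4)
draw_plot()
dev.off()

file.exists(c(eps_file, pdf_file))
```

Both files are vector graphics, so they scale without losing quality when included in LaTeX or Word.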
JAMA On the “Statistical Analysis Subsection”
“[I]nclude the statistical software used to perform the analysis, including the version and manufacturer, along with any extension packages […]” (see here)
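In R, the version and package information that this kind of requirement asks for can be pulled out programmatically. A rough sketch with base R only follows; the package list `pkgs` and the wording of `statement` are placeholders to adapt, not a format required by any journal.

```r
# Collect the software facts a statistical analysis subsection usually reports.
r_version <- R.version.string

# Assumption: replace with the extension packages you actually used
pkgs <- c("stats", "graphics")
pkg_versions <- vapply(pkgs, function(p) as.character(packageVersion(p)),
                       character(1))

statement <- sprintf(
  "Analyses were run in %s with packages %s.",
  r_version,
  paste(sprintf("%s %s", names(pkg_versions), pkg_versions), collapse = ", ")
)
cat(statement, "\n")

# Full details (platform, attached packages) for the supplemental material
print(sessionInfo())
```

Calling `citation()` (or `citation("pkgname")`) additionally gives the reference to cite for R or for a given package.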
Prevention Science On the “Ethical Responsibilities of Authors”
“Upon request authors should be prepared to send relevant documentation or data in order to verify the validity of the results. This could be in the form of raw data, samples, records, etc.” (see here)
Health Psychology On “Computer Code”
“We request that runnable source code be included as supplemental material to the article”
Annals of Behavioral Medicine On “Ethical Responsibilities of Authors”
“Upon request authors should be prepared to send relevant documentation or data in order to verify the validity of the results. This could be in the form of raw data, samples, records, etc.” (see here)
Journal | Accepts LaTeX | EPS figures |
---|---|---|
JAMA | no :( | yes |
Prevention Science | yes | yes |
Health Psychology | yes* | yes |
Annals of Behavioral Medicine | ? | ? |
American Journal of Public Health | no? | yes |
(*) Accepts PDFs.
Thanks!
George G. Vega Yon
vegayon@usc.edu
www.its.caltech.edu/~gvegayon