Learning Objectives
- Know what packages are and how to install them from CRAN
- Understand why literate programming is useful
- Create and edit an RMarkdown file
- Know how to manipulate some common chunk options
Packages are bundles of code which extend the functionality of R.
Anyone can make an R package, and anyone can install anyone else’s R package (if they make it available). This is part of the beauty of open source, and using different R packages is essential to modern R workflows.
You can get packages from many different places, but we’ll focus on just the most common one: CRAN. CRAN is the Comprehensive R Archive Network, a global network of servers which make available for download a set of vetted R packages.
The next section is about RMarkdown, a package, so we’ll install that now.
To download and install a package from CRAN, call the install.packages command on a string with the name of the desired package. You will get output describing the installation progress.
install.packages("rmarkdown", repos="http://cran.rstudio.com/")
##
## The downloaded binary packages are in
## /var/folders/jr/6g1w83n911q9qh9rghb6x5080000gn/T//RtmpKefrE6/downloaded_packages
You may be asked to choose a mirror; the RStudio mirror is a good choice as it will pick the nearest mirror automatically. This will also download and install packages which RMarkdown depends on.
You only need to install a package once per machine, unless you need to update an already-installed package. Calling install.packages for an existing package reinstalls it from CRAN, which updates it whenever CRAN has a more recent version than the one on your machine.
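Since one install per machine is enough, a script can check whether a package is already present before installing it. Here is a minimal sketch; the helper name install_if_missing is made up for this example and is not part of R or RMarkdown, while requireNamespace and install.packages are standard R functions.

```r
# Hypothetical helper (not part of R): install a package only if it is missing.
install_if_missing <- function(pkg, repos = "http://cran.rstudio.com/") {
  # requireNamespace() returns TRUE when the package can be loaded.
  already_there <- requireNamespace(pkg, quietly = TRUE)
  if (!already_there) {
    install.packages(pkg, repos = repos)
  }
  invisible(already_there)
}

# "stats" ships with R, so this call finds it and installs nothing.
was_present <- install_if_missing("stats")

# For the package used in this lesson you would call:
# install_if_missing("rmarkdown")
```

Note that this only skips the download when the package exists; it does not check whether a newer version is available on CRAN.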
You can view all of the installed packages using the installed.packages command. This will output a lot of information for each package, so if you only want a list of the installed package names, you can specify that you want the “Package” column. I like to look at this as a vector.
as.vector(installed.packages()[,"Package"])
## [1] "acepack" "AnnotationDbi" "assertthat" "backports"
## [5] "base" "base64enc" "BH" "Biobase"
## [9] "BiocGenerics" "BiocInstaller" "bitops" "boot"
## [13] "brew" "C50" "car" "caret"
## [17] "caTools" "checkmate" "class" "cluster"
## [21] "codetools" "colorspace" "combinat" "commonmark"
## [25] "compiler" "cowplot" "crayon" "curl"
## [29] "data.table" "datasets" "DBI" "deldir"
## [33] "desc" "devtools" "DiagrammeR" "DiagrammeRsvg"
## [37] "dichromat" "digest" "doParallel" "dplyr"
## [41] "dtw" "dynamicTreeCut" "e1071" "evaluate"
## [45] "fastcluster" "foreach" "foreign" "formatR"
## [49] "Formula" "gdata" "genetics" "geosphere"
## [53] "getopt" "ggforce" "ggmap" "ggplot2"
## [57] "ggraph" "ggrepel" "git2r" "GO.db"
## [61] "graphics" "grDevices" "grid" "gridBase"
## [65] "gridExtra" "gridSVG" "gtable" "gtools"
## [69] "highr" "Hmisc" "htmlTable" "htmltools"
## [73] "htmlwidgets" "httpuv" "httr" "igraph"
## [77] "impute" "influenceR" "infotheo" "IRanges"
## [81] "irlba" "iterators" "jpeg" "jsonlite"
## [85] "KernSmooth" "knitr" "labeling" "lattice"
## [89] "latticeExtra" "lazyeval" "lme4" "magrittr"
## [93] "mapproj" "maps" "maptools" "markdown"
## [97] "MASS" "Matrix" "MatrixModels" "matrixStats"
## [101] "memoise" "methods" "mgcv" "mime"
## [105] "minerva" "minet" "minqa" "ModelMetrics"
## [109] "munsell" "mvtnorm" "NetSwan" "nettools"
## [113] "nlme" "nloptr" "NMF" "nnet"
## [117] "openssl" "optparse" "pander" "parallel"
## [121] "partykit" "pbkrtest" "permute" "pkgmaker"
## [125] "plogr" "plotROC" "plyr" "png"
## [129] "pracma" "preprocessCore" "proto" "proxy"
## [133] "quadprog" "quantreg" "R6" "randomForest"
## [137] "RColorBrewer" "Rcpp" "RcppEigen" "registry"
## [141] "reshape2" "rgexf" "RgoogleMaps" "rjson"
## [145] "RJSONIO" "rmarkdown" "RNeo4j" "rngtools"
## [149] "Rook" "rootSolve" "roxygen2" "rpart"
## [153] "rprojroot" "RSQLite" "rstudioapi" "rsvg"
## [157] "S4Vectors" "scales" "shiny" "sourcetools"
## [161] "sp" "SparseM" "spatial" "splines"
## [165] "sportcolors" "stats" "stats4" "stringi"
## [169] "stringr" "survival" "tcltk" "teamcolors"
## [173] "tibble" "tidyr" "tools" "tweenr"
## [177] "udunits2" "units" "utils" "V8"
## [181] "vegan" "viridis" "visNetwork" "wesanderson"
## [185] "WGCNA" "whisker" "withr" "XML"
## [189] "xml2" "xtable" "yaml"
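If you want a little more than the names, the same matrix also has a “Version” column. The sketch below keeps both columns in a small data frame; the variable names pkgs and pkg_versions are just choices made for this example.

```r
# installed.packages() returns a character matrix with one row per package.
pkgs <- installed.packages()

# Keep just the name and version columns in a small data frame.
pkg_versions <- data.frame(
  package = pkgs[, "Package"],
  version = pkgs[, "Version"],
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Show the first few rows.
head(pkg_versions)

# Check for one specific package by name.
"rmarkdown" %in% pkg_versions$package
```

The last line gives TRUE or FALSE depending on whether rmarkdown is installed on your machine.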
Most packages need to be loaded into the current environment to be accessible. RMarkdown is specially integrated in RStudio in a way that avoids this, but in general we load packages with the library command:
library(rmarkdown) # notice the lack of quotes
This will come up again later in the lesson on dplyr, an external package that does need to be loaded.
You can also view the packages that you have loaded into your workspace.
(.packages())
## [1] "rmarkdown" "stats" "graphics" "grDevices" "utils" "datasets"
## [7] "methods" "base"
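Loading with library is not the only way to reach a package's functions. If you need just one function, you can call it as package::function without attaching the package. A small sketch using tools, a package that ships with R; the file name is invented for the example.

```r
# tools ships with R but is not attached in a fresh session,
# so its functions are reached with the package::function form.
ext <- tools::file_ext("workshop_notes.Rmd")
ext

# See whether tools is attached. Unlike the call above, the answer depends
# on what has been loaded with library() in the current session.
"tools" %in% (.packages())
```

Here ext holds the file extension, "Rmd". The :: form is handy for one-off calls, while library is more convenient when you use many functions from the same package.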
R Markdown is a special file format which allows us to combine text, code, and the output of that code in a single file. This combination of explanation, code, and results is called literate programming and is a powerful way to share research and data explorations.
RMarkdown is an extended version of the Markdown (.md) file format, which is an easy way to make nicely formatted text documents without endlessly tinkering with the formatting (as you might with LaTeX). The software community loves Markdown because in addition to being straightforward, it has good support for formatting code, which can be a pain in other formats.
RMarkdown takes this a step further by allowing you to run the code in your document, and having the output appear below the code that made it.
If you’ve used IPython/Jupyter notebooks before, R Markdown will feel similar. All the lessons in this workshop were created with R Markdown!
RStudio makes it easy to create a new RMarkdown file, and it even starts with a demo file that shows off most of the basic features of the Rmd format. In the upper-left corner, click the “new file” icon and select RMarkdown. A window should appear to help you configure this file initially. There are a lot of options (R Markdown can do so much!), but for now, make sure your name is in the “Author” field, and change the “Title” to be something like “AARUG Workshop”.
Before we delve into what each of these pieces means, let’s “knit” the document so we can see what kind of output RMarkdown produces. Above the file, press the Knit button, the one that looks like a ball of yarn.
You should see a new pane open in RStudio that shows R “knitting” the document, and when it’s done, a pop-up will appear showing the knitted output.
This new output is displayed as an HTML file; look in the file browser pane and you’ll see a .html file next to your .Rmd file (you may need to refresh), because RStudio automatically saved this output when the document finished knitting.
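The Knit button is a convenience; the same step can also be run from the R console with the render function from the rmarkdown package. The sketch below writes a tiny R Markdown file to a temporary folder and knits it. The file name and its contents are invented for the example, and the knit is skipped if rmarkdown or pandoc (the converter that RStudio bundles) is not available.

```r
# Write a tiny R Markdown file to a temporary folder (the file name is made up).
rmd_path <- file.path(tempdir(), "knit_demo.Rmd")
writeLines(
  c(
    "---",
    "title: \"Knit demo\"",
    "output: html_document",
    "---",
    "",
    "Some text to knit."
  ),
  rmd_path
)

# Knit only if rmarkdown and pandoc are both available on this machine.
can_knit <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_knit) {
  # render() returns the path of the output file it created.
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(html_path)
}
```

As with the button, the .html file is written next to the .Rmd file unless you ask for a different location.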
Let’s look at the individual pieces in this document:
This is the section at the top, with three dashes before and after. It lists some metadata about the document. The title, date, and author form the start of the output document, and the output: line instructs the knitting process to generate an HTML file.
You can enlarge text by preceding it with one or more pound signs (#). This is mainly useful for organizing a document into sections. The more pound signs, the smaller the text, so when you make a sub-section you should add at least one more pound sign than was used in the parent section’s title.
You can make text clickable by including a link to a different website. An example can be found above, where we included a link to CRAN.
There are two parts to creating linked text. The first part is including the text you want to see, surrounded by square brackets: [CRAN]. Immediately after that, add the link surrounded by parentheses: (https://cran.r-project.org). The final product looks like [CRAN](https://cran.r-project.org).
The double-asterisks surrounding the word “Knit” in the second paragraph cause that piece of text to be bold. This phrase can be multiple words, but should not have spaces immediately on the inside of the asterisks. You can make text italic by similarly wrapping it in underscores (_) or single asterisks.
This is the real meat of the document! An R Markdown code chunk is a section which starts and ends with triple-backticks (`, not '). After the initial set, the curly-bracketed section which starts with {r is what causes the chunk to be run as R code; without this piece, the section would get formatted like code, but would not be executed when knitting. The phrase after the r is the chunk name. Chunks do not need to be named, but no two chunks can have the same name. Naming chunks can help keep code organized and make it easier to track down the source of errors when they occur.
As the second default section discusses, we can hide the code in a code chunk by placing a comma after our chunk name and setting the option echo=FALSE. The code will still execute, and its output will be inserted in the knitted document, but the code itself will not be shown.
Similarly, you can set eval=FALSE to avoid running a code chunk.
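To see all of these pieces together, the sketch below assembles a small R Markdown document from R and prints it. Everything in it is illustrative: the title, author, section text, and chunk names (summary-shown, plot-hidden, not-run) are invented for this example, and the triple-backticks are built with strrep only so that they can sit inside an R string without confusion.

```r
# Three backticks, built here so they can be pasted into the lines below.
fence <- strrep("`", 3)

rmd_lines <- c(
  # Header: metadata between two lines of three dashes.
  "---",
  "title: \"AARUG Workshop\"",
  "author: \"Your Name\"",
  "output: html_document",
  "---",
  "",
  # Headings: more pound signs means smaller text.
  "# A top-level section",
  "",
  "## A smaller sub-section",
  "",
  # A link, then bold and italic text.
  "Packages come from [CRAN](https://cran.r-project.org).",
  "This is **bold** and this is _italic_.",
  "",
  # A named chunk: code and output both appear.
  paste0(fence, "{r summary-shown}"),
  "summary(cars)",
  fence,
  "",
  # echo=FALSE: the code runs and the plot appears, but the code is hidden.
  paste0(fence, "{r plot-hidden, echo=FALSE}"),
  "plot(cars)",
  fence,
  "",
  # eval=FALSE: the code is displayed but never run.
  paste0(fence, "{r not-run, eval=FALSE}"),
  "install.packages(\"rmarkdown\")",
  fence
)

# Save the document (the file name is made up) and print it as it would
# appear in the RStudio editor.
demo_path <- file.path(tempdir(), "pieces_demo.Rmd")
writeLines(rmd_lines, demo_path)
cat(readLines(demo_path), sep = "\n")
```

Copying the printed text into a new .Rmd file and knitting it is a quick way to check what each option does.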
There’s a lot more to RMarkdown than just this; as the demo document shows, you can visit http://rmarkdown.rstudio.com to learn more.
We’ll be using R Markdown for the rest of this workshop to keep a running log of what we’re learning. This will allow you to walk away with a knit document which has not only the code commands you’ve learned to use, but the output of those commands and some explanatory text. That’s literate programming!