Monash Data Fluency: Reproducible Research in R (with Quarto)




Workshop material

We will be working through material from the Carpentries incubator. The Data Fluency program at Monash University is almost entirely based on the Carpentries teaching philosophy. This new material caught our attention because it captures current thinking about good scientific practice and practical ways to achieve it.

We will use this example GitHub repository:

Some slides to put things in context:




Setup

You can use RStudio on your own laptop (preferred) or use Posit Cloud. If using RStudio on your own laptop, follow the software installation instructions below. If using Posit Cloud, please still do steps 3 and 4 below.

If you have a laptop supplied by Monash University, you may need to submit a service request to eSolutions before you are allowed to install software. In the meantime, use Posit Cloud.

1. Install R

2. Install RStudio

3. In RStudio, run this R code to install some packages:

install.packages(
    c("quarto","tidyverse","BayesFactor","patchwork","renv","usethis"))

4. Create an account on GitHub if you don’t already have one.

5. Install git:

Windows users should install Git for Windows (includes Git Credential Manager*).

Mac users should install Apple’s Xcode command line tools, which include git. This is also worthwhile because it will let you install R packages that need compilation. Open the Terminal app, type the command below, and press return:

xcode-select --install

(Mac users can also use the binary installer here: https://git-scm.com/download/mac. It is slightly out of date but should be fine, and this method will be faster.)

Mac users should also install Git Credential Manager*: download page, installation instructions. Download and open the appropriate “.pkg” file for your machine: gcm-osx-arm64-…pkg if you have an Apple silicon Mac (M1, M2 or later), gcm-osx-x64-…pkg otherwise.

 * Git Credential Manager should let us push (upload) changes to repositories on GitHub using the https method. Otherwise we would need to set up an ssh key to push changes, which is beyond the scope of this workshop.

If you run into problems installing any of these, you can use Posit Cloud for the workshop.
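To confirm that the installation steps above worked, a quick check can be run in the RStudio console. This is a minimal sketch using only base R: the package list is the one from step 3, and the variable names (`installed`, `git_path`, `not_installed`) are illustrative.

```r
# Packages from step 3 of the setup instructions.
pkgs <- c("quarto", "tidyverse", "BayesFactor", "patchwork", "renv", "usethis")

# TRUE for each package that R can find in its library.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Path to the git executable, or "" if git is not on the PATH.
git_path <- unname(Sys.which("git"))
print(git_path)

# List anything that still needs installing.
not_installed <- pkgs[!installed]
if (length(not_installed) > 0) {
  message("Still to install: ", paste(not_installed, collapse = ", "))
}
```

If a package shows FALSE, re-run the `install.packages()` call from step 3. If `git_path` is empty, git is either not installed or not visible to RStudio.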

Authentication without Git Credential Manager

There is an alternative method to push changes to GitHub using tokens.

This method will also work from Posit Cloud.

Log in to github.com
→ Click the top right corner to open your account menu
→ Settings
→ Developer settings
→ Personal access tokens
→ Fine-grained personal access tokens (beta)
→ Generate new token

Select:

  • Repository access:
    • only select repositories → the repository we are working on
  • Repository permissions:
    • commit statuses → access level: read & write
    • contents → access level: read & write

In the Terminal pane in RStudio, tell git who you are by running the following commands (filling in your details):

git config --global user.email "your.email@monash.edu"
git config --global user.name "your_github_username"

When pushing from RStudio, enter your GitHub username and then paste the token when prompted for a password.
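To check that the `git config` commands above took effect, the stored values can be read back from the R console. This is a sketch under the assumption that git is on the PATH; `git_config_get` is a hypothetical helper name, not part of any package.

```r
# Read one value from git's global configuration.
# Returns NA if git is missing or the value has not been set.
git_config_get <- function(key) {
  if (!nzchar(Sys.which("git"))) return(NA_character_)
  out <- suppressWarnings(
    system2("git", c("config", "--global", "--get", key),
            stdout = TRUE, stderr = FALSE)
  )
  if (length(out) == 0) NA_character_ else out[1]
}

print(git_config_get("user.name"))
print(git_config_get("user.email"))
```

An `NA` result suggests the corresponding `git config --global` command has not been run yet. The usethis package installed during setup also offers `usethis::use_git_config()` for setting these values from R, if you prefer not to use the Terminal.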

Optional further software

RStudio comes bundled with Quarto, but you can install Quarto separately if you want to use it from the command line. LaTeX, which is used to create PDFs, is a separate install; we won’t need it in the workshop.




Bits and pieces

Quarto reference

Quarto markdown overview

Quarto user guides ← topics give a good idea of what is possible!

Quarto cross-references ← a useful but hard to discover feature


More information on the BibTeX format

More information about writing equations

An online equation editor


Monash themed Quarto templates by Rob Hyndman

Example of a peer-reviewed “knitted” publication

Example of a GitHub repository associated with a publication

Example of workshop material (in bookdown)
(Jointly developed by University of Queensland and Monash, check the commit log!)


Advice on how to name files



_quarto.yml file required for this workshop

project:
    execute-dir: "project"



_quarto.yml suggested further settings

project:
    execute-dir: "project"     # Working directory always the same.
    output-dir: "output"       # HTML files go in this directory.
#    render:                    # Specify order of rendering.
#    - "report/preprocess.qmd"
#    - "report/my-report.qmd"

format:
    html:
        toc: true                # Table of contents.
        code-fold: true          # Initially hide code.
        fontsize: "10pt"         # Smaller font.
        embed-resources: true    # Ensure HTML files are fully self-contained.

After editing _quarto.yml, you may need to close and re-open your project in RStudio.

If you specify the order of rendering, the whole project can be run in the correct order in a single step. In RStudio this can be done in the “Build” pane with the “Render Project” button. On the command line it can be done with:

quarto render
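The same project render can also be started from the R console using the quarto package installed during setup. The sketch below wraps `quarto::quarto_render()` in a small helper; `render_project` is a hypothetical name, and the helper assumes it is called with the project's top-level directory (the one containing _quarto.yml).

```r
# Render every document in a Quarto project. If `render:` is set in
# _quarto.yml, Quarto follows the order listed there.
render_project <- function(dir = ".") {
  if (!requireNamespace("quarto", quietly = TRUE)) {
    stop("Install the quarto package first: install.packages(\"quarto\")")
  }
  quarto::quarto_render(dir)
}

# Run from the project's top-level directory:
# render_project()
```

The call is left commented out so that pasting the block into the console does not start a render by accident.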



Nice looking tables

The default appearance of tables produced by code chunks is a bit plain. Try these instead, and look at their documentation for how to customize the output further.

knitr::kable(mtcars)
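|                     |  mpg | cyl |  disp |  hp | drat |    wt |  qsec | vs | am | gear | carb |
|:--------------------|-----:|----:|------:|----:|-----:|------:|------:|---:|---:|-----:|-----:|
| Mazda RX4           | 21.0 |   6 | 160.0 | 110 | 3.90 | 2.620 | 16.46 |  0 |  1 |    4 |    4 |
| Mazda RX4 Wag       | 21.0 |   6 | 160.0 | 110 | 3.90 | 2.875 | 17.02 |  0 |  1 |    4 |    4 |
| Datsun 710          | 22.8 |   4 | 108.0 |  93 | 3.85 | 2.320 | 18.61 |  1 |  1 |    4 |    1 |
| Hornet 4 Drive      | 21.4 |   6 | 258.0 | 110 | 3.08 | 3.215 | 19.44 |  1 |  0 |    3 |    1 |
| Hornet Sportabout   | 18.7 |   8 | 360.0 | 175 | 3.15 | 3.440 | 17.02 |  0 |  0 |    3 |    2 |
| Valiant             | 18.1 |   6 | 225.0 | 105 | 2.76 | 3.460 | 20.22 |  1 |  0 |    3 |    1 |
| Duster 360          | 14.3 |   8 | 360.0 | 245 | 3.21 | 3.570 | 15.84 |  0 |  0 |    3 |    4 |
| Merc 240D           | 24.4 |   4 | 146.7 |  62 | 3.69 | 3.190 | 20.00 |  1 |  0 |    4 |    2 |
| Merc 230            | 22.8 |   4 | 140.8 |  95 | 3.92 | 3.150 | 22.90 |  1 |  0 |    4 |    2 |
| Merc 280            | 19.2 |   6 | 167.6 | 123 | 3.92 | 3.440 | 18.30 |  1 |  0 |    4 |    4 |
| Merc 280C           | 17.8 |   6 | 167.6 | 123 | 3.92 | 3.440 | 18.90 |  1 |  0 |    4 |    4 |
| Merc 450SE          | 16.4 |   8 | 275.8 | 180 | 3.07 | 4.070 | 17.40 |  0 |  0 |    3 |    3 |
| Merc 450SL          | 17.3 |   8 | 275.8 | 180 | 3.07 | 3.730 | 17.60 |  0 |  0 |    3 |    3 |
| Merc 450SLC         | 15.2 |   8 | 275.8 | 180 | 3.07 | 3.780 | 18.00 |  0 |  0 |    3 |    3 |
| Cadillac Fleetwood  | 10.4 |   8 | 472.0 | 205 | 2.93 | 5.250 | 17.98 |  0 |  0 |    3 |    4 |
| Lincoln Continental | 10.4 |   8 | 460.0 | 215 | 3.00 | 5.424 | 17.82 |  0 |  0 |    3 |    4 |
| Chrysler Imperial   | 14.7 |   8 | 440.0 | 230 | 3.23 | 5.345 | 17.42 |  0 |  0 |    3 |    4 |
| Fiat 128            | 32.4 |   4 |  78.7 |  66 | 4.08 | 2.200 | 19.47 |  1 |  1 |    4 |    1 |
| Honda Civic         | 30.4 |   4 |  75.7 |  52 | 4.93 | 1.615 | 18.52 |  1 |  1 |    4 |    2 |
| Toyota Corolla      | 33.9 |   4 |  71.1 |  65 | 4.22 | 1.835 | 19.90 |  1 |  1 |    4 |    1 |
| Toyota Corona       | 21.5 |   4 | 120.1 |  97 | 3.70 | 2.465 | 20.01 |  1 |  0 |    3 |    1 |
| Dodge Challenger    | 15.5 |   8 | 318.0 | 150 | 2.76 | 3.520 | 16.87 |  0 |  0 |    3 |    2 |
| AMC Javelin         | 15.2 |   8 | 304.0 | 150 | 3.15 | 3.435 | 17.30 |  0 |  0 |    3 |    2 |
| Camaro Z28          | 13.3 |   8 | 350.0 | 245 | 3.73 | 3.840 | 15.41 |  0 |  0 |    3 |    4 |
| Pontiac Firebird    | 19.2 |   8 | 400.0 | 175 | 3.08 | 3.845 | 17.05 |  0 |  0 |    3 |    2 |
| Fiat X1-9           | 27.3 |   4 |  79.0 |  66 | 4.08 | 1.935 | 18.90 |  1 |  1 |    4 |    1 |
| Porsche 914-2       | 26.0 |   4 | 120.3 |  91 | 4.43 | 2.140 | 16.70 |  0 |  1 |    5 |    2 |
| Lotus Europa        | 30.4 |   4 |  95.1 | 113 | 3.77 | 1.513 | 16.90 |  1 |  1 |    5 |    2 |
| Ford Pantera L      | 15.8 |   8 | 351.0 | 264 | 4.22 | 3.170 | 14.50 |  0 |  1 |    5 |    4 |
| Ferrari Dino        | 19.7 |   6 | 145.0 | 175 | 3.62 | 2.770 | 15.50 |  0 |  1 |    5 |    6 |
| Maserati Bora       | 15.0 |   8 | 301.0 | 335 | 3.54 | 3.570 | 14.60 |  0 |  1 |    5 |    8 |
| Volvo 142E          | 21.4 |   4 | 121.0 | 109 | 4.11 | 2.780 | 18.60 |  1 |  1 |    4 |    2 |
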
DT::datatable(mtcars)
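
As a starting point for customizing the output, the sketch below passes a few extra arguments to `knitr::kable()`. It assumes knitr is installed (it normally is alongside RStudio and Quarto); `digits` and `caption` are real `kable()` arguments, while the caption text and the choice of rows are made up for illustration.

```r
# Show only the first six rows, round numbers to one decimal place,
# and add a caption above the table.
tbl <- knitr::kable(
  head(mtcars),
  digits = 1,
  caption = "First six cars in the mtcars dataset"
)
tbl
```

Inside a Quarto document, a chunk like this produces a formatted table in the rendered HTML. Note that DT is not in the package list from the setup instructions, so `install.packages("DT")` may be needed before `DT::datatable()` will run.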