Reproducible Research in R

  • Level: beginner-intermediate
  • Duration: 6 hours
  • Student numbers: 25-30

Welcome to the Reproducible Research in R (RRR) workshop. The main aim of this workshop is to set you on the right path of making your research more reproducible and shariable. Reproducible research means that future you and anyone else will be able to pick up your analysis and reproduce the same results, including figures and tables. Reproducible research also implies well-documented research, your code should be well commented and the reasons behind functions and methods should be explained thoroughly throughout the analysis. The communication aspect should not be an afterthought, it should be recorded with your analysis as you are going through it. Rmarkdown is a way of literate programing that keeps code, words and sentences together. The ability to easily collaborate and share your analysis goes hand-in-hand with good record-keeping and reproducibility. We are going to repurpose the git version control tool and leverage the GitHub remote hosting provider for managing and sharing our work. Git + GitHub will provide a very powerful resource for global collaboration and exposure of your work. In this workshop, we are going to version control our work and push it to github, which can then be accessed by your collaborators and supervisors. Git + GitHub should become an integral part of your workflow.

The RRR course given by the Monash Bioinformatics Platform for the Monash Data Fluency initiative. Our teaching style is based on the style of The Carpentries.

Learning outcomes

Attendees will learn how to:

  • write vanilla markdown, Rmarkdown and bookdown documents
  • use knitr, rmarkdown and bookdown R packages to build various document types including PDF, HTML and DOCX
  • create reproducible Rmarkdown documents leveraging .Rproj and .RData
  • include in-line citation and full references list in to Rmarkdown document using .bib files
  • create presentations from Rmarkdown documents that include R code
  • work with git version control tool
  • create reproducible and “backed up” analysis via remote repositories (e.g github)

Workshop description

This workshop is an introduction to writing and communicating research using Rmarkdown. Rmarkdown is an easy way to create documents that include your R code and its output, such figure and tables. Rmarkdown is a single document that can be “knitted” and shared as various document types such as PDF and HTML. Rmarkdown supports scientific writing such as use of citations and figure cross-referencing. Rmarkdown can also be used to create presentations that include your R code and its output. We will also cover bookdown, which is an extension to Rmarkdown that allows the creation of larger documents, such as books with multiple chapters.

In this workshop, we will also cover git version control tool which can help with organising and “checkpointing” Rmarkdown documents, associated R code and data. Git is not a backup system, but it does allow one to retrieve older versions of your work. Git, together with remote repositories like GitHub can provide a centralised location for your research. Together Rmarkdown, git and github can enable reproducible research that is visible and accessible to the greater public, including supervisors and management.

Prerequisite

This is an introductory level workshop, however some prior exposure to R and familiarity with RStudio is assumed. It is highly recommended that you read this article in full, if you have to prioritise, read at least these section (1,2,6,10). Additionally, it is highly recommended that you create an account at GitHub and remember your password.

Keywords

  • R
  • Rmarkdown
  • communication
  • reproducibility
  • git and github

Schedule

10:00-10:30am (30 minutes) Welcome and warm up

  \( ゚ヮ゚)/

10:30-12:00pm (1.5 hours) Rmarkdown

  • Introduction (30 minutes)
  • Vanilla markdown (30 minutes)
  • Rmarkdown (30 minutes)

12:00-1:00pm (1 hour) lunch


                     __
                    /
                 .-/-.
                 |'-'|
                 |   |
                 |   |   .-""""-.
                 \___/  /' .  '. \   \|/\//
                       (`-..:...-')  |`""`|
                        ;-......-;   |    |
                         '------'    \____/

1:00-3:00pm (2 hours) More Rmarkdown

  • Git and GitHub markdown (40 minutes)
  • More Rmarkdown (40 minutes)
  • YAML header (40 minutes)

3:00-3:15pm (15 minutes) Tea break

                     (  )  (   )  )
                     ) (   )  (  (
                     ( )  (    ) )
                     _____________
                    <_____________> ___
                    |             |/ _ \
                    |               | | |
                    |               |_| |
                 ___|             |\___/
                /    \___________/    \
                \_____________________/

3:15-4:45pm (1.5 hours) Even more Rmarkdown

  • Bibliographies (30 minutes)
  • Bookdown (30 minutes)
  • Miscellaneous (30 minutes)

4:45-5:00pm (15 minutes) Warm down


                                          (

                *                           )   *
                              )     *      (
                    )        (                   (
                   (          )     (             )
                    )    *           )        )  (
                   (                (        (      *
                    )          H     )        )
                              [ ]            (
                       (  *   |-|       *     )    (
                 *      )     |_|        .          )
                       (      | |    .
                 )           /   \     .    ' .        *
                (           |_____|  '  .    .
                 )          | ___ |  \~~~/  ' .   (
                        *   | \ / |   \_/  \~~~/   )
                            | _Y_ |    |    \_/   (
                *           |-----|  __|__   |      *
                            `-----`        __|__