6 Next steps
6.1 Deepen your understanding
Our number one recommendation is to read the book “R for Data Science” by Garrett Grolemund and Hadley Wickham.
Statistical tasks such as model fitting, hypothesis testing, confidence interval calculation, and prediction are also a large part of R, and a part we haven’t demonstrated fully today. Linear models, and the linear model formula syntax ~, are core to much of what R has to offer statistically. Many statistical techniques take linear models as their starting point, including limma for differential gene expression, glm for logistic regression and other generalized linear models, survival analysis with coxph, and mixed models to characterize variation within populations.
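As a brief illustration of the formula syntax, here is a minimal sketch using the mtcars data set that ships with R. The data set, the variables, and the object names (fit, fit2, fit_logistic, new_cars) are chosen purely for illustration and are not part of the workshop material.

```r
# Fit a linear model: fuel efficiency (mpg) as a function of weight (wt).
# Read the formula "mpg ~ wt" as "mpg is modelled by wt".
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)    # coefficient estimates, standard errors, t-tests
confint(fit)    # 95% confidence intervals for the coefficients

# Predict mpg for two hypothetical cars, with prediction intervals.
new_cars <- data.frame(wt = c(2.5, 3.5))
predict(fit, newdata = new_cars, interval = "prediction")

# Hypothesis test comparing nested models:
# does adding horsepower (hp) improve on weight alone?
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
anova(fit, fit2)

# The same formula syntax drives glm(). Here: logistic regression of
# transmission type (am: 0 = automatic, 1 = manual) on weight.
fit_logistic <- glm(am ~ wt, data = mtcars, family = binomial)
predict(fit_logistic, newdata = new_cars, type = "response")
```

The same pattern of a formula plus a data frame carries over to many modelling functions in other packages, which is why it is worth getting comfortable with lm first.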
“Statistical Models in S” by J.M. Chambers and T.J. Hastie is the primary reference for this, although there are some small differences between R and its predecessor S.
“An Introduction to Statistical Learning” by G. James, D. Witten, T. Hastie and R. Tibshirani can be seen as further development of the ideas in “Statistical Models in S”, and is available online. It has more of a machine learning than a statistics flavour to it (the distinction is fuzzy!).
“Modern Applied Statistics with S” by W.N. Venables and B.D. Ripley is a well-respected reference covering R and S.
“Linear Models with R” and “Extending the Linear Model with R” by J. Faraway cover linear models, with many practical examples.
6.2 Expand your vocabulary
Have a look at these cheat sheets to see what is possible with R.
- RStudio’s collection of cheat sheets covers newer packages in R.
- An old-school cheat sheet for dinosaurs and people wishing to go deeper.
- A Bioconductor cheat sheet for biological data.
The R Manuals are the place to look if you need a precise definition of how R behaves.
6.3 Join the community
Join the Data Fluency community at Monash.
- Mailing list for workshop and event announcements.
- Slack for discussion.
- Monthly seminars on Data Science topics.
- Drop-in sessions on Friday afternoon.
Meetups in Melbourne:
The Carpentries run intensive two-day workshops on scientific computing and data science topics worldwide. The style of the present workshop is very much based on theirs. For bioinformatics, COMBINE is an Australian organization for students and early career researchers, and runs Carpentries workshops and similar events.