We already saw some of R’s built-in plotting facilities with the function plot. A more recent and much more powerful plotting library is ggplot2. ggplot2 is another mini-language within R, a language for creating plots. It implements ideas from a book called [“The Grammar of Graphics” [url https://www.amazon.com/Grammar-Graphics-Statistics-Computing/dp/0387245448]]. The syntax can be a little strange, but there are plenty of examples in the online documentation.
If ggplot2 isn’t already installed, we need to install it first.
# install.packages("ggplot2")
# or
# install.packages("tidyverse")
ggplot2 is part of the Tidyverse, so loading the tidyverse package will load ggplot2.
library(tidyverse)
## ── Attaching packages ────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 2.2.1 ✔ purrr 0.2.4
## ✔ tibble 1.3.4 ✔ dplyr 0.7.4
## ✔ tidyr 0.7.2 ✔ stringr 1.2.0
## ✔ readr 1.1.1 ✔ forcats 0.2.0
## ── Conflicts ───────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
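To produce a plot with ggplot2, we must give three things:
1. A data frame containing our data.
2. How the columns of the data frame map to graphical attributes such as position, color, and size (the “aesthetics”).
3. The graphical elements to draw (the layers, such as geom_point).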
We continue using the Gapminder data, which was loaded with:
gap <- read_csv("gapminder.csv")
## Parsed with column specification:
## cols(
## country = col_character(),
## continent = col_character(),
## year = col_integer(),
## lifeExp = col_double(),
## pop = col_integer(),
## gdpPercap = col_double()
## )
Let’s make our first ggplot.
ggplot(gap, aes(x=year, y=lifeExp)) +
    geom_point()
The calls to ggplot and aes set up the basics of how we are going to represent the various columns of the data frame. aes defines the “aesthetics”, which is how columns of the data frame map to graphical attributes such as x and y position, color, size, etc. We then literally add layers of graphics to this, using the + operator.
aes is another example of magic “non-standard evaluation”: arguments to aes may refer to columns of the data frame directly.
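As a minimal sketch of both points, the example below stores a plot in a variable, adds a layer to it, and uses an expression of a column inside aes. The data frame `toy` and its values are made up for illustration (they are not real Gapminder figures), and the names `p_base`, `p_points` and `p_log` are hypothetical.

```r
library(ggplot2)

# Made-up values for illustration only; NOT real Gapminder data.
toy <- data.frame(
  year      = c(1952, 1977, 2002, 1952, 1977, 2002),
  lifeExp   = c(40, 50, 60, 65, 72, 78),
  gdpPercap = c(500, 900, 1500, 8000, 15000, 30000),
  continent = c("A", "A", "A", "B", "B", "B")
)

# ggplot() and aes() only describe the mapping; no layer exists yet.
p_base <- ggplot(toy, aes(x = year, y = lifeExp))

# Adding a geom with + returns a new plot object with one more layer.
p_points <- p_base + geom_point()

# aes() arguments are evaluated within the data frame, so an expression
# built from column names works as well as a bare column name.
p_log <- ggplot(toy, aes(x = log10(gdpPercap), y = lifeExp)) +
  geom_point()

# Printing a plot object is what actually draws it.
print(p_points)
print(p_log)
```

Because a plot is an ordinary R object, a base such as `p_base` can be reused with different layers instead of retyping the ggplot and aes calls each time.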
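Further aesthetics can be used. Most aesthetics can be mapped to either a numeric or a categorical column, and an appropriate scale will be chosen automatically. (A few, such as shape, only make sense for categorical data.)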
ggplot(gap, aes(x=year, y=lifeExp, color=continent, size=pop)) +
    geom_point()
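To see the automatic choice of scale, the following sketch maps color first to a categorical column and then to a numeric one. The data frame `toy` again holds made-up values for illustration (not real Gapminder figures), and `p_cat` and `p_num` are hypothetical names.

```r
library(ggplot2)

# Made-up values for illustration only; NOT real Gapminder data.
toy <- data.frame(
  year      = c(1952, 1977, 2002, 1952, 1977, 2002),
  lifeExp   = c(40, 50, 60, 65, 72, 78),
  gdpPercap = c(500, 900, 1500, 8000, 15000, 30000),
  continent = c("A", "A", "A", "B", "B", "B")
)

# continent is a character column, so color gets a discrete scale:
# one distinct color per category, shown as a legend of swatches.
p_cat <- ggplot(toy, aes(x = year, y = lifeExp, color = continent)) +
  geom_point()

# gdpPercap is numeric, so color gets a continuous scale:
# a gradient, shown as a color bar.
p_num <- ggplot(toy, aes(x = year, y = lifeExp, color = gdpPercap)) +
  geom_point()

print(p_cat)
print(p_num)
```

The only difference between the two plots is the type of the column given to color; the scale and legend follow from that.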