# 5 Thinking in R

The result of a t-test is actually a value we can manipulate further. Two functions help us here. `class` gives the “public face” of a value, and `typeof` gives its underlying type, the way R thinks of it internally. For example, numbers are “numeric” and have some representation in computer memory: either “integer”, for whole numbers only, or “double”, which can hold fractional numbers (stored in memory in a base-2 version of scientific notation).

`## [1] "numeric"`

`## [1] "double"`
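Output like the two lines above can be reproduced with a sketch such as the following. The value `42` and the variable names `x` and `y` are arbitrary choices for illustration, not necessarily what produced the original output.

```r
# A plain number literal is stored as a double, even if it looks whole
x <- 42
class(x)    # the "public face"
typeof(x)   # the internal storage type

# The L suffix asks R for an integer instead
y <- 42L
class(y)
typeof(y)

# Both count as numbers as far as is.numeric is concerned
is.numeric(x)
is.numeric(y)
```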

Let’s look at the result of a t-test:

`## [1] "htest"`

`## [1] "list"`

```
## [1] "statistic" "parameter" "p.value" "conf.int" "estimate"
## [6] "null.value" "alternative" "method" "data.name"
```

`## [1] 4.301261e-29`
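The same kind of inspection can be sketched on any t-test result. The example below assumes the built-in `sleep` data set purely for illustration, so its p-value will differ from the one shown above, and the exact set of names may vary slightly between R versions.

```r
# Run a t-test on a built-in data set (an assumption for this sketch)
result <- t.test(extra ~ group, data = sleep)

class(result)    # the public face: "htest"
typeof(result)   # the underlying type: "list"
names(result)    # the elements stored in the list

# Individual elements can be pulled out and used in further calculations
result$p.value
result$conf.int
```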

In R, a t-test is just another function returning just another type of data, so it can also be a building block. The value it returns is a special type of vector called a “list”, but with a public face that presents itself nicely. This is a common pattern in R. Besides printing to the console nicely, this public face may alter the behaviour of generic functions such as `plot` and `summary`.

Similarly, a data frame is a list of vectors that is able to present itself nicely.
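A small sketch makes this concrete. The data frame `scores` and its columns are made up for illustration.

```r
# A hypothetical two-column data frame
scores <- data.frame(
  name  = c("a", "b", "c"),
  value = c(1.5, 2.5, 3.5)
)

class(scores)    # public face: "data.frame"
typeof(scores)   # underneath, it is a list
length(scores)   # one list element per column

# List-style access works on columns
scores$value
scores[["name"]]

# The public face changes what generic functions do
summary(scores)
```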

## 5.1 Lists

Lists are vectors that can hold anything as elements (even other lists!). It’s possible to create lists with the `list` function. This becomes especially useful once you get into the programming side of R. For example, if you write your own function that needs to return multiple values, it can return them in the form of a list.

```
## $hello
## [1] "Hello" "world"
##
## $numbers
## [1] 1 2 3 4
```

`## [1] "list"`

`## [1] "list"`

`## [1] "hello" "numbers"`
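Output like that shown above can be reproduced with a sketch along these lines. The variable name `mylist` is an assumption; the element names and values are taken from the printed output.

```r
# Build a list with two named elements of different types
mylist <- list(
  hello   = c("Hello", "world"),
  numbers = c(1, 2, 3, 4)
)

mylist           # prints each element under its $name
class(mylist)    # a plain list has no special public face
typeof(mylist)
names(mylist)
```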

Accessing lists can be done by name with `$` or by position with `[[ ]]`.

`## [1] "Hello" "world"`

`## [1] 1 2 3 4`
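As a sketch of both access styles, and of a function that returns several values in a list, consider the following. The names `mylist` and `mean_and_range` are hypothetical and chosen only for this example.

```r
mylist <- list(
  hello   = c("Hello", "world"),
  numbers = c(1, 2, 3, 4)
)

mylist$hello          # access by name
mylist[[2]]           # access by position
mylist[["numbers"]]   # [[ ]] also accepts a name as a string

# A function that returns two results bundled together in a list
mean_and_range <- function(x) {
  list(mean = mean(x), range = range(x))
}

result <- mean_and_range(mylist$numbers)
result$mean
result$range
```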

## 5.2 Other types not covered here

Matrices are another tabular data type. These come up when doing more mathematical tasks in R. They are also commonly used in bioinformatics, for example to represent RNA-Seq count data. A matrix, as compared to a data frame:

- contains only one type of data, usually numeric (rather than different types in different columns).
- commonly has `rownames` as well as `colnames`. (Base R data frames can have `rownames` too, but it is easier to have any unique identifier as a normal column instead.)
- has individual cells as the unit of observation (rather than rows).

Matrices can be created using `as.matrix` from a data frame, `matrix` from a single vector, or using `rbind` or `cbind` with several vectors.
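The four routes can be sketched as follows. All data values and the row and column names here are invented for illustration.

```r
# From a data frame whose columns are all numeric
df <- data.frame(a = 1:3, b = 4:6)
m1 <- as.matrix(df)

# From a single vector; matrix() fills column by column by default
m2 <- matrix(1:6, nrow = 2)

# From several vectors, stacked as rows or as columns
m3 <- rbind(c(1, 2, 3), c(4, 5, 6))
m4 <- cbind(c(1, 2, 3), c(4, 5, 6))

dim(m1)
dim(m2)

# Row and column names can be attached and then used for indexing
rownames(m2) <- c("gene1", "gene2")
colnames(m2) <- c("s1", "s2", "s3")
m2["gene1", "s2"]
```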

You may also encounter “S4 objects”, especially if you use Bioconductor packages. The syntax for using these is different again, and uses `@` to access elements.
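To give a feel for the `@` syntax, here is a minimal sketch using a made-up S4 class. The class name `Sample` and its slots are hypothetical; real Bioconductor classes are considerably richer and usually provide accessor functions that are preferred over direct `@` access.

```r
library(methods)

# Define a toy S4 class with two slots (hypothetical, for illustration only)
setClass("Sample", representation(id = "character", counts = "numeric"))

s <- new("Sample", id = "sample1", counts = c(10, 0, 25))

isS4(s)        # confirms this is an S4 object
slotNames(s)   # lists the available slots
s@id           # slots are accessed with @ rather than $
s@counts
```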

## 5.3 Programming

Once you have a useful data analysis, you may want to do it again with different data. You may have some task that needs to be done many times over. This is where programming comes in:

- Writing your own functions.
- For-loops to do things multiple times.
- If-statements to make decisions.
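The three ingredients listed above can be sketched together in a few lines. The function name `describe_sign` and the input values are invented for this example.

```r
# A function that uses if-statements to make a decision
describe_sign <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

values <- c(-2, 0, 5)

# A for-loop that applies the function to each value in turn
labels <- character(length(values))
for (i in seq_along(values)) {
  labels[i] <- describe_sign(values[i])
}

labels
```

Pre-allocating `labels` with `character(length(values))` and looping over `seq_along(values)` keeps the loop correct even when `values` is empty.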

The “R for Data Science” book is an excellent source to learn more. The Monash Bioinformatics Platform “R more” course also covers this.