By Econometrics and Free Software

(This article was first published on **Econometrics and Free Software**, and kindly contributed to R-bloggers)

This week I had the opportunity to teach R at my workplace, again. This course was the “advanced R” course, and unlike the one I taught at the end of last year, I had one more day (so 3 days in total) where I could show my colleagues the joys of the `tidyverse` and R.

To finish the section on programming with R, which was the very last section of the whole 3-day course, I wanted to blow their minds. I had already shown them packages from the `tidyverse` in the previous days, such as `dplyr`, `purrr` and `stringr`, among others. I taught them how to use `ggplot2`, `broom` and `modelr`. They also liked `janitor` and `rio` very much. I noticed that it took them a bit more time and effort to digest `purrr::map()` and `purrr::reduce()`, but they all seemed to see how powerful these functions were. To finish on a very high note, I showed them the ultimate `purrr::map()` use case.

Consider the following: imagine you are working on a list of datasets. These datasets might be the same, but for different years, or for different countries, or they might be completely different datasets entirely. If you used `rio::import_list()` to read them into R, you will have them in a nice list. Let’s consider the following list as an example:

```
library(tidyverse)

data(mtcars)
data(iris)
data_list = list(mtcars, iris)
```

I made the choice to have completely different datasets. Now, I would like to map some functions to the columns of these datasets. If I only worked on one, for example on `mtcars`, I would do something like:

```
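# funs() was soft-deprecated in dplyr 0.8.0, so recent versions may warn here
my_summarise_f = function(dataset, cols, funcs){
  dataset %>%
    # !!! splices each list of quosures into separate arguments
    summarise_at(vars(!!!cols), funs(!!!funcs))
}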
```

And then I would use my function like so:

```
mtcars %>%
  my_summarise_f(quos(mpg, drat, hp), quos(mean, sd, max))
```

```
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
```

`my_summarise_f()` takes a dataset, a list of columns and a list of functions as arguments and uses tidy evaluation to apply `mean()`, `sd()`, and `max()` to the columns `mpg`, `drat` and `hp` of `mtcars`. That’s pretty useful, but not useful enough! Now I want to apply this to the list of datasets I defined above. For this, let’s define the list of columns I want to work on:

```
cols_mtcars = quos(mpg, drat, hp)
cols_iris = quos(Sepal.Length, Sepal.Width)
cols_list = list(cols_mtcars, cols_iris)
```
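For readers who find `quos()` and `!!!` a bit mysterious, here is a minimal sketch of what they do, assuming `dplyr` and `rlang` are installed. The names `labels` and `selected` are only illustrative and do not come from the original post. `quos()` captures the column names without evaluating them, and `!!!` splices the captured list back into a call as separate arguments.

```r
library(dplyr)
library(rlang)

cols = quos(mpg, drat, hp)

# quos() returns a list with one quosure per captured expression
length(cols)

# as_label() turns each quosure back into readable text
labels = vapply(cols, as_label, character(1))

# !!! splices the list, so select() sees three separate arguments,
# as if we had typed select(mtcars, mpg, drat, hp)
selected = mtcars %>% select(!!!cols)
names(selected)
```

This is why a plain list such as `cols_list` can carry column names around until a `dplyr` verb finally needs them.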

Now, let’s use some `purrr` magic to apply the functions I want to the columns I have defined in `cols_list`:

```
map2(data_list,
     cols_list,
     my_summarise_f, funcs = quos(mean, sd, max))
```

```
## [[1]]
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
##
## [[2]]
##   Sepal.Length_mean Sepal.Width_mean Sepal.Length_sd Sepal.Width_sd
## 1          5.843333         3.057333       0.8280661      0.4358663
##   Sepal.Length_max Sepal.Width_max
## 1              7.9             4.4
```
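To make the pairing done by `map2()` more visible, here is a toy sketch, assuming only that `purrr` is installed. The function `describe()` and its `suffix` argument are hypothetical and exist only for this illustration. `map2()` calls the function once per position, taking the first element of each list, then the second, and so on. Any extra argument given after the function, like `funcs` in the call above, is passed unchanged to every call.

```r
library(purrr)

# Hypothetical toy function, only here to make the pairing visible
describe = function(x, y, suffix){
  paste0(x, "-", y, suffix)
}

# Calls describe("a", 1, suffix = "!") and then describe("b", 2, suffix = "!")
res = map2(list("a", "b"), list(1, 2), describe, suffix = "!")
res
```

The result is a list with one element per pair, which is why the call on `data_list` and `cols_list` returns one summary per dataset.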

That’s pretty useful, but not useful enough! I also want to apply different functions to different datasets!

Well, let’s define a list of functions then:

```
funcs_mtcars = quos(mean, sd, max)
funcs_iris = quos(median, min)
funcs_list = list(funcs_mtcars, funcs_iris)
```

Because there is no `map3()`, we need to use `pmap()`:

```
pmap(
  list(
    dataset = data_list,
    cols = cols_list,
    funcs = funcs_list
  ),
  my_summarise_f)
```

```
## [[1]]
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
##
## [[2]]
##   Sepal.Length_median Sepal.Width_median Sepal.Length_min Sepal.Width_min
## 1                 5.8                  3              4.3               2
```
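The names `dataset`, `cols` and `funcs` in the list given to `pmap()` matter: they are matched to the arguments of `my_summarise_f()`. Here is a small sketch of that behaviour, assuming only that `purrr` is installed. The function `subtract()` is hypothetical and only serves the illustration.

```r
library(purrr)

# Hypothetical toy function with two named arguments
subtract = function(a, b) a - b

# The names of the outer list are matched to the arguments of subtract(),
# so b can come before a without changing the result
res = pmap(list(b = list(1, 2), a = list(10, 20)), subtract)
res
```

With an unnamed list, `pmap()` would instead match the elements by position, so naming them is a cheap way to avoid mixing up arguments.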

Now I’m satisfied! Let me tell you, this blew their minds!

To be able to use things like that, I told them to always solve a problem for a single example, and from there, try to generalize their solution using functional programming tools found in `purrr`.
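In recent versions of `dplyr`, `summarise_at()` is superseded and `funs()` is deprecated. The same "one example first, then generalize" workflow can be sketched with `across()`, assuming `dplyr` 1.0.0 or later. The function `my_summarise_across()` below is a hypothetical variant, not the function from this post: it takes column names as a character vector and functions as a named list, and its output columns are grouped by column rather than by function.

```r
library(dplyr)
library(purrr)

# Hypothetical variant of my_summarise_f(): cols is a character vector,
# funcs is a named list of functions
my_summarise_across = function(dataset, cols, funcs){
  dataset %>%
    summarise(across(all_of(cols), funcs))
}

# Step 1: solve the problem for a single dataset
my_summarise_across(mtcars, c("mpg", "drat", "hp"),
                    list(mean = mean, sd = sd, max = max))

# Step 2: generalize to a list of datasets with pmap()
data_list = list(mtcars = mtcars, iris = iris)
cols_list = list(c("mpg", "drat", "hp"), c("Sepal.Length", "Sepal.Width"))
funcs_list = list(list(mean = mean, sd = sd, max = max),
                  list(median = median, min = min))

results = pmap(
  list(dataset = data_list, cols = cols_list, funcs = funcs_list),
  my_summarise_across)
```

Naming the elements of `data_list` means the results come back as `results$mtcars` and `results$iris` instead of `[[1]]` and `[[2]]`.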

If you found this blog post useful, you might want to follow me on Twitter for blog post updates.
