Functions with multiple results in tidyverse

I have continued playing with the tidyverse for different parts of a couple of projects.

Often I need to apply a function by groups of observations; sometimes, that function returns more than a single number. It could be something like for each group fit a distribution and return the distribution parameters. Or, simpler for the purposes of this exploration, calculate and return a bunch of numbers.

If I have a data frame called field_data, with family codes (trees with the same parents, codes have been changed to protect the innocent) and stem diameters (in mm), I could do the following in base R:

And if I need to do this for several variables, I will need to merge each of these matrices in a data frame.

Mobile phone antenna in church (Photo: Luis, click to enlarge).

Continuing with my experimentation with the tidyverse, I was wondering how to get the above going with dplyr et al. After failing a few times I asked the question in Twitter and got a number of helpful replies.

One of the keys is that dplyr can store a list result from a function. Modifying my toy function is pretty straightforward, and now looks like:

And we can check the contents of summary_two to see we have a list in which each element contains 4 values:

We still need to extract the elements of each element of the list and assign them to a variable name. Using map from the purrr package is pretty straightforward in this case, and we can extract the values either using their names or their position in the element.

I’m still playing with ideas to be lazier at extraction time. An almost abhorrent idea is to provide the output as character for posterior type conversion, as in:

And we can get all the way there with:

Which I assume has all sort of potential negative side-effects, but looks really cool.

In case you want to play with the problem, here is a tiny example of field data.

Leave a Reply