A common problem when running a simple (or not so simple) analysis is forgetting that the levels of a factor have been coded using integers. R doesn't know that this variable is supposed to be a factor and when fitting, for example, something as simple as a one-way anova (using lm()) the variable will be… Continue reading R pitfall #1: check data structure
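As a rough illustration of the pitfall, the sketch below fits the same model twice, once with the treatment left as integers and once after converting it with factor(). The data frame `trial` and its columns are made up for the example; the point is only that the degrees of freedom differ between the two fits.

```r
# Hypothetical data: three treatments coded as the integers 1, 2 and 3
set.seed(1)
trial = data.frame(treatment = rep(1:3, each = 4),
                   yield = rnorm(12, mean = 10))

# Check the structure first: treatment is listed as int, not Factor
str(trial)

# Left as integers, treatment is fitted as a single slope (1 df)
fit.numeric = lm(yield ~ treatment, data = trial)

# Converted to a factor, the same call is a one-way anova (2 df)
trial$treatment.f = factor(trial$treatment)
fit.factor = lm(yield ~ treatment.f, data = trial)

anova(fit.numeric)
anova(fit.factor)
```

Running str() on a data frame right after reading it is a cheap way to catch this before any model is fitted.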

# Category: code

## All combinations of levels for two factors

There are circumstances when one wants to generate all possible combinations of levels for two factors. For example, factor one with levels 'A', 'B' and 'C', and factor two with levels 'D', 'E' and 'F'. The function expand.grid() comes in very handy here:

```r
combo = expand.grid(factor1 = LETTERS[1:3], factor2 = LETTERS[4:6])
combo

#   factor1 factor2
# 1       A       D
# 2       B       D
# 3       C       D
# 4       A       E
# 5       B       E
# 6       C       E
# 7       A       F
# 8       B       F
# 9       C       F
```

Omitting the variable names (factor1 and factor2) will automatically name the variables… Continue reading All combinations of levels for two factors
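A quick way to see the default names is to call expand.grid() without naming its arguments and inspect the result. This is a minimal sketch; the object name `unnamed` is only for illustration.

```r
# Same levels as before, but without naming the arguments
unnamed = expand.grid(LETTERS[1:3], LETTERS[4:6])

# expand.grid() falls back to Var1, Var2, ... for unnamed arguments
names(unnamed)

# Three levels times three levels gives nine combinations
nrow(unnamed)
```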

## “Not in” in R

When processing data it is common to test if an observation belongs to a set. Let's suppose that we want to see if the sample code belongs to a set that includes A, B, C and D. In R it is easy to write something like:

```r
inside.set = subset(my.data, code %in% c('A', 'B', 'C', 'D'))
```

Now, what happens if what we want are… Continue reading “Not in” in R
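One common way to get the complement is to negate the %in% test. The sketch below assumes a small made-up `my.data` with a `code` column; the `%notin%` operator is not part of base R and is defined here only as a convenience.

```r
# Hypothetical data with some codes inside and some outside the set
my.data = data.frame(code = c('A', 'B', 'E', 'C', 'F', 'D'),
                     value = 1:6)
in.set = c('A', 'B', 'C', 'D')

# Negate the membership test; the parentheses make the intent explicit
outside.set = subset(my.data, !(code %in% in.set))

# Optional: a user-defined "not in" operator built with Negate()
`%notin%` = Negate(`%in%`)
outside.set2 = subset(my.data, code %notin% in.set)

outside.set
```

Both versions keep the rows with codes E and F; the custom operator only saves typing when the test is used many times.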