R pitfalls #4: redefining the basics

I try to be economical when writing code; for example, I tend to use single quotes over double quotes for characters because it saves me one keystroke. One area where I don’t do that is when typing TRUE and FALSE (R accepts T and F as well), just because it is clearer to see in code and syntax highlighting kicks in. That’s why I was surprised to read Jason Morgan’s post in that it is possible to redefine T and F and get undesirable behavior.

Playing around it is quite easy to redefine other fundamental constants in R. For example, I posted in Twitter:

Ouch, dangerous! I tend to muck around with matrices quite a bit and, being a friend of parsimony, I often use capital letters to represent them. This would have eventually bitten me if I had used the abbreviated TRUE and FALSE. As Kevin Ushey replied to my tweet, one can redefine even basic functions like ‘+’ and be pure evil; over the top, sure, but possible.

Clown faces
Some times coding is scary (Photo: Luis).


I was doing some Tukeys HSD results and innocently decided to call my vector letters, then wondered why my vector was 26 characters long rather than the 6 it should have been. Guess where R stores its alphabet! Easily done!

R is a very large language with lots of reserved (although redefinable) keywords. It pays to pay attention all the time.

Yeah,well, this is true in pretty much every language out there. Heck, I knew folks who thought it was funny to add this line to their coworkers’ .login file: “alias ls=logout” .

I doubt that someone would change “pi”, but T and F might change by accidental reassignment.

Running some error checking on at least the most common possibilities might be of use. I guess, though that these statements would have to come at the end of your script to make sure nothing was overwritten. Something along the lines of:

if (!identical(T, TRUE)) stop("'T' has been reassigned to ", T)

I doubt that someone would do it on purpose, but I can think of a number of acronyms related to my area of work for which it would make sense to use pi as a variable name.

I’ve had some nasty bugs due to redefining T to a transition matrix in my early days of R programming; I was pretty annoyed when I found out the language would let me do something so dangerous.

One redefinition in R is very useful for cross-platform work, e.g. when you are developing a script on a Mac but someone else will also be using it on Windows. This allows all the calls to quartz() to open a new graph window to still function on Windows (and could be easily swapped to go the other way):

# when collaborating, need to swap windows() vs quartz() calls. This does that nicely:
# (courtesy of https://stat.ethz.ch/pipermail/r-help/2008-December/181899.html)

if(.Platform$OS.type==”windows”) {
quartz<-function() windows()

Leave a Reply