R pitfalls #4: redefining the basics

I try to be economical when writing code; for example, I tend to use single quotes over double quotes for character strings because it saves me one keystroke. One area where I don’t economize is typing TRUE and FALSE (R accepts T and F as well), just because the full words are clearer in code and syntax highlighting kicks in. That’s why I was surprised to read in Jason Morgan’s post that it is possible to redefine T and F and get undesirable behavior.

Playing around, I found it is quite easy to redefine other fundamental constants in R. For example, I posted on Twitter:

[sourcecode lang="R"]
> pi
[1] 3.141593
> pi <- 2
> pi*2
[1] 4
[/sourcecode]

Ouch, dangerous! I tend to muck around with matrices quite a bit and, being a friend of parsimony, I often use capital letters to represent them. This would have eventually bitten me if I had used T and F as abbreviations for TRUE and FALSE. As Kevin Ushey replied to my tweet, one can redefine even basic functions like ‘+’ and be pure evil; over the top, sure, but possible.
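
Here is a minimal sketch of how that could play out, assuming a script that reuses T for something else. The names x, mean_before, mean_masked, base_T, mean_after and evil are made up for this illustration; the redefinition of ‘+’ is kept inside local() so it does not leak into the rest of the session.

```r
# Hypothetical data with one missing value.
x <- c(1, NA, 3)

# T still refers to the built-in value TRUE, so the NA is dropped.
mean_before <- mean(x, na.rm = T)   # 2

# Assigning to T creates a new variable that masks the built-in one.
# Here T is reused as a number (it could just as well be a matrix).
T <- 0
mean_masked <- mean(x, na.rm = T)   # na.rm is now 0, read as FALSE, so NA

# The original is still available in the base package.
base_T <- base::T                   # TRUE

# Removing the masking variable restores the usual behavior.
rm(T)
mean_after <- mean(x, na.rm = T)    # 2

# Even `+` can be masked; local() confines the damage to this block.
evil <- local({
  `+` <- function(e1, e2) e1 - e2
  2 + 2                             # 0, because `+` now subtracts
})
```

The assignment does not overwrite anything in base R; it only adds a variable that is found first. TRUE and FALSE, on the other hand, are reserved words and cannot be assigned to, which is one more reason to spell them out.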

Clown faces
Sometimes coding is scary (Photo: Luis).

9 thoughts on “R pitfalls #4: redefining the basics”

  • 2012/12/14 at 9:05 pm

    I was doing some Tukey’s HSD results and innocently decided to call my vector letters, then wondered why my vector was 26 elements long rather than the 6 it should have been. Guess where R stores its alphabet! Easily done!

    • 2012/12/14 at 9:14 pm

      R is a very large language with lots of predefined (although redefinable) names. It pays to pay attention all the time.

      • 2012/12/14 at 9:22 pm

        It certainly does. I always check to make sure that I get what I wanted (which is how I discovered that letters was already taken).

  • 2012/12/15 at 5:18 am

    Yeah, well, this is true in pretty much every language out there. Heck, I knew folks who thought it was funny to add this line to their coworkers’ .login file: “alias ls=logout”.

  • 2012/12/15 at 5:21 am

    I doubt that someone would change “pi”, but T and F might change by accidental reassignment.

    Running some error checking on at least the most common possibilities might be of use. I guess, though, that these statements would have to come at the end of your script to make sure nothing was overwritten. Something along the lines of:

    if (!identical(T, TRUE)) stop("'T' has been reassigned to ", T)

    • 2012/12/15 at 6:09 am

      I doubt that someone would do it on purpose, but I can think of a number of acronyms related to my area of work for which it would make sense to use pi as a variable name.

    • 2012/12/31 at 6:56 pm

      I’ve had some nasty bugs due to redefining T to a transition matrix in my early days of R programming; I was pretty annoyed when I found out the language would let me do something so dangerous.

  • 2012/12/17 at 3:40 pm

    One redefinition in R is very useful for cross-platform work, e.g. when you are developing a script on a Mac but someone else will also be using it on Windows. This allows all the calls to quartz() to open a new graph window to still function on Windows (and could be easily swapped to go the other way):

    # when collaborating, need to swap windows() vs quartz() calls. This does that nicely:
    # (courtesy of https://stat.ethz.ch/pipermail/r-help/2008-December/181899.html)

    if (.Platform$OS.type == "windows") {
      quartz <- function() windows()
    }

    • 2012/12/17 at 8:28 pm

      That’s handy indeed.
