Revisiting homicide rates

A pint of R plotted an interesting dataset: intentional homicides in South America. I thought the graphs were pretty but I was unhappy about the way information was conveyed in the plots; relative risk should be very important but number of homicides is very misleading as it also relates to country population (this problem often comes up in our discussions in Stats Chat).

Instead of just complaining I decided to try a few alternatives (disclaimer: I do not have a good eye for colors or design but I am only looking at ways that could better show relative risk). I therefore downloaded the MS Excel file, which contained a lot of information from other countries and extracted only the information relevant to these plots, which you can obtain here: homicides.csv (4 KB). Some quick code could display the following graph:

[sourcecode language="R"]
require(ggplot2)

setwd(‘~/Dropbox/quantumforest’)
kill = read.csv(‘homicides.csv’, header = TRUE)

kp = ggplot(kill, aes(x = year, y = country, fill = rate))

# Colors coming from
# http://learnr.wordpress.com/2010/01/26/ggplot2-quick-heatmap-plotting/
png(‘homicides-tile.png’, width = 500, height = 500)
kp = kp + geom_tile() + scale_x_continuous(name = ‘Year’, expand = c(0, 0)) +
scale_y_discrete(name = ‘Country’, expand = c(0, 0)) +
scale_fill_gradient(low = ‘white’, high = ‘steelblue’, name = ‘Homicide rate’) +
theme_bw() +
opts(panel.grid.major = theme_line(colour = NA),
panel.grid.minor = theme_line(colour = NA))
dev.off()
[/sourcecode]

Tile graph for homicides.

It is also possible to use a line graph, but it quickly gets very messy, so I created totally arbitrary violence categories:

[sourcecode language="R"]
# Totally arbitrary classification
kill$type = ifelse(kill$country %in% c(‘Brazil’, ‘Colombia’, ‘Venezuela’),
‘Freaking violent’,
ifelse(kill$country %in% c(‘Ecuador’, ‘Surinam’, ‘Guyana’),
‘Plain violent’,
‘Sort of quiet’))

kp2 = ggplot(kill, aes(x = year, y = rate, colour = country))

png(‘homicides-lines.png’, width=1000, height = 300)
kp2 + geom_line() + facet_grid(. ~ type) +
scale_y_continuous(‘Homicides/100,000 people’) +
scale_x_continuous(‘Year’) + theme_bw() +
opts(axis.text.x = theme_text(size = 10),
axis.text.y = theme_text(size = 10),
legend.position = ‘none’)
dev.off()
[/sourcecode]

Another view, which still requires labeling countries. Venezuela, what did happen to you? You look like the Wild West...

I hope others will download the data and provide much better alternatives to display violence. If you do, please add a link in the comments.

Leave a Reply