Evolving notes, images and sounds by Luis Apiolaza

Plotting earthquake data

Since 4th September 2010 we have had over 2, 800 quakes (considering only magnitude 3+) in Christchurch. Quakes come in swarms, with one or few strong shocks, followed by numerous smaller ones and then the ocasional shock, creating an interesting data visualization problem. In our case, we have had swarms in September 2010, December 2010, February 2011, June 2011 and December 2011.

Geonet provides the basic information and there have been several attempts at displaying the full set of shocks. For example, Christchurch Quake Map uses animation, while Canterbury Quake Live uses four panels showing quakes for last 24 hours, last week, last month and since September 2010. While both alternatives are informative, it is hard to see long-term trends due to overplotting, particularly when we move beyond one week during a swarm.

Geonet allows data extraction through forms and queries. Rough limits for the Christchurch earthquakes are: Southern Latitude (-43.90), Northern Latitude (-43.15), Western Longitude (171.75) and Eastern Longitude (173.35). We can limit the lower magnitude to 3, as shocks are hard to feel below that value.

Graph presented by Canterbury Quake Live, notice how difficult is to read the bottom graph.

The file earthquakes.csv contains 2,802 shocks, starting on 2010-09-04. The file can be read using the following code:

setwd('~/Dropbox/quantumforest')
library(ggplot2)

# Reading file and manipulating dates
qk <- read.csv('earthquakes.csv', header=TRUE)

# Joins parts of date, converts it to UTC and then
# expresses it in NZST
qk$DATEtxt <-  with(qk, paste(ORI_YEAR,'-',ORI_MONTH,'-',ORI_DAY,' ',
                    ORI_HOUR,':',ORI_MINUTE,':',ORI_SECOND, sep=''))
qk$DATEutc <- as.POSIXct(qk$DATEtxt, tz='UTC')
qk$DATEnz <- as.POSIXct(format(qk$DATEutc, tz='Pacific/Auckland'))
head(qk)

The following code produces a plot that, in my opinion, presents a clearer idea of the swarms but that I still feel does not make justice to the problem.

png('earthquakesELE.png',height=600, width=1200)
  ele <- ggplot(qk, aes(DATEnz, MAG, ymin=3, ymax=MAG))
  ele + geom_linerange(color='grey') + geom_point(color='red', size=1) +
    scale_y_continuous(name='Magnitude', limits=c(2.5,7.2)) +
    scale_x_datetime(name='Date',major='month') + theme_bw() +
    opts(panel.grid.major = theme_line(size=0))
dev.off()

Graph displaying quake swarms, but still far from perfect. Looking forward to your ideas.

Please let me know if you have a better idea for the plot.

P.S.1 If you want to download data from Geonet there will be problems reading in R the resulting earthquakes.csv file, because the file is badly formed. All lines end with a comma except for the first one, tripping R into believing that the first column contains row names. The easiest way to fix the file is to add a comma at the end of the first line, which will create an extra empty variable called X that is not used in the plots.

P.S.2 We had some discussions by email with Michael (who commented below), where he suggested dropping the grey lines and using alpha less than one to reduce plotting density. I opted for also using an alpha scale, so stronger quakes are darker to mimic the psychology of 'quake suffering': both frequent smaller quakes and the odd stronger quakes can be distressing to people. In addition, now the plot uses a 1:6 ratio.

png('earthquakesALPHA.png',height=300, width=1800)
  qplot(DATEnz, MAG, data = qk, alpha = MAG) +
    geom_point(color = 'red', size=1.5) +
    scale_x_datetime('Date', major='month') +
    scale_y_continuous('Magnitude') +
    opts(legend.position = 'none',
         axis.text.x = theme_text(colour = 'black'),
         axis.text.y = theme_text(colour = 'black'))
dev.off()

New version, click to expand.

I ran out of time, but the background needs more work, as well as finding the right level of alpha to best tell the story.

2 Comments

  1. Michael MacAskill

    Hi, nice to see someone using R to delve into alternative ways of depicting this data. The existing sites were great in getting up and running so well but haven’t developed much since.

    With your visualisation (which is very nice), you might want to try out a few changes (which may or may not improve things):
    (1) drop the grey lines and just plot the red points. The grey lines give too much emphasis to individual quakes when they are rare (e.g. towards the end of one ‘cluster’, which de-emphasises the rise of the next one.
    (2) try different aspect ratios. As a general rule with data like this, it is often improved by decreasing the height of the y-axis (try it at 1/2 0r 1/4 of its current size). A great example of this was Cleveland’s re-envisioning of the famous sunspot cycle graph (e.g. http://processtrends.com/pg_data_vis_bank_to_45.htm ). And Tufte gave some thought to this when ‘inventing’ the sparkline.

    Anyway, you’ve spurred me into downloading a new CSV file from GeoNet (thanks for posting your lat/long constraints) and trying out some things again myself. Your date processing code above will speed that up. Cheers.

    PS do you have an RSS feed?

    • Luis

      Hi Michael,

      Thanks for the comments and the link. I tried before the points only, but then one loses the sense of order for the quakes. The idea of using the grey lines is to avoid dominating the image with too much ‘ink’; I’ll try to make it even lighter.

      I will change the aspect ratio, which I hope will make the graph more easy on the eye. I like Tufte’s sparklines but they get very messy in this graph (too many points with varying lags between them).

      The RSS feed for this site is http://www.quantumforest.com/feed/ It shows on the index page but not on individual posts; one more thing to fix in the design!

      By the way, you work on a very cool topic. Cheers.

© 2024 Palimpsest

Theme by Anders NorenUp ↑