Influences: Cronopios and Famas

Books have accompanied me all my life, or at least for as long as I can remember. However, my reading habits have changed many times, from reading simple books, to reading very complex books, to reading anything, to reading only when I can squeeze in a few minutes here and there, to… you get the idea. ‘Habits’ is a funny word, an oxymoron, to refer to constant change.

Today I was thinking of influential books. Not ‘good’ books, or books that have received many awards, or that have guided generations or catalyzed social change. I mean only books that have been important for me at a given point in time. If I had read them before or after that time they may have passed unnoticed. But I read them then, at the right time… for me.

As an adult I have moved house several times, and every time I have lost books. There are also books that have been with me all this time. One of them is ‘Cronopios and Famas’, a collection of very short stories by Julio Cortázar, one of the big voices of Argentinian literature. My first encounter with ‘Historias de Cronopios y Famas’–the original Spanish title–was in my maternal grandparents’ apartment. I was living with them and I was looking for something to read. Anything. I opened a drawer and found some interesting books, including Cortázar’s. It was one of the first editions, which I think belonged to one of my uncles, the one in exile.

Why was this an important book? Language, raw language. I am completely at a loss when trying to explain Cortázar to someone who has not read his books. As Borges said:

No one can retell the plot of a Cortázar story; each one consists of determined words in a determined order. If we try to summarize them, we realize that something precious has been lost—Jorge Luis Borges

In ‘Progreso y retroceso’ (progress and regress) the whole story fits in only two paragraphs. The story is about a crystal that lets flies through but that does not let them come back because ‘no one knows what stuff in the flexibility of the fibers of this crystal, which was too fibrous’ or something like that:

Inventaron un cristal que dejaba pasar las moscas. La mosca venía empujaba un poco con la cabeza y, pop, ya estaba del otro lado. Alegría enormísima de la mosca.

Todo lo arruinó un sabio húngaro al descubrir que la mosca podía entrar pero no salir, o viceversa a causa de no se sabe que macana en la flexibilidad de las fibras de este cristal, que era muy fibroso. En seguida inventaron el cazamoscas con un terrón de azúcar dentro, y muchas moscas morían desesperadas. Así acabó toda posible confraternidad con estos animales dignos de mejor suerte.

The story is straightforward, with simple, almost pedestrian words. But those words have been extremely carefully selected, crafted in a particular order. I imagine Cortázar spending countless hours, agonizing over a myriad of small decisions until reaching a point of perfect simplicity.

There was a clear before and after reading this book in 1981: language was never the same again. I learned to find the fantastic side of the quotidian. I grew to appreciate risk when building sentences, when pushing meanings and readings. My whole way of looking at the world was influenced by a small book of ridiculous short stories.

P.S. I published this post in my old, extinct blog on 2009-02-02

This time it is Calvino

This happens relatively frequently: I am talking with someone who doesn’t know me well and, at some point in the conversation, I mention that I am a forester. Then we move on to books and I mention someone like Borges or Calvino and they look at me with a puzzled face, as in ‘I didn’t know that foresters could read’. I know, it happens to other professions as well; just for the record, not all of us are semi-literate apes working with a chainsaw.

I was sorting out my bookshelves at work when I found a copy of The literature machine, a collection of essays by Italo Calvino. It had my name and signature, together with 2002, Melbourne, Australia. (Digression: besides my name and signature I always put the city where I bought a book). I had vague memories of walking around in Melbourne’s CBD and finding an underground bookshop. At the time I was not looking for anything in particular, just browsing titles.

Why did I buy the book and never read it? I do remember browsing it and getting distracted by something more urgent, albeit clearly unimportant, because I cannot remember what it was. Probably I was not ready either; it has happened to me before. From ‘Uncle Tom’s cabin’ when I was nine, to ‘The Fountainhead’ when I was a teenager, to ‘The literature machine’ seven years ago. Most likely there is an issue of maturity, of being ready to read a particular story, philosophy or approach to the world.

Many years ago I read some of Calvino’s books, like Cosmicomics (brilliantly funny) and ‘The cloven viscount’ (very enjoyable reading). But I particularly struggle with two literary forms: essays and plays. I sometimes can get into the former, but the latter has proven–until today–insurmountable.

However, today is the time for Calvino and essays. There is something deeply stimulating in these essays, together with a quaintness created by the forty years gone since they were written. The feeling of freshness, possibility and hope from 1968 reads strangely in 2017. At the same time, there has been a bit of a break with the system since the implosion of the international economy. Maybe it is an excellent time to resonate with Calvino, as in the old days.

Paying for a job well done

At the moment I am writing R code that involves a lot of simulation for a project. This time I wanted to organize the work properly, put a package together, document it… the whole shebang. Hadley Wickham has excellent documentation for this process in Advanced R, which works very well as a website. Up to this point there is nothing new; but the material is also available as a book.

At this point in my life I do not want to have a physical object if I can avoid it. On top of that, code tutorials work a lot better as a website, so one can copy, paste and experiment. PDFs or ebooks are not very handy for this subject either. Here enters a revolutionary notion: I like to pay people who do a good job and, in the process, make my job easier, but sometimes I do not want an object in exchange.

One short-term solution: asking Hadley for his favorite charity and donating the cost of a copy of the book. That makes most people happy except, perhaps, the publisher. I then remembered this idea by Cory Doctorow, in which he acts as a middleman between people who wish to pay him for his stories (but don’t want a physical copy of the books) and school libraries that wish to have copies of the books.

Wouldn’t it be nice to have an arrangement like that for programming and research books? For example, we could match R learners who prefer books but can’t afford them with people willing to pay for them.

Paying for intangibles (Photo: Luis).

Flotsam 11: mostly on books

‘No estaba muerto, andaba de parranda’ as the song says. Although rather than partying it has mostly been reading, taking pictures and trying to learn how to record sounds. Here are some things I’ve come across lately.

I can’t remember if I’ve recommended Matloff’s The Art of R Programming before; if I haven’t, go and read the book for a good exposition of the language. Matloff also has an open book (as in free PDF, 3.5MB) entitled ‘From Algorithms to Z-Scores: Probabilistic and Statistical Modeling in Computer Science’. The download link is near the end of the page. He states that the reader ‘must know calculus, basic matrix algebra, and have some minimal skill in programming’, which incidentally is the bare minimum for someone that wants to get a good handle on stats. In my case I learned calculus partly with Piskunov’s book (I’m a sucker for Soviet books, free DjVu), matrix algebra with Searle’s book and programming with… that’s another story.

I’ve ordered a couple of books from CRC Press, which I hope to receive soon (it depends on how long it takes for the parcel to arrive in the middle of nowhere):

  • Stroup’s Generalized Linear Mixed Models: Modern Concepts, Methods and Applications, which according to the blurb comes ‘with numerous examples using SAS PROC GLIMMIX’. You may be wondering: why is he reading a book that includes SAS as a selling point? Well, SAS is a very good statistical system that still has a fairly broad installed base. However, the real selling point is that I’ve read some explanations on mixed models written by Stroup and he has a superb understanding of the topic. I’m really looking forward to putting my paws on this book.
  • Lunn et al.’s The BUGS Book: A Practical Introduction to Bayesian Analysis. I don’t use BUGS but occasionally use JAGS, and one of the things that irks me about programs like BUGS, JAGS or INLA is that they follow the ‘here is a bunch of examples’ approach to documentation. This book is supposed to provide a much more detailed account of the ins and outs of fitting models and a proper manual. Or at least that’s what I’m hoping to find in it.

Finally, a link to a fairly long (and somewhat old) list of R tips and the acknowledgements of a PhD thesis that make you smile (via Arthur Charpentier).

Gratuitous picture: frozen fence (Photo: Luis).

‘He was not dead, he was out partying’.

Matrix Algebra Useful for Statistics

I was having a conversation with an acquaintance about courses that were particularly useful in our work. My forestry degree involved completing 50 compulsory + 10 elective courses; if I had to choose courses that were influential and/or really useful they would be Operations Research, Economic Evaluation of Projects, Ecology, 3 Calculus and 2 Algebras. Subsequently my PhD was almost entirely research based but I sort of did Matrix Algebra: Dorian lent me his copy of Searle’s Matrix Algebra Useful for Statistics and passed me a pile of assignments that Shayle Searle used to give in his course at Cornell. I completed the assignments at my own pace and then sat a crazy take-home exam for 24 hours.

Later that year I bought a cloth-bound 1982 version of the book, not the alien vomit purple paperback reprint currently on sale, which I consult from time to time. Why would one care about matrix algebra? Besides being a perfectly respectable intellectual endeavor in itself, maybe you can see that the degrees of freedom are the rank of a quadratic form; you can learn from this book what a quadratic form and a matrix rank are. Or you want to see more clearly the relationship between regression and ANOVA, because in matrix form a linear model is a linear model is a linear model. The commands outer, inner and kronecker product make a lot more sense once you know what an outer product and an inner product of vectors are. Thus, if you really want to understand a matrix language for data analysis and statistics (like R), it seems reasonable to try to understand the building blocks for such a language.
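To make those products concrete, here is a minimal sketch in plain Python (standard library only). The function names `inner`, `outer` and `kronecker` and the example vectors are my own illustrative choices, not the API of R or of any library; the point is only to show what each product computes.

```python
# Plain-Python sketch of the vector and matrix products named in the text.
# Function names are illustrative only; this is not R's or any library's API.

def inner(x, y):
    """Inner product x'y: two vectors of equal length give a single number."""
    return sum(a * b for a, b in zip(x, y))

def outer(x, y):
    """Outer product xy': a len(x) by len(y) matrix of all pairwise products."""
    return [[a * b for b in y] for a in x]

def kronecker(A, B):
    """Kronecker product: each element of A multiplies a full copy of B."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]

# Hypothetical example data
x = [1, 2, 3]
y = [4, 5, 6]
A = [[1, 2], [3, 4]]
I2 = [[1, 0], [0, 1]]  # 2 x 2 identity matrix

inner_xy = inner(x, y)      # 1*4 + 2*5 + 3*6
outer_xy = outer(x, y)      # 3 x 3 matrix
kron_AI = kronecker(A, I2)  # 4 x 4 block matrix
```

In R the corresponding calls would be something like `crossprod(x, y)` (or `sum(x * y)`), `outer(x, y)` and `kronecker(A, I2)`; once you have written the loops by hand, those commands stop looking like magic.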

The book does not deal with any applications to statistics until chapter 13. Before that it is all about laying foundations to understand the applications, but do not expect nice graphs and cute photos. This is a very good text where one has to use the brain to imagine what’s going on in the equations and demonstrations. The exercises rely a lot on ‘prove this’ and ‘prove that’, which leads to much frustration and, after persevering, to many ‘aha! moments’.

XKCD 1050: In your face! Actually I feel the opposite concerning math.

I am the first to accept that I have a biased opinion about this book, because it has sentimental value. It represents difficult times, dealing with a new language, culture and, on top of that, animal breeding. At the same time, it opened doors to a whole world of ideas. This is much more than I can say of most books.

PS 2012-12-17: I have commented on a few more books in these posts.

A good part of my electives were in humanities (history & literature), which was unusual for forestry. I just couldn’t conceive going through a university without doing humanities.

Review: “Forest Analytics with R: an introduction”

Forestry is the province of variability. From a spatial point of view this variability ranges from within-tree variation (e.g. modeling wood properties) to billions of trees growing in millions of hectares (e.g. forest inventory). From a temporal point of view we can deal with daily variation in a physiological model to many decades in an empirical growth and yield model. Therefore, it is not surprising that there is a rich tradition of statistical applications to forestry problems.

At the same time, the scope of statistical problems is very diverse. As the saying goes forestry deals with “an ocean of knowledge, but only one centimeter deep”, which is perhaps an elegant way of saying a jack of all trades, master of none. Forest Analytics with R: an introduction by Andrew Robinson and Jeff Hamann (FAWR hereafter) attempts to provide a consistent overview of typical statistical techniques in forestry as they are implemented using the R statistical system.

Following the compulsory introduction to the R language and forest data management concepts, FAWR deals mostly with three themes: sampling and mapping (forest inventory), allometry and model fitting (e.g. diameter distributions, height-diameter equations and growth models), and simulation and optimization (implementing a growth and yield model, and forest estate planning). For each area the book provides a brief overview of the problem, a general description of the statistical issues, and then it uses R to deal with one or more example data sets. Because of this structure, chapters tend to stand on their own and guide the reader towards a standard analysis of the problem, with liberal use of graphics (very useful) and plenty of interspersing code with explanations (which can be visually confusing for some readers).

While the authors bill the book as using “state-of-the-art statistical and data-handling functionality”, the most modern application is probably the use of non-linear mixed-effects models fitted with a residual maximum likelihood approach. There is no coverage of, for example, the Bayesian methodologies increasingly present in the forest biometrics literature.

Harvesting Eucalyptus urophylla x E. grandis hybrid clones in Brazil (Photo: Luis).

FAWR reminds me of a great but infuriating book by Italo Calvino (1993): “If on a Winter’s Night a Traveler”. Calvino starts many good stories and, once the reader is hooked on them, keeps on moving to a new one. The authors of FAWR acknowledge that they will only introduce the techniques, but a more comprehensive coverage of some topics would be appreciated. Readers with some experience in the topic may choose to skip the book altogether and move directly to, for example, Pinheiro and Bates’ (2000) book on Mixed-Effects Models in S and S-PLUS and Lumley’s (2010) Complex Surveys: A Guide to Analysis Using R. FAWR is part of the growing number of “do X using R” books that, although useful in the short term, are so highly tied to specific software that one suspects they should come with a best-before date. A relevant question is how much content is left once we drop the software-specific parts… perhaps not enough.

The book certainly has redeeming features. For example, Part IV introduces the reader to calling an external function written in C (a growth model), and then combining the results with R functions to create a credible growth and yield forecasting system. Later the authors tackle harvest scheduling through linear programming models, a task often addressed using domain-specific (both proprietary and expensive) software. The authors use this part to provide a good case study of model implementation.

At the end of the day I am ambivalent about FAWR. On the one hand, it is possible to find better coverage of most topics in other books or R documentation. On the other, it provides a convenient point of entry if one is lost on how to start working in forest biometrics with R. An additional positive aspect is that the book increases R’s credibility as an alternative for forest analytics, which makes me wish this book had been around 3 years ago, when I needed to convince colleagues to move our statistics teaching to R.

P.S. This review was published with minor changes as “Apiolaza, L.A. 2012. Andrew P. Robinson, Jeff D. Hamann: Forest Analytics With R: An Introduction. Springer, 2011. ISBN 978-1-4419-7761-8. xv+339 pp. Journal of Agricultural, Biological and Environmental Statistics 17(2): 306-307” (DOI: 10.1007/s13253-012-0093-y).
P.S.2. 2012-05-31. After publishing this text I discovered that I already used the sentence “[f]orestry deals with variability and variability is the province of statistics” in a blog post in 2009.
P.S.3. 2012-05-31. I first heard the saying “forestry deals with an ocean of knowledge, but only one centimeter deep” around 1994 in a presentation by Oscar García in Valdivia, Chile.
P.S.4. 2012-06-01. Added links to both authors’ internet presence.

The weirdness of ebooks

A couple of weeks ago I got a Sony Reader PRS-T1 through the use of Flybuys (a loyalty card scheme available in New Zealand). I had been thinking about buying an Amazon Kindle but then we got the Flybuys catalogue and I could not see the point of shelling out cash for something that I could get much more cheaply.

The device is quite nice and my only hardware quibble is the front frame of the screen, which is too reflective. In contrast, the Sony software to synchronize ebook reader and computer (a Mac in my case) is a piece of junk. Therefore, the first thing I did was to install Calibre, which is not pretty but quite effective as a book manager.

By default the Sony software pushes the reader to buy books through Whitcoulls, which manages to inspire limitless disappointment: selection is poor and prices are high. Why would someone pay $27 for an ebook? The good thing is that one can buy books from other sources (e.g. Book Depository or quite a few other book stores) at much lower prices, often below $10. Most books will come with DRM (usually using Adobe Digital Editions) although there are a few bookshops (e.g. Baen) that ship them DRM-free.

Here is, perhaps, the biggest disappointment with the way many publishers/stores are dealing with their customers; they are treating us like potential criminals. We already went through this with iTunes, which eliminated DRM for music files some years ago. Why choose to unnecessarily constrain the files, particularly when DRM is annoying and can be easily broken? In a similar vein, how come that publishers manage to make pirate copies of books much more easily available than legal ones? If i- I have the money (promise, I do) and ii- I am willing to pay (promise, I do too), why can’t I get legal copies of books by Borges, Cortázar, Bolaño or whoever I want to read? Publishers should compulsorily read this comic by The Oatmeal.

On the plus side, as a quick Google search will show, it is possible to easily break either Amazon’s or Adobe’s DRM using a plugin for Calibre. I am not saying that one should do it, only that it is an option. It would always be a pain to end up with unreadable books in the same way that Microsoft MSN customers ended up with unplayable music.

Tuff-luv ebook cover.

After that rant, how does it feel to read on the Sony Reader? It is quite nice and, after a little while, the hardware disappears and the story moves ahead, just like in a normal book. It won’t work for all books (e.g. if you are a fan of Edward Tufte’s books) but it is perfect for most novels or short stories. Rather than buying a typical Sony cover, I ordered this one from Tuff-Luv (a company from the UK) that arrived in less than one week: excellent service and shipped from Germany! The cover is nice looking and, more importantly, covers the shiny sides of the reader, so no more reflection.

In summary, I’m back to reading a lot because it is easy to always carry many books in my backpack. This is bliss for an inveterate ‘reading many books in parallel’ aficionado.