In our research group we often have people creating statistical models that end up in publications but, most of the time, the practical implementation of those models is lacking. I mean, we have a bunch of barely functioning code that is very difficult to use in a reliable way in operations of the breeding programs.… Continue reading Implementing a model as an R package
Reading a folder with many small files
One of the tools we use in our research is NIR (Near-Infrared Spectroscopy), which we apply to thousands of samples to predict their chemical composition. Each NIR spectrum is contained in a CSV text file with two numerical columns: wavelength and reflectance. All files have the same number of rows (1296 in our case), which… Continue reading Reading a folder with many small files
From character to numeric pedigrees
In quantitative genetic analyses we often use a pedigree to represent the relatedness between individuals, so this is accounted in the analyses, because the observations are not independent of each other. Often this pedigree contains alphanumeric labels, and most software can cope with that. Sometimes, though, we want to use numeric identities because we would… Continue reading From character to numeric pedigrees
Being data curious: the strange case of lamb consumption in NZ
There is a lot of talk about the skills needed for working in Statistics/Data Science, with the discussion often focusing on theoretical understanding, programming languages, exploratory data analysis, and visualization. There are many good blog posts dealing with how you get data, process it with your favorite language and then creating some good-looking plots. However,… Continue reading Being data curious: the strange case of lamb consumption in NZ
Reducing friction in R to avoid Excel
When you have students working in a project there is always an element of quality control. Some times the results just make sense, while others we are suspicious about something going wrong. This means going back to check the whole analysis process: can we retrace all the steps in a calculation (going back to data… Continue reading Reducing friction in R to avoid Excel