In our research group, people often create statistical models that end up in publications but, most of the time, the practical implementation of those models is lacking. That is, we have a bunch of barely functioning code that is very difficult to use reliably in the operations of the breeding programs.… Continue reading Implementing a model as an R package

# Category: stats

## Being data curious: the strange case of lamb consumption in NZ

There is a lot of talk about the skills needed for working in Statistics/Data Science, with the discussion often focusing on theoretical understanding, programming languages, exploratory data analysis, and visualization. There are many good blog posts dealing with how you get data, process it with your favorite language, and then create some good-looking plots. However,… Continue reading Being data curious: the strange case of lamb consumption in NZ

## Functions with multiple results in tidyverse

I have continued playing with the tidyverse for different parts of a couple of projects. Often I need to apply a function by groups of observations; sometimes, that function returns more than a single number. For example, it could fit a distribution to each group and return the distribution parameters. Or, simpler for the… Continue reading Functions with multiple results in tidyverse
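As a rough illustration of the idea (the excerpt is cut off before showing the author's own approach), here is a minimal sketch using `dplyr::group_modify()`. The data, the column names `group` and `value`, and the helper `normal_params()` are all hypothetical, and the sample mean and standard deviation stand in for "fitting a distribution".

```r
library(dplyr)

# Simulated data for illustration only: three groups with different normals
set.seed(1)
dat <- data.frame(
  group = rep(c("a", "b", "c"), each = 50),
  value = c(rnorm(50, 10, 2), rnorm(50, 20, 5), rnorm(50, 0, 1))
)

# Hypothetical helper: returns two numbers per group as a one-row data frame
normal_params <- function(x) {
  data.frame(mu = mean(x), sigma = sd(x))
}

# group_modify() calls the function once per group; .x holds that group's rows
params <- dat %>%
  group_by(group) %>%
  group_modify(~ normal_params(.x$value)) %>%
  ungroup()

print(params)
```

Returning a data frame rather than a bare vector keeps each result in its own named column, so the output is one row per group with one column per parameter.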

## Turtles all the way down

One of the main uses for R is for exploration and learning. Let's say that I wanted to learn simple linear regression (the bread and butter of statistics) and see how the formulas work. I could simulate a simple example and fit the regression with R:

```r
library(arm)  # For display()

# Simulate 5 observations
set.seed(50)
x <- 1:5
y <- 2 + 3*x + rnorm(5, mean = 0, sd = 3)

# Collect the variables in a data frame so that lm() can find `dat`
dat <- data.frame(x = x, y = y)

# Fit regression
reg <- lm(y ~ x, data = dat)
display(reg)

# lm(formula = y ~ x, data = dat)
#             coef.est coef.se
# (Intercept) 3.99     3.05
# x           2.04     0.92
# ---
# n = 5, k = 2
# residual sd = 2.91, R-Squared = 0.62

# Plot it
plot(y ~ x)
abline(coef(reg))
```

The formulas for the intercept ($b_0$) and… Continue reading Turtles all the way down
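The excerpt stops before the formulas, so what follows is only a sketch using the standard closed-form least-squares estimates, not necessarily the author's notation: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of `x`, and the intercept is `mean(y) - b1 * mean(x)`. The names `b0` and `b1` are chosen here for illustration.

```r
# Same simulated data as in the regression example above
set.seed(50)
x <- 1:5
y <- 2 + 3 * x + rnorm(5, mean = 0, sd = 3)

# Slope: cross-products of deviations over squared deviations of x
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: the fitted line passes through (mean(x), mean(y))
b0 <- mean(y) - b1 * mean(x)

# Compare the hand-computed values with lm()
reg <- lm(y ~ x)
print(c(b0 = b0, b1 = b1))
print(all.equal(unname(coef(reg)), c(b0, b1)))
```

Checking against `coef(reg)` with `all.equal()` allows for floating-point differences between the two ways of computing the same quantities.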

## Cute Gibbs sampling for rounded observations

I was attending a course on Bayesian statistics where this problem showed up: there are a number of individuals, say 12, who take a pass/fail test 15 times. For each individual we have recorded the number of passes, which can go from 0 to 15. Because of confidentiality issues, we are presented with rounded-to-the-closest-multiple-of-3 data… Continue reading Cute Gibbs sampling for rounded observations
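The excerpt does not state the model, so the following is only one possible sketch, under loudly stated assumptions: all individuals share a single pass probability `theta`, the prior on `theta` is Beta(1, 1), and the data are simulated rather than real. The sampler alternates between drawing each unrounded count from the values consistent with its rounded observation and drawing `theta` from its Beta full conditional.

```r
set.seed(42)

n_ind    <- 12   # individuals
n_trials <- 15   # tests per individual

# Simulated (not real) data; true_theta is an assumption used only to generate it
true_theta <- 0.6
y_true <- rbinom(n_ind, n_trials, true_theta)
r <- 3 * round(y_true / 3)   # observed: rounded to the closest multiple of 3

n_iter    <- 2000
theta     <- numeric(n_iter)
theta_cur <- 0.5             # starting value

for (it in seq_len(n_iter)) {
  # Step 1: each latent count lies in {r - 1, r, r + 1}, clipped to 0..15,
  # with probability proportional to the binomial pmf at the current theta
  y <- vapply(r, function(ri) {
    cand <- (ri - 1):(ri + 1)
    cand <- cand[cand >= 0 & cand <= n_trials]
    w <- dbinom(cand, n_trials, theta_cur)
    as.numeric(cand[sample.int(length(cand), 1, prob = w)])
  }, numeric(1))

  # Step 2: with a Beta(1, 1) prior (assumed), theta given y is Beta
  theta_cur <- rbeta(1, 1 + sum(y), 1 + sum(n_trials - y))
  theta[it] <- theta_cur
}

# Posterior summary after discarding the first 500 draws as burn-in
post <- theta[-(1:500)]
print(mean(post))
print(quantile(post, c(0.025, 0.975)))
```

Treating the unrounded counts as extra unknowns is what keeps both steps simple: each full conditional is a distribution that base R can sample from directly.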