A while ago I wanted to run a quantitative genetic analysis where the performance of genotypes in each site was considered as a different trait. If you think about it, with 70 sites and thousands of genotypes one is trying to fit a 70×70 additive genetic covariance matrix, which requires 70*69/2 = 2,415 covariance components. Besides requiring huge amounts of memory and being subject to all sort of estimation problems there were all sort of connectedness issues that precluded the use of Factor Analytic models to model the covariance matrix. The best next thing was to run over 2,000 bivariate analyses to build a large genetic correlation matrix (which has all sort of issues, I know). This meant leaving the computer running for over a week.

In another unrelated post, Kevin asked if I have ever considered using ASReml-R to run in parallel using a computer with multiple cores. Short answer, I haven’t, but there is always a first time.

```
require(asreml) # Multivariate mixed models
require(doMC) # To access multiple cores
registerDoMC() # to start a "parallel backend"
# Read basic density data
setwd('~/Dropbox/quantumforest')
# Phenotypes
dat = read.table('density09.txt', header = TRUE)
dat$FCLN = factor(dat$FCLN)
dat$MCLN = factor(dat$MCLN)
dat$REP = factor(dat$REP)
dat$SETS = factor(dat$SETS)
dat$FAM = factor(dat$FAM)
dat$CP = factor(dat$CP)
dat$Tree_id = factor(dat$Tree_id)
head(dat)
# Pedigree
ped = read.table('pedigree.txt', header = TRUE)
names(ped)[1] = 'Tree_id'
ped$Tree_id = factor(ped$Tree_id)
ped$Mother = factor(ped$Mother)
ped$Father = factor(ped$Father)
# Inverse of the numerator relationship matrix
Ainv = asreml.Ainverse(ped)$ginv
# Wrap call to a generic bivariate analysis
# (This one uses the same model for all sites
# but it doesn't need to be so)
bivar = function(trial1, trial2) {
t2 = dat[dat$Trial_id == trial1 | dat$Trial_id == trial2,]
t2$den1 = ifelse(t2$Trial_id == trial1, t2$DEN, NA)
t2$den2 = ifelse(t2$Trial_id == trial2, t2$DEN, NA)
t2$Trial_id = t2$Trial_id[drop = TRUE]
# Bivariate run
# Fits overall mean, random correlated additive effects with
# heterogeneous variances and diagonal matrix for heterogeneous
# residuals
den.bi = asreml(cbind(den1,den2) ~ trait,
random = ~ ped(Tree_id):corgh(trait),
data = t2,
ginverse = list(Tree_id = Ainv),
rcov = ~ units:diag(trait),
workspace=64e06, maxiter=30)
# Returns only log-Likelihood for this example
return(summary(den.bi)$loglik)
}
# Run the bivariate example in parallel for two pairs of sites
# FR38_1 with FR38_2, FR38_1 with FR38_3
foreach(trial1=rep("FR38_1",2), trial2=c("FR38_2", "FR38_3")) %dopar%
bivar(trial1, trial2)
```

The code runs in my laptop (only two cores) but I still have to test its performance in my desktop (4 cores) and see if it really makes a difference in time versus running the analyses sequentially. Initially my mistake was using the multicore package (`require(multicore)`

) directly, which doesn’t start the ‘parallel backend’ and ran sequentially. Using `require(doMC)`

loads multicore but takes care of starting the ‘parallel backend’.

P.S. Thanks to Etienne Laliberté, who sent me an email in early 2010 pointing out that I had to use doMC. One of the few exceptions in the pile of useless emails I have archived.

P.S.2. If you want to learn more about this type of models I recommend two books: Mrode’s Linear Models for the Prediction of Animal Breeding Values, which covers multivariate evaluation with lots of gory details, and Lynch and Walsh’s Genetics and Analysis of Quantitative Traits, which is the closest thing to the bible in quantitative genetics.

P.S.3. **2012-05-08.** I did run the code in my desktop (iMac with 4 cores), making a couple of small changes in the code to have a fairer comparison between using `%dopar%`

(for parallel code) and `%par%`

(for sequential execution).

- Added
`trace = FALSE`

to the`asreml()`

call, to avoid printing partial iteration details to the screen, which happened when using`%do%`

. - Used 4 pairs of sites, so I could utilize all cores in my desktop.

Execution time for the parallel version—measured with `Sys.time()`

—was, roughly, 1/(number of cores) the time required by the sequential version.