Evolving notes, images and sounds by Luis Apiolaza


Python not suitable platform for reproducible research

While [Active Papers] has achieved its mission of demonstrating that unifying computational reproducibility and provenance tracking is doable and useful, it has also demonstrated that Python is not a suitable platform to build on for reproducible research. Breaking changes at all layers of the software stack are too frequent.

Konrad Hinsen in Archiving Active Papers

I started using Python for my PhD around 1997, to control simulations I wrote using Fortran 90. I chose Python based on Konrad Hinsen’s writings at the time on a long-disappeared website. A few years later I moved all my work to R, which I found much more stable. I have some 20-year-old base R code that still runs. 😇

Incidentally, last year I wrote a series of posts on Some love for base R.

Why did my breeding values go down?

At first, the question may sound strange. You have been collecting data, running analyses using various acronyms (PBLUP, GBLUP, HBLUP, …), in a univariate/multivariate fashion, using ad-hoc or commercial software (asreml-R, Bolt, …), generating a long list of numbers sorted from highest to lowest. 

The list was uploaded to a website (or printed in a catalogue), but YOU are now in a meeting talking with industry producers and someone is reading the list, looking for the genotypes (families, clones, varieties) they used before. Unsurprisingly to you, but annoyingly to them, the breeding values of their favourite genotypes are lower than in a previous year.

—Why did the breeding values of my genotype go down?

—Well… we have more data and new genotypes in the breeding programme.

—So my genotypes are worse than before now?

—No, they are as good as before.

—But the values are lower —pointing at the list.

Compared to the average

And things don’t get much better from that point onwards.

💡 Lightbulb moment

You ask some people from the audience to come to the front, and ask them to line up from tallest to shortest until everyone is happy with their position. Then point out where the average would be; some people will have positive deviations from that average, others negative ones.

Now get two more people, hopefully much taller than the ones already standing in front of the audience. Ask them to join the “ranking”. Point out that 1) there is a new average, 2) the individual deviations changed from the previous height ranking, and 3) some positive deviations are now negative.

By now, most people have seen the change of ranking happening right there. They have seen that the intrinsic value of people’s height did not change. Instead, there are a few taller people in the group, changing the average. This is what should happen in a breeding programme that is working well: better new genotypes keep raising the average.

—And this is why your favourite genotype’s breeding values are lower this year.
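
The height exercise is easy to check with a few lines of code. This is only a minimal sketch: the names and heights are made up for illustration, and simple deviations from the group average stand in (loosely) for breeding values, which in practice come from much more involved models.

```python
from statistics import mean


def deviations(heights):
    """Return each person's height minus the group average."""
    average = mean(heights.values())
    return {name: height - average for name, height in heights.items()}


# Hypothetical heights in cm, invented for this example.
group = {"Ana": 160, "Ben": 170, "Caro": 174}
before = deviations(group)

# Two taller people join the line; nobody's actual height changes.
bigger_group = {**group, "Dan": 190, "Eli": 196}
after = deviations(bigger_group)

for name in group:
    print(f"{name}: {before[name]:+.0f} cm before, {after[name]:+.0f} cm after")
```

With these numbers the average moves from 168 cm to 178 cm, so Ben and Caro go from positive to negative deviations even though their heights, and their order relative to each other, stay exactly the same.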

#breeding #treebreeding #quantitativegenetics

I am a sucker… for interesting problems

As a child and as a young person I moved a lot (schools, cities, countries). On one side it gave me plenty of interesting experiences; on the other, it left me with a weird sense of belonging and exposed me to racism. Later, I started moving a lot less, to the point that I’ve been working for the same university for almost 18 years.

And I have been working in quantitative genetics and breeding for t-h-i-r-t-y y-e-a-r-s.

After a while one starts thinking “and then, what?” This is where getting interested in the weird, the margins, the unusual comes in handy. And I don’t mean things like playing a musical instrument (although I have great respect for the ukulele) or doing sports (if you are keen on underwater hockey). That’s personal time.

I mean at work. For me it could be doing something completely outside genetics/breeding or, perhaps, working with unusual organisms. This type of problem involves applying what we know to a different context, while learning from someone else with a vastly different background.

My favourite personal example is when I was collaborating with a PhD student (now Dr David Sinn) who was researching squid personality. I had absolutely no idea about squid; now I know they are very smart and cute. I had never considered that squid had personality either.

It was a good distraction: it moved me away from routine and I learnt something new. Once in a while, I am a sucker for interesting problems, because they are really fun (and you may end up with a cool publication).

Sinn, D.L., Apiolaza, L.A. and Moltschaniwskyj, N.A. 2006. Genetic analysis and reproductive consequences of squid personality traits. Journal of Evolutionary Biology 19(5): 1437-1447. https://doi.org/10.1111/j.1420-9101.2006.01136.x

Photo: Julian Finn / Museum Victoria

Big blob thesis OR chapter = publication?

I have a PhD student who should be submitting around Christmas time, so the last week has involved a lot of reading and editing. Three chapters, which will eventually become articles, are now “thesis ready”, meaning they may need minor polish before journal submission but they are ready for external thesis evaluation.

As a supervisor, I have a strong preference for a thesis to be made out of a number of publications (or pieces that will turn into publications) over the long document, big blob approach. Some reasons:

  • Once the thesis goes for external review, some of the chapters will have already undergone peer review. It’s hard to fault a chapter that was already published. 😉
  • Once the students finish, they already have some publications under their belt. They have gone through the process, and I insist they do, so it is less mysterious.
  • We don’t have to chase students after they have finished just to write publications. It has happened that they get a job and publications become a distant priority.

There are some drawbacks too:

  • There is some level of repetition; often there are variations of similar introductions (although it depends on the structure of the chapters).
  • There is a need for writing a general introduction and general conclusions (albeit often short).
  • Completing each part requires more work, because students are targeting a higher standard of writing (publication rather than thesis).

I am sure there will be vastly different experiences on this topic.

Are we working with a model organism?

There are organisms that are highly popular in research, like fruit flies or mice among animals, and Arabidopsis or poplars among plants. There are very good reasons to work with those species (model organisms), as there are very good reasons not to work with them.

If you work in primary production (cereals, veggies, fruits, animals, or trees as I do), in essence feeding the world and providing biomaterials, you tend to think of yourself as not working with model organisms. However, once we have been working in a breeding programme for a while, we start accumulating measurements of a very broad set of traits under wide-ranging environmental conditions. Not only that, but then we start using some of that same genetic material in cultivation/management trials.

Progressively we end up managing enough information that some of the genotypes in our breeding programme start acting/feeling like model organisms. So, yes, Pinus radiata (radiata pine, Monterey pine) is my model organism.


© 2024 Palimpsest
