Quantum Forest

notes in a shoebox


Why you shouldn’t entrust your pet to Glenstar Kennels

Travel is part of life and, if you have pets, finding appropriate boarding for them is a must. This is my explanation of why you should not entrust your dog to Glenstar Kennels, in Canterbury, New Zealand.

At the end of 2016 I had work approval for a two-month trip overseas. Normally I would book accommodation for my dog at the SPCA boarding kennels (as when we had two months of repairs to our house following Christchurch’s earthquake). However, as this trip included Christmas/New Year, it was impossible to find a vacancy there. I was happy to find a spot for my dog at Glenstar Kennels spanning the whole end-of-year period.

Sadly, after 2 months travelling I had an unpleasant surprise when I went to pick up my dog. He was almost 5 kg overweight, which for a 27 kg dog is a weight gain of nearly 20% in 2 months. As an illustration, imagine weighing 75 kg and gaining 15 kg in only 2 months.

I immediately wrote to the owner of Glenstar Kennels, who stated that “Whilst I do agree he has put on weight had he fed a cheap food and lost weight in the kennels I feel you would be more upset”. Well, a dog becoming overweight is not any better than a dog losing weight! Both situations lead to reduced animal lifespan. While I believe the quality of the food provided was probably appropriate, the combination of physical activity and food quantity was clearly inappropriate.

The New Zealand Animal Welfare Act 1999 and the Code of Welfare for Dogs 2010 (administered by the Ministry for Primary Industries) establish a series of minimum standards for dog care:

  1. Dogs need a balanced daily diet in quantities that meet their requirements for health and welfare and to maintain their ideal bodyweight.
  2. The amount of food offered needs to be increased if a dog is losing condition, or decreased if it is becoming overweight.
  3. The code of welfare for dogs applies in all situations, including temporary housing such as shelters, doggy daycares or day boarding facilities, and kennels

According to the schedule sent to me by Glenstar Kennels, there were 12 hours of contact a day when someone had access to my dog and 65 days to figure out that he was becoming overweight and take appropriate action. The photo below, taken on 12th February (after he had gained 5 kg), shows my dog’s increased girth when I tried to fit his harness, still adjusted to his size when I dropped him off at Glenstar’s facility on 9th December. There is a dramatic difference, well explained by the term negligence.

Poor doggio showing his change of girth after two months in Glenstar Kennels.


I am very unhappy with the level of care provided by Glenstar Kennels and the lack of a satisfactory reply to my written complaints. After 2 months away I had to take my dog to his usual veterinarian to discuss his excess weight, with the associated cost, so I could bring him back to good health. Dog and I have been doing our usual daily walks (as we did before the trip), and I have been very careful about his nutrition, so he can slowly go back to his optimal 27 kg.

Unfortunately, there is no compulsory regulatory body for dog boarding kennels that could enforce the Code of Welfare for Dogs. However, I feel that I have to write this review and make Glenstar Kennels’ negligence public, so other dog owners (and potential customers) are aware of their extremely poor service.

P.S. The owners of Glenstar Kennels also have another company: Star Pet Travel for pet relocations. They use Glenstar Kennels for their temporary accommodation. I wouldn’t use them either.

Research funding

I have postponed this post many times, so here it goes in its current state: incomplete, partially digested, as a way to start a conversation.

As a researcher, it suits me to say that every country should invest (or should that be spend?) more resources in research. The larger the budget, the more likely I am to get a slice. However, when I look at Chile from a distance, something bothers me about the logic of greater investment in research.

The story goes more or less like this:

  1. Chile devotes a very small proportion of its budget to scientific research.
  2. Developed countries invest much more in research.
  3. Therefore, if Chile wants to be developed it has to invest more in research.
  4. How can Chile take part in the knowledge economy with so little research?

At a superficial level the story seems to make sense. But when I stop to think, “the story doesn’t convince me, it only makes me choke” (as Fulano would say).

The first two points are completely true: Chile devotes a small (some would say minuscule) proportion of the national budget to scientific research. Developed countries invest many times more as a proportion of their gross domestic product (and orders of magnitude more in absolute terms) in research (data here). However, and this however is very relevant, there is an assumption of causality at the heart of my “but”. The assumption is that countries are more developed because they invested more in research.

Investment in research and development as a percentage of gross domestic product for OECD countries. The relationship between investment and economic performance is not as direct as is assumed.

That is where I start to have doubts. Developed countries invest much more in art, but are they more developed because they spend more money on ballet (or theatre, or films, or statues)? Perhaps richer countries can afford the luxury of investing more in culture (and science is a cultural expression), and that investment may have positive effects on the economy (or it may not). For example, Japan invests a larger percentage than Germany in research, and the German economy has grown much more than the Japanese one. Of course, you could comment, the economic and cultural context is different. Why would you expect a direct relationship? Well, that is my point.

Perhaps a better question is: why should scientific research be evaluated as a path to development? We researchers are partly to blame. At some point the budget had to be justified and someone said “but it is an investment”, and we have kept repeating it ever since.

Privilege

Researchers are a privileged group: we have had the best education available; we are the apex of the educational system. To some extent, our demands for more funding represent an extension of that privilege, while most of the population receives an education that condemns them to minimum-wage jobs.

If one were in charge of developing public policy, which investments would maximize the benefit to society? Perhaps investing in good-quality education for the majority of the population, and improving health and nutrition for the least privileged sectors, would have a higher social return. Something like the efforts to reduce infant mortality (data here; by the way, the countries with the highest mortality in the graph are Mexico and Turkey).

One of the greatest achievements in Chile: the reduction of infant mortality. The other lines represent OECD countries.

And the knowledge economy?

Think of Google, Facebook, Uber, Airbnb, …, Apple. How much science is there in these companies? How much technology and engineering? A simple bet: there is much more technology, engineering and entrepreneurship than science.

Monsanto, 23andMe, Syngenta. A similar story.

I don’t think we lack science, but there is a shortage of technical/scientific entrepreneurship. There are too many PhDs and too few master’s graduates who combine scientific understanding with commercialization. This situation is not unique to Chile: there is a surplus of PhDs in much of the planet. In many areas education works like a pyramid scheme: many more students graduate with postgraduate degrees than there are positions available in universities and research institutes.

There is a World Economic Forum publication that tries to measure which countries are the most creative. For that purpose it uses an index with three factors: technology (investment in research and development, patents per capita), talent (percentage of adults with tertiary education and workers in creative activities) and tolerance (treatment of immigrants, ethnic minorities and sexual minorities). It is possible to have a high creativity index without a particularly high value for some factor, and creativity is positively correlated with economic performance. Chile certainly has to work on talent and tolerance, which involve larger sectors of the population; that is, they are essentially more democratic indicators.

Are you saying that research should not be funded?

When trying to create funding opportunities it is easy to end up selling the idea of a close connection between science and development, but how many researchers work with that in mind? If one is honest, practically nobody studies “the fifth leg of the cat” (hair-splitting detail) in a scientific topic in order to develop the country. One does it out of interest in the challenge, out of wanting to understand and explain.

To make it quite clear: I am not saying stop funding scientific research. What I am saying is that the reasons commonly presented by people lobbying the government have a tenuous causal relationship with the country’s economic development. Research deserves to be funded as an expression of the country’s culture, just as theatre, music, etc. are.

P.S. Most of my research is connected to applications of breeding, statistics and wood science in the forest industry (although on occasion I have worked on the genetics of squid personality, reproductive performance in flies and a few other oddities).

P.S.2. For a diametrically opposed view, see Science is vital if Britain is to prosper, published by coincidence on the same day as this post.

P.S.3. The article Basic research as a political symbol (PDF) by Roger Pielke Jr. discusses the evolution of the concept of basic research. Matt Ridley also describes an interesting point of view, suggesting that most research could be funded by the private sector.

Back of the envelope look at school decile changes

Currently there is some discussion in New Zealand about the effect of the reclassification of schools in socioeconomic deciles. An interesting aspect of the funding system in New Zealand is that state and state-integrated schools with poorer families receive substantially more funding from the government than schools that receive students from richer families (see this page in the Ministry of Education’s website).

One thing that I hadn’t noticed before is that funding decisions are more granular than simply using deciles, as deciles 1 to 4 are split into 3 steps each. For example, for Targeted Funding for Educational Achievement in 2015 we get the following amounts per student by decile: 1 (A: $905.81, B: $842.11, C: $731.30), 2 (D: $617.80, E: $507.01, F: $420.54), 3 (G: $350.25, H: $277.32, I: $220.59), 4 (J: $182.74, K: $149.99, L: $135.12), 5 ($115.76), 6 ($93.71), 7 ($71.64), 8 ($46.86), 9 ($28.93) and 10 ($0).

The Ministry of Education states that 784 schools ‘have moved to a higher decile rating’ while 800 ‘have moved to a lower decile rating’ (800 didn’t move). They do not mean that those numbers of schools changed deciles; that information also includes changes of steps within deciles. Another issue is that moving one step at the bottom of the scale (e.g. ~$64 from 1A to 1B) is not the same as moving one step at the top (~$29 from 9 to 10); that is, the relationship between steps and funding is not linear.

I assume that the baseline to measure funding changes is how much a school would get per student in 2015 without any change of decile/step; that is, funding assuming that the previous step within decile had stayed constant. Then we can calculate how much a school will get per student with its new decile/step. I have limited this ‘back of the envelope’ calculation to Targeted Funding for Educational Achievement, which is not the only source of funding linked to deciles. There are others, like the Special Education Grant and the Careers Information Grant, but they have much smaller magnitude (maximum $73.94 and $37.31 per student) and the maximum differences between deciles 1 and 10 are 2:1.

Steps are in capital letters and need to be translated into money. Once we get that we can calculate differences at both student level and school level:
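The translation from steps to money can be sketched with a named vector used as a lookup table. This is only an illustration: the dollar amounts are the 2015 figures quoted earlier, the letters M to R for deciles 5 to 10 are an assumed continuation of the A to L labels, and the three schools (with their column names) are made up.

```r
# Dollars per student (2015 Targeted Funding for Educational Achievement),
# from the amounts quoted in this post. Letters M to R for deciles 5 to 10
# are an ASSUMED continuation of the A to L step labels.
tfea <- c(A = 905.81, B = 842.11, C = 731.30,
          D = 617.80, E = 507.01, F = 420.54,
          G = 350.25, H = 277.32, I = 220.59,
          J = 182.74, K = 149.99, L = 135.12,
          M = 115.76, N = 93.71,  O = 71.64,
          P = 46.86,  Q = 28.93,  R = 0)

# Hypothetical toy data: three schools with old and new steps and their rolls
schools <- data.frame(
  school   = c("School 1", "School 2", "School 3"),
  step2014 = c("A", "Q", "F"),
  step2015 = c("B", "R", "F"),
  roll     = c(50, 400, 120),
  stringsAsFactors = FALSE
)

# Translate steps into money by indexing the lookup vector with the letters
schools$per.student.2014 <- unname(tfea[schools$step2014])
schools$per.student.2015 <- unname(tfea[schools$step2015])

# Differences at student level and at school level
schools$diff.student <- schools$per.student.2015 - schools$per.student.2014
schools$diff.school  <- schools$diff.student * schools$roll

schools[, c("school", "diff.student", "diff.school")]
```

With real data the same two subtractions would be summarised (for example with `summary()` or `quantile()`) rather than printed school by school.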

If we look at the 50% of the schools in the middle of the distribution they had fairly small changes, approximately +/- $22 per student per year or at the school level +/- 3,000 dollars per year.

An interesting, though not entirely surprising, graph is a plot of the changes of funding against the size of the school. Large schools are much more stable in decile/step than small ones.

Change of funding per student per year (NZ$) on size of the school (number of students).


Change of funding per school per year (thousands of NZ$) on school size (number of students).


Overall, there is a small change in the total amount of money for Targeted Funding for Educational Achievement under the reclassified school system versus the old deciles ($125M using 2014 deciles versus $132M using 2015 deciles), and for most schools the changes do not seem dramatic. There are, however, a number of schools (mostly small ones) that have had substantial changes to their funding. Very small schools will tend to display the largest changes, as the arrival or departure of only a few pupils with very different socioeconomic backgrounds would have a substantial effect. An example is Mata School in the Gisborne area, which moved 13 steps in decile funding (from A to N) with a roll of 11 kids. How to maintain a steadier funding regime seems to be a difficult challenge in those cases.

One consequence of the larger variability in small schools is that rural areas will be more affected by larger changes of funding. While overall 34% of the schools had no changes to their decile/step classification, in rural areas that figure drops to 22%; on top of that, the magnitude of the changes for rural schools is also larger.

Footnote:

Data files used for this post: DecileChanges_20142015 and directory-school-current.

Operational school funding is much more complex than deciles, as it includes allocations depending on number of students, use of Maori language, etc.

P.S. Stephen Senn highlights an obvious problem with the language the Ministry uses: there are 9 deciles (the points splitting the distribution into 10 parts). We should be talking about tenths, a much simpler word, instead of deciles.

A couple of thoughts on biotech and food security

“What has {insert biotech here} done for food security?” This question starts at the wrong end of the problem, because food security is much larger than any biotechnology. I would suggest that governance, property rights and education are the fundamental issues for food security, followed by biotechnological options. For example, the best biotechnology is useless if one is trying to do agriculture in a war-ravaged country.

Once we have a relatively stable government and educated people can rely on property rights, the effects of different biotechnologies will be magnified and it will be possible to better assess them. I would say that matching the most appropriate technologies to the local environmental, economic and cultural conditions is a good sign of sustainable agriculture. I would also say that the broader the portfolio of biotechnology and agronomic practices the more likely a good match will be. That is, I would not a priori exclude any biotechnology from the table based on generic considerations.

Should the success of a biotechnology for food security be measured as yield? It could be one of the desired effects but it is not necessarily the most important one. For example, having less fluctuating production (that is, reducing the variance rather than increasing the mean) could be more relevant. Or we could be interested in creating combinations of traits that are difficult to achieve by traditional breeding (e.g. biofortification), where yield is still the same but nutritional content differs. Or we would like to have a reduction of inputs (agrochemicals, for example) while maintaining yield. There are many potential answers and, coming back to matching practices to local requirements, using a simple average of all crops in a country (or a continent) is definitely the wrong scale of assessment. We do not want to work with an average farmer or an average consumer but to target specific needs with the best available practices. Sometimes this will include {insert biotech, agronomical practices here}, other times this will include {insert another biotech and set of agronomical practices here}.

And that is the way I think of improving food security.

Should I reject a manuscript because the analyses weren’t done using open source software?

“Should I reject a manuscript because the analyses weren’t done using open software?” I overheard a couple of young researchers discussing. Initially I thought it was a joke but, to my surprise, it was not funny at all.

There is an unsettling, underlying idea in that question: the value of a scientific work can be reduced to its computability. If I, the reader, cannot replicate the computation the work is of little, if any, value. Even further, my verification has to have no software cost involved, because if that is not the case we are limiting the possibility of computation to only those who can afford it. Therefore, the almost unavoidable conclusion is that we should force the use of open software in science.

What happens if the analyses were run using a point-and-click interface? For example SPSS, JMP, Genstat, Statistica, and a few other programs allow access to fairly complex analytical algorithms via a system of menus and icons. Most of them are neither open source nor generate code for the analyses. Should we ban their use in science? One could argue that if users only spent the time to learn a programming language (e.g. R or Python) they would be free of the limitations of point-and-click. Nevertheless, we would be shifting accessibility from people who can pay for an academic software license to people who can learn, and moderately enjoy, programming. Are we better off as a research community after that shift?

There is another assumption: open software will always provide good (or even appropriate) analytical tools for any problem. I assume that in many cases OSS is good enough and that there is a subset of problems where it is the best option. However, there is another subset where it is suboptimal. For example, I deal a lot with linear mixed models used in quantitative genetics, an area where R is seriously deficient. In fact, I would have to ignore the last 15 years of statistical development to run large problems. Given that some of the data sets are worth millions of dollars and decades of work, should I sacrifice the use of the best models so a hypothetical someone, somewhere can actually run my code without paying for an academic software license? This was a rhetorical question, by the way, as I would not do it.

There are trade-offs and unintended consequences in all research policies. This is one case where I think the negative effects would outweigh the benefits.

Gratuitous picture: I smiled when I saw the sign with the rightful place for forestry (Photo: Luis).


P.S. 2013-12-20 16:13 NZST Timothée Poisot provides some counterarguments for a subset of articles: papers about software.

Protectionism under another name

This morning Radio New Zealand covered a story (audio) where Tomatoes New Zealand (TNZ, the growers association) was asking the Government to introduce compulsory labeling for irradiated products (namely imported Australian tomatoes), stating that consumers deserve an informed choice (TNZ Press Release). Two points that I think merit attention:

  • Food irradiation is perfectly safe: it does not make food radioactive, it does not alter the nutritional value of food, and it reduces (or completely eliminates) the presence of microorganisms that cause disease or of pests (the latter being the reason for irradiation in this case).
  • The second point is that the call for labeling does not come from consumers (or an organization representing them) but from producers that face competition.

This situation reminded me of Milton Friedman talking about professional licenses: The justification offered is always the same: to protect the consumer. However, the reason is demonstrated by observing who lobbies. Who is doing the lobbying is very telling in this case, particularly because there is no real reason to induce fear in the consumer, except that irradiation sounds too close to radioactive, and therefore TNZ is hoping to steer consumers away from imported tomatoes. Given that TNZ is all for informing the consumer, they could label tomatoes with ‘many of the characteristics in this tomato are the product of mutations’. Harmless but scary.

P.S. The media is now picking up the story. Shameful manipulation.
P.S.2 Given that TNZ is in favor of the right to know, I want a list of all chemicals used in the production of New Zealand tomatoes, how good their water management is and the employment practices of tomato growers.

Scraping pages and downloading files using R

I have written a few posts discussing descriptive analyses of the evaluation of National Standards for New Zealand primary schools. The data for roughly half of the schools was made available by the media, but the full version of the dataset is provided on a school-by-school basis. The page for a given school may include a link to a PDF file with the information on standards sent by the school to the Ministry of Education.

I’d like to keep a copy of the PDF reports for all the schools for which I do not have performance information, so I decided to write an R script to download just over 1,000 PDF files. Once I have identified all the schools with missing information I just loop over the list, using the fact that all URLs for the school pages start with the same prefix. I download the page, look for the name of the PDF file and then download the PDF file, which is named school_schoolnumber.pdf. And that’s it.
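The loop described above could look roughly like the sketch below, written with base R only. It is not the original script: the URL prefix and school numbers are placeholders, the helper names `find_pdf_link()` and `download_reports()` are made up, and the code assumes the page contains an absolute link ending in `.pdf`.

```r
# HYPOTHETICAL URL prefix and school numbers: replace with the real ones
prefix <- "https://example.org/school?number="
school.numbers <- c(1001, 1002)

# Return the first link ending in .pdf found in the lines of an HTML page,
# or NA if there is none
find_pdf_link <- function(html) {
  hits <- regmatches(html, regexpr('href="[^"]+\\.pdf"', html))
  if (length(hits) == 0) return(NA_character_)
  sub('^href="', "", sub('"$', "", hits[1]))
}

# Loop over schools: read the page, look for the PDF, save it as
# school_<number>.pdf. Assumes the link is an absolute URL.
download_reports <- function(numbers, prefix, dest.dir = ".") {
  for (n in numbers) {
    page <- tryCatch(readLines(paste0(prefix, n), warn = FALSE),
                     error = function(e) character(0))
    pdf.url <- find_pdf_link(page)
    if (is.na(pdf.url)) next   # no report available for this school
    download.file(pdf.url,
                  destfile = file.path(dest.dir, paste0("school_", n, ".pdf")),
                  mode = "wb")
  }
}

# download_reports(school.numbers, prefix)   # uncomment to run for real
```

Using `mode = "wb"` matters on Windows, where PDF files downloaded in text mode end up corrupted.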

Of course life would be a lot simpler if the Ministry of Education made the information available in a usable form for analysis.

Can you help?

It would be great if you could help me to get the information from the reports. The following link randomly chooses a school; click on the “National Standards” tab and open the PDF file.

Then type the achievement numbers for reading, writing and mathematics in this Google Spreadsheet. No need to worry about different values per sex or ethnicity; the total values will do.

Gratuitous picture: a simple summer lunch (Photo: Luis).

A word of caution: the sample may have an effect

This week I’ve tried to i- stay mostly in the descriptive statistics realm and ii- surround any simple(istic) models with caveats, pointing out that they are very preliminary. We are working with a sample of ~1,000 schools that did reply to Fairfax’s request, while there is a number of schools that either ignored the request or told Fairfax to go and F themselves. Why am I saying this? If one gets a simple table of the number of schools by type and decile there is something quite interesting: different types of schools are represented in the sample in different percentages, and there is the possibility of bias in the reporting to Fairfax due to potential low performance (references to datasets correspond to the ones I used in this post):
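A table of schools by type and decile can be obtained with `xtabs()`. The sketch below uses a tiny made-up sample; the data frame and its column names (`school.type`, `decile`) are assumptions, not the names in the original dataset.

```r
# HYPOTHETICAL toy sample of schools that replied; the real one has ~1,000 rows
sample.schools <- data.frame(
  school.type = c("Full Primary", "Full Primary", "Contributing",
                  "Contributing", "Intermediate", "Composite"),
  decile = c(1, 5, 5, 10, 5, 1)
)

# Number of schools by type (rows) and decile (columns)
sample.counts <- xtabs(~ school.type + decile, data = sample.schools)
sample.counts
```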

Now let’s compare this number with the school directory:
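One way to sketch that comparison is to flag, for every school in the directory, whether it appears in the sample, and then average the flag by school type. The directory below is made up, as are the column names `school.type` and `in.sample`.

```r
# HYPOTHETICAL toy directory of all schools, flagged by presence in the sample
directory <- data.frame(
  school.type = c("Full Primary", "Full Primary", "Full Primary",
                  "Contributing", "Contributing",
                  "Intermediate", "Intermediate",
                  "Composite", "Composite", "Composite"),
  in.sample = c(TRUE, TRUE, FALSE,
                TRUE, TRUE,
                TRUE, FALSE,
                TRUE, FALSE, FALSE)
)

# Proportion of directory schools of each type that appear in the sample
coverage <- tapply(directory$in.sample, directory$school.type, mean)
round(coverage, 2)
```

Types with low coverage are the ones where any conclusion from the sample deserves the most caution.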

As a proportion we are missing more secondary schools. We can use the following code to get an idea of how similar school types are, because the small number of different composite schools is a pain.

Representation of different school types and deciles is uneven.

Different participation in the sample by school type; the response plotted is performance in mathematics.


I’m using jittering rather than box and whisker plots to i- depict all the schools and ii- get an idea of the different participation of school types in the dataset. Sigh. Another caveat to add in the discussion.

P.S. 2012-09-27 16:15. Originally I mentioned in this post the lack of secondary schools (Year 9-15) but, well, they are not supposed to be here, because National Standards apply to years 1 to 8 (Thanks to Michael MacAskill for pointing out my error.)

Some regressions on school data

Eric and I have been exchanging emails about potential analyses for the school data and he published a first draft model in Offsetting Behaviour. I have kept on doing mostly data exploration while we wait for a definitive full dataset, and looking at some of the pictures I thought we could present a model with fewer predictors.

The starting point is the standards dataset I created in the previous post:

There seems to be a different trend for secondary vs non-secondary schools in the relationship between the number of full-time teacher equivalents (FTTE) and total roll. The presence of a small number of large schools suggests that log transforming the variables could be a good idea.

Difference on the number of students per FTTE between secondary and non-secondary schools.

Now we fit a model where we are trying to predict reading standards achievement per school accounting for decile, authority, proportion of non-European students, secondary schools versus the rest, and a different effect of number of students per FTTE for secondary and non-secondary schools.
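A model with that structure can be sketched with `lm()`. The data below are simulated and every column name except `reading.OK` is an assumption; the log transformation of students per FTTE and the weighting by roll are also choices made for this illustration, not necessarily those of the original fit.

```r
set.seed(42)
n <- 200
# Simulated stand-in for the standards data set; column names are ASSUMED
standards <- data.frame(
  decile           = sample(1:10, n, replace = TRUE),
  authority        = factor(sample(c("State", "State integrated"), n,
                                   replace = TRUE)),
  prop.non.euro    = runif(n),
  secondary        = factor(sample(c("No", "Yes"), n, replace = TRUE,
                                   prob = c(0.8, 0.2))),
  total.roll       = round(runif(n, 20, 800)),
  students.per.fte = runif(n, 12, 25)
)
standards$reading.OK <- with(standards,
  pmin(1, pmax(0, 0.5 + 0.03 * decile - 0.1 * prop.non.euro +
                  rnorm(n, sd = 0.08))))

# Main effects for decile, authority, proportion of non-European students and
# secondary status, plus a separate students-per-FTTE slope for secondary and
# non-secondary schools (the secondary:log(...) term)
m1 <- lm(reading.OK ~ factor(decile) + authority + prop.non.euro +
           secondary + secondary:log(students.per.fte),
         data = standards, weights = total.roll)
summary(m1)$r.squared
```

Writing the slope as `secondary:log(students.per.fte)` without a main effect for the covariate gives one slope per group directly, which makes the two coefficients easy to compare.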

The residuals are still a bit of a mess:

Residuals for this linear model: still a bit of a mess.

If we remember my previous post, decile accounted for 45% of the variation, and we explain 4% more through the additional predictors. Non-integrated schools have lower performance, a higher proportion of non-European students reduces performance, secondary schools have lower performance and larger classes tend to perform better (Eric suggests reverse causality, I’m agnostic at this stage), although the rate of improvement changes between secondary and non-secondary schools. In contrast with Eric, I didn’t fit separate ethnicities as those predictors are related to each other and constrained to add up to one.

Of course this model is very preliminary, and a quick look at the coefficients will show that changes on any predictors besides decile will move the response by a very small amount (despite the tiny p-values and numerous stars next to them). The distribution of residuals is still heavy-tailed and there are plenty of questions about data quality; I’ll quote Eric here:

But differences in performance among schools of the same decile by definition have to be about something other than decile. I can’t tell from this data whether it’s differences in stat-juking, differences in unobserved characteristics of entering students, differences in school pedagogy, or something else. But there’s something here that bears explaining.

Updating and expanding New Zealand school data

In two previous posts I put together a data set and presented some exploratory data analysis on school achievement for national standards. After those posts I exchanged emails with a few people about the sources of data and Jeremy Greenbrook-Held pointed out Education Counts as a good source of additional variables, including number of teachers per school and proportions for different ethnic groups.

The code below calls three files: Directory-Schools-Current.csv, teacher-numbers.csv and SchoolReport_data_distributable.csv, which you can download from the links.
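Combining three files like these usually comes down to two calls to `merge()` on a school identifier. The sketch below uses small made-up data frames in place of `read.csv()` on the real files; the key `school.id` and all other column names are assumptions.

```r
# Toy stand-ins for the three files; in practice each would come from
# read.csv(). Column names, including the key school.id, are ASSUMED.
directory <- data.frame(school.id = c(1, 2, 3), decile = c(1, 5, 10))
teachers  <- data.frame(school.id = c(1, 2, 3), teachers.fte = c(4.5, 12, 30.2))
achieve   <- data.frame(school.id = c(1, 3), reading.OK = c(0.62, 0.91))

# Keep the schools with achievement data and add directory and teacher variables
standards <- merge(achieve, directory, by = "school.id", all.x = TRUE)
standards <- merge(standards, teachers, by = "school.id", all.x = TRUE)
standards
```

`all.x = TRUE` keeps every school with achievement data even if it is missing from one of the other files, which makes gaps visible as `NA` instead of silently dropping rows.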

This updated data set is more comprehensive but it doesn’t change the general picture presented in my previous post beyond the headlines. Now we can get some cool graphs to point out the obvious, for example the large proportion of Maori and Pacific Island students in low decile schools:

Proportion of Pacific Island (vertical axis) and Maori students (horizontal axis) in schools with points colored by decile. Higher proportions for both are observed in low decile schools.

I have avoided ‘proper’ statistical modeling because i- there is substantial uncertainty in the data and ii- the national standards for all schools (as opposed to only 1,000 schools) will be released soon; we don’t know if the published data are a random sample. In any case, a quick linear model fitting the proportion of students that meet reading standards (reading.OK) as a function of decile, weighted by total school roll to account for the varying school sizes, will explain roughly 45% of the observed variability in reading achievement.
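The quick weighted model can be sketched as follows. The data are simulated, so the R² printed here will not be the 45% mentioned above; `reading.OK` is the name used in this post, while `decile` and `total.roll` are assumed column names.

```r
set.seed(1)
n <- 300
# Simulated stand-in for the standards data; decile and total.roll are
# ASSUMED column names
standards <- data.frame(decile     = sample(1:10, n, replace = TRUE),
                        total.roll = round(runif(n, 10, 900)))
standards$reading.OK <- pmin(1, pmax(0,
  0.55 + 0.03 * standards$decile + rnorm(n, sd = 0.1)))

# Decile as a factor, each school weighted by its total roll
m0 <- lm(reading.OK ~ factor(decile), data = standards, weights = total.roll)
summary(m0)$r.squared   # proportion of weighted variability explained
```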

Model fit has a few issues with the distribution of residuals; we should probably use a power transformation for the response variable, but I wouldn’t spend much more time before getting the full data for national standards.

Residuals of a quick weighted linear model. The residuals show some heterogeneity of variance (top-left) and deviation from normality (top-right) with heavy tails.

Bonus plot: map of New Zealand based on school locations, colors depicting proportion of students meeting reading national standards.

[sourcecode lang="R"]
library(ggplot2)
# School coordinates, colored by proportion meeting reading standards
qplot(longitude, latitude,
      data = standards, color = reading.OK)
[/sourcecode]

New Zealand drawn using school locations; color coding is for proportion of students meeting reading national standards.

P.S. 2012-09-26 16:01. The simple model above could be fitted taking into account the order of the decile factor (using ordered()) or just fitting linear and quadratic terms for a numeric expression of decile. Anyway, that would account for 45% of the observed variability.
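Both alternatives can be sketched on simulated data (again with assumed column names `decile` and `total.roll`, so the R² values are only illustrative):

```r
set.seed(2)
n <- 300
# Simulated stand-in for the standards data; column names are ASSUMED
standards <- data.frame(decile     = sample(1:10, n, replace = TRUE),
                        total.roll = round(runif(n, 10, 900)))
standards$reading.OK <- pmin(1, pmax(0,
  0.55 + 0.03 * standards$decile + rnorm(n, sd = 0.1)))

# Option 1: decile as an ordered factor (polynomial contrasts by default)
m.ord <- lm(reading.OK ~ ordered(decile), data = standards,
            weights = total.roll)
# Option 2: linear and quadratic terms for a numeric decile
m.quad <- lm(reading.OK ~ decile + I(decile^2), data = standards,
             weights = total.roll)

c(ordered   = summary(m.ord)$r.squared,
  quadratic = summary(m.quad)$r.squared)
```

The ordered factor uses nine degrees of freedom for decile and fits the same means as an unordered factor; the quadratic version uses two, so its R² can only be equal or lower.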

P.S. 2012-09-26 18:17. Eric Crampton has posted preliminary analyses based on this dataset in Offsetting Behaviour.


© 2017 Quantum Forest
