Let’s Find Out – or: the Power of Reference

The core message of a number of books I’ve recently had the great pleasure to read has been fairly simple. Have a look. Check it out. Put your numbers in perspective. In a world awash with statistics and cognitive biases imploring us to cheer mindlessly for our own team, having the skill and wherewithal to step back and carefully ask: “can this really be so?” is golden.

One of the most profound pieces of advice from Hans Rosling, the recently deceased celebrity professor and YouTube phenomenon, for countering misinformation about the state of the world is precisely this: put all numbers in perspective. Never accept unaccompanied numbers – never believe the numerator without checking the denominator. What matters, as Bryan Caplan never ceases to emphasize as the GMU Economics creed, “are statistics, not emotions – and arguments, not stories.”

A statistic may never be left alone, Rosling maintains, but must always be compared to other relevant numbers. What share of its total category does this statistic represent? What was it last year, or 5, 10 or 20 years ago? Is there some self-evident change in associated behavior that is relevant or ought to explain it? A century ago street cars used to kill and injure hundreds of people every year, but since very few American cities make use of street cars today, the casualty count is fortunately much lower. If we keep in mind that miles travelled by cars far outnumber miles travelled by street cars, reporting the number of street car deaths – while probably correct – entirely misses the point when discussing traffic safety. In How Not To Be Wrong, mathematics professor Jordan Ellenberg quipped:

Dividing one number by another is mere computation; knowing what to divide by what is mathematics.

Here’s another example. If I told you about 23 000 individual deaths and spent a brief 10 seconds on each of them, going through the list would take me almost three days. On a personal level like that, 23 000 deaths is an absurd, insane, catastrophe-style event that few people are emotionally equipped to handle – essentially a town the size of my hometown, wiped out in a single year. If I told you those 23 000 deaths were due to antibiotic-resistant diseases in the U.S. last year, the pandemic scenarios working through your mind quickly escalate. That many! Let’s find the nearest bunker!

If I then told you that cancer and heart diseases (each!) claim the lives of about 20x that, the fear of lethal apocalyptic germs consuming the world ought to quickly recede. Oh.
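The arithmetic behind this example is easy to check. The sketch below uses only the figures quoted above (23 000 deaths, 10 seconds each, and a comparison category roughly 20 times larger); the function and variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def reading_time_days(n_items: int, seconds_each: float) -> float:
    """Days needed to spend `seconds_each` on every one of `n_items`."""
    return n_items * seconds_each / SECONDS_PER_DAY


deaths_resistant = 23_000                    # figure quoted in the text
days = reading_time_days(deaths_resistant, 10)

# "About 20x that" for cancer and for heart disease, per the text.
comparison = 20 * deaths_resistant

print(f"Reading the list takes {days:.2f} days")
print(f"Each comparison category: about {comparison:,} deaths")
```

The same number looks enormous on its own and modest next to its comparison category, which is the whole point of asking for the denominator.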

Here’s another example. It is entirely correct to point out that the number of people killed in worldwide airplane accidents in 2018 (556 people) was much higher than the year before (44 people) and the year before that (325 people). Would one be excused for believing that air travel is getting more risky and dangerous? Forbes, for instance, ran a roughly accurate story claiming that airline fatalities increased by 900%.

Not in the slightest. The number of fatalities from air travel has been falling for decades, all while the number of flights and miles travelled has increased enormously, meaning that the per-flight, per-mile or per-passenger risk of death has kept dropping. Not to mention that alternative modes of travel, like driving, are orders of magnitude more dangerous.
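How much a headline percentage depends on the chosen baseline can be seen with the three fatality counts quoted above. This is a minimal sketch using only those counts; the text gives no flight or passenger totals, so the per-flight rate (the number that actually matters) is deliberately not computed here.

```python
def percent_change(old: float, new: float) -> float:
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


# Worldwide airline fatalities as quoted in the text.
fatalities = {2016: 325, 2017: 44, 2018: 556}

vs_2017 = percent_change(fatalities[2017], fatalities[2018])
vs_2016 = percent_change(fatalities[2016], fatalities[2018])

print(f"2018 vs 2017: {vs_2017:+.0f}%")
print(f"2018 vs 2016: {vs_2016:+.0f}%")
```

Measured against an unusually safe year the increase is more than tenfold; measured against the year before that it is far smaller. Neither figure says anything about risk until it is divided by the amount of flying.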

While Rosling teaches us to figure out what the base rate is, i.e. putting our statistic into appropriate perspective, one of Philip Tetlock’s tricks for becoming a ‘Superforecaster’ is to use Bayesian updating of one’s beliefs. This picks up precisely where Rosling’s idea left off. Once we know where to start, we have to amass more information, numbers and observations from other points of view – Bayesian updating is a popular method to incorporate and synthesize new information with the old.
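For readers who have not met it before, Bayesian updating fits in a few lines. The sketch below is a generic illustration of Bayes’ rule, not Tetlock’s own procedure; the starting base rate (10%) and the two likelihoods (0.8 and 0.2) are hypothetical numbers chosen only to show the mechanics.

```python
def bayes_update(prior: float, p_evidence_if_true: float,
                 p_evidence_if_false: float) -> float:
    """Posterior probability of a hypothesis after seeing one piece of evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


belief = 0.10  # hypothetical base rate: where Rosling says to start

# Two independent observations, each four times likelier if the
# hypothesis is true than if it is false (hypothetical likelihoods).
for _ in range(2):
    belief = bayes_update(belief, p_evidence_if_true=0.8, p_evidence_if_false=0.2)

print(f"Belief after two observations: {belief:.2f}")
```

The base rate sets the starting point and each new observation moves the estimate in proportion to how much more likely it is under one hypothesis than under the other.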

In short, “Calculation, like logic, is your friend” (Landsburg 2018: 44). Statistics matter and numbers can deceive. In order to better understand our realities and see through the mistakes that others make – either intentionally, to deceive or persuade, or unintentionally, through ignorance – we must embrace the core message of people like Ellenberg, Tetlock, Duffy, Rosling or Pinker.

Always Be Comparing Thy Numbers. Never accept an unaccompanied statistic. Never trust numerators without denominators.

Nightcap

  1. Bringing natural law to international relations Samuel Gregg, Law & Liberty
  2. How to face down the Secret Service Irfan Khawaja, Policy of Truth
  3. Affirmative Action at Harvard and statistics Gelman, Goel, & Ho, Boston Review
  4. The right’s triumph; the Left’s complicity Chris Dillow, Stumbling & Mumbling

Twelve Things Worth Knowing According to Jacques Delacroix, PhD, Plus a Very Few Brain Food Items.

Note: I wish you all a prosperous, healthy, and writerly year 2019. (No wishes for happiness, it will come from all the above.)

I have a French nephew who is super-smart. Not long after graduating from the best school in France, he moved to Morocco where he married a super-smart Moroccan woman. He is so smart that he asked me for my intellectual will before I depart for another planet. It’s below.

Here are my qualifications: I taught in universities for thirty years, including twenty-five years in a business school in Silicon Valley. My doctorate is in sociology. (Please, don’t judge me.) My fields of specialization are Organizational Theory and the Sociology of Economic Development. My degree is from a very good university although I am a French high school dropout. My vita is linked here (pdf). Its academic part is respectable from a scholarly standpoint, no more. There is much additional info in my book: I Used to Be French: an Immature Autobiography, available from me, and on Amazon Kindle, and in my electronic book of memoirs in French: “Les Pumas de grande-banlieue: histoires d’émigration”, also on Amazon Kindle.

1. When the facts don’t fit your perspective you should change …. ? (Complete sentence.)

2. One basic complex idea worth knowing that resists learning: natural selection.

Note: the effective mechanism involved is multi-generational differential reproduction. You don’t understand natural selection until you can put a meaning on all three words.

3. Another basic idea worth knowing, a counter-intuitive one, that also resists learning: the principle of Comparative Advantage: If you are not working at what you do the very best, you are impoverishing me. There is a ten-lesson quick course on my blog to explain this. Look for short essays with the word “protectionism” in the title. A longform version can also be found, here.

4. Taking from the poor is a stupid way to try to become rich when you can invent a new world – like Steve Jobs – and be immensely rewarded for it. Or open a decent restaurant and be well rewarded, or learn welding. There isn’t much you can take from the poor anyway because they are poor. Plus, the bastards often resist!

5. Culture is in the heads (plural). Everything else isn’t “culture.”

6. How a body of people act is not simply the addition of the thinking of its individual human members. (There is a sociology!)

7. Beware those pesky fractions. Quick test: Five years ago, my income was 40% of yours. Now, my income is only 20% of yours. Am I earning less than I did five years ago?

8. Correlation is not causation but there is no causation without some sort of correlation.

9. Statistical significance is significant even if you don’t quite know what it signifies. Find out. It’s not hard.

10. Use statistical estimation methods even if you don’t understand them well. It will improve your reasoning rigor by confronting you brutally with the wrongness of your guesses. And you can only become better at it with practice.

11. There is no text that’s not improved by extirpating from it half of all adjectives and adverbs.

12. Reading is still the most efficient way to improve your comprehension of the world.
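The quick test in point 7 can be worked through numerically. A minimal sketch: the two shares (40% and 20%) come from the test itself, while the growth factors for “your” income are hypothetical, since the question leaves them unstated.

```python
def my_income_multiplier(share_then: float, share_now: float,
                         your_income_multiplier: float) -> float:
    """How much my income changed, given my share of your income then and now
    and the factor by which your income changed over the same period."""
    return (share_now / share_then) * your_income_multiplier


# Hypothetical scenarios for how your income moved over five years.
for your_growth in (1.0, 2.0, 3.0):
    mine = my_income_multiplier(0.40, 0.20, your_growth)
    print(f"Your income x{your_growth:.1f} -> my income x{mine:.2f}")
```

If your income stayed flat, mine halved; if yours doubled, mine is unchanged; if yours tripled, mine rose by half. The fractions alone cannot answer the question.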

It seems to me that if you understand these twelve points inside out, you are well above average in general culture; that’s even true on a global scale.

Below are some intellectual anchoring points of my life. They are subjectively chosen, of course. Don’t lend them too much credence.

My favorite singer-composers: Jacques Brel; the Argentinean Communist Atahualpa Yupanqui. (I can’t help it.)

My favorite instrumental musics: baroque music, the blues.

My favorite painters: Caravaggio; Delacroix (Eugene); Delacroix (Krishna).

I don’t have a favorite book because I read all the time without trying to rank books. These three books have made a lasting impression, changed my brain pathways forever, I suspect: Daniel Defoe, Robinson Crusoe; George R. Stewart, Earth Abides; Eric Hoffer, The True Believer: Thoughts on the Nature of Mass Movements.

The only two intelligent things I have said in my life:

“Once you know a woman well vertically, you know nothing about her horizontally.”

“There is no bad book.”

On the point of quantifying in general and quantifying for policy purposes

Recently, I stumbled on this piece in the Chronicle by Jerry Muller. It made my blood boil. In the piece, the author basically argues that, in the world of education, we are fixated on quantitative indicators of performance. This fixation has led us to miss (or forget) some important truths about education and the transmission of knowledge. I wholeheartedly disagree because the author of the piece is confounding two things.

We need to measure things! Measurements are crucial to our understanding of causal relations and outcomes. Like Diane Coyle, I am a big fan of the “dashboard” of indicators to get an idea of what is broadly happening. However, I agree with the author that very often the statistics lose their entire meaning. And that’s when we start targeting them!

Once we know that a variable has become a target, we act in ways that increase it. As soon as it is selected, we modify our behavior to hit the fixed targets and the variable loses some of its meaning. This is known as Goodhart’s law, whereby “when a measure becomes a target, it ceases to be a good measure” (note: it also looks a lot like the Lucas critique).

Although Goodhart made this point in the context of monetary policy, it applies to any sphere of policy – including education. When an education department decides that a given metric is the one it cares about (e.g. completion rates, minority admission, average grade point, completion times, balanced curriculum, ratio of professors to pupils, etc.), it induces a change in behavior which alters the significance carried by this variable. This is not an original point. Just go to Google Scholar and type “Goodhart’s law and education” and you end up with papers such as these two (here and here) that make exactly the point I am making here.

In his Chronicle piece, Muller actually makes note of this without realizing how important it is. He notes that “what the advocates of greater accountability metrics overlook is how the increasing cost of college is due in part to the expanding cadres of administrators, many of whom are required to comply with government mandates” (emphasis mine).

The problem he is complaining about is not metrics per se, but rather the effects of having policy-makers decide on a metric of relevance. This is a problem about selection bias, not measurement. If statistics are collected without any intent to use them as a benchmark for the attribution of funds or special privileges (i.e. there are no incentives to change behavior in ways that affect the reporting of a particular statistic), then there is no problem.

I understand that complaining about a “tyranny of metrics” is fashionable, but in that case the fashion looks like crocs (and I really hate crocs) with white socks.

The Cost of ‘Free’ – or why I don’t like freeware

This is a partial response to Fabio Rojas’s recent post on the fate of Stata, a statistics package, given the rise of a free alternative, R. Rojas and others give many reasons why R is a good package, but for now I wish to deal with the argument that its being ‘free’ is a virtue.

R is free, but I see it as a fault because it reveals that it doesn’t have a devoted support system and because it isn’t free at all. It’s actually very costly!

If you’ve spent any time with an economist you should know that there is no such thing as a free lunch. If R is free we should not simply assume it is better. On the contrary, we should ask why it is free. As I have tried to argue elsewhere, when you purchase software you aren’t just purchasing a few lines of code. You’re purchasing the support system that comes with it. When a company purchases Stata, or any commercial software, it does so with the expectation that it can call a dedicated hotline for troubleshooting. As software has evolved you’ve seen companies experiment with pricing to acknowledge the fact that we don’t purchase a one-time piece of software but a continuous support system.

Consider Xbox’s or Playstation’s online services. Their use is charged on a subscription basis because it costs money to run servers and provide customer support. Even ‘freemium’ games, which nominally don’t require any money to play, survive off microtransactions which enable companies to earn steady revenues in exchange for continuing support and new content. I would not be surprised if freemium statistical software is tried in the future – access to basic regressions is free but more advanced models cost money to run. I half joke.

But let’s assume you’re good at coding and don’t need much support beyond a few days spent reading an R book. Should you praise R for being ‘free’? No, because you still paid with your time. Every hour spent learning how to code in R is an hour you could have spent doing any number of other things.

Now to be clear, you may still want to learn R if it frees up your time in the future by automating some process. This post isn’t meant to argue against adopting R. My point is only that it isn’t free in a meaningful sense. Adopting R has a cost: you give up a devoted support system, plus the value of the time it takes you to become proficient in it.

It’s possible that once you account for those things R is still ‘cheaper’ than commercial software like Stata or SPSS. That is an empirical question beyond the scope of this post.
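To make that empirical question concrete, the comparison can be framed as a small calculation. This is only a sketch of the accounting; every number in it (license fee, learning hours, hourly value of time) is hypothetical and none refers to actual R, Stata or SPSS prices.

```python
def total_cost(license_fee: float, learning_hours: float,
               hourly_value: float, support_cost: float = 0.0) -> float:
    """Full cost of adopting a package: money paid plus time given up."""
    return license_fee + learning_hours * hourly_value + support_cost


HOURLY_VALUE = 30.0  # hypothetical value of one hour of your time

# Hypothetical inputs: the free tool takes longer to learn without support.
free_tool = total_cost(license_fee=0.0, learning_hours=80, hourly_value=HOURLY_VALUE)
paid_tool = total_cost(license_fee=900.0, learning_hours=20, hourly_value=HOURLY_VALUE)

print(f"Free tool: ${free_tool:,.0f}")
print(f"Paid tool: ${paid_tool:,.0f}")
```

With these made-up inputs the ‘free’ package comes out more expensive; with other inputs it would come out cheaper. The answer depends entirely on the time terms, which a price tag of zero hides.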

How dairy farmers unions in Canada are distorting the facts about supply management

Under fire recently as President Trump has criticized supply management in Canada and retaliated against it, the different provincial associations representing dairy farmers have moved on the offensive. To promote the virtues of this system, which is meant to reduce production in order to prop up prices through the use of trade tariffs, production quotas and price controls (how can we call those virtues?), these unions have produced numerous infographics to make their case. It is even part of what they dub their “lobby day kit.” These infographics claim that dairy prices are lower in Canada than elsewhere, that milk is still a cheap drink relative to other types of drinks, and that those prices supposedly increase more slowly than elsewhere. All of these graphics are dishonest and must be dismantled.

The most egregious of these infographics – present in the “lobby day kit” – shows the price of milk in Australia (1.55 CAD), Canada (1.45 CAD) and New Zealand (1.65 CAD). They are seemingly using 2014 prices. First of all, they use data that conflicts massively with the reports of Statistics Canada, which suggest that milk prices hover between $2.33 and $2.48 per liter. Their data is provided by AC Nielsen, but no justification is presented as to why it is better than Statistics Canada’s. The truth is that it is not better. Participants in Nielsen surveys come from a self-selected pool of storeowners who wish to participate and are then selected by Nielsen to be part of the data collection. Only then can they record prices. It should be mentioned that not all regions of Canada are covered in the data. Although the Nielsen data does have some uses (especially with regard to market studies), it hardly measures up to Statistics Canada’s when the time comes to evaluate price levels. This is because the government agency collects information from all regions and attempts a broader sweep of retailers in order to create the consumer price index.

But an even larger problem is that, in their comparison of prices, they don’t mention that New Zealand taxes milk. In New Zealand, all food items are subject to sales tax, which is not the case in Canada and Australia. Hence, when they compare retail prices, they are comparing prices that exclude taxes with prices that include taxes. One would like to find out whether they acknowledge this fact in their methodological notes, but there are none!

Using prices available at Numbeo.com and Expatisan.com and the exchange rates made available by the Bank of Canada, we can correct for this problem of theirs. First, simply changing the price source leads to a massively different result with regard to Australia, whose milk prices are lower than in Canada. Secondly, once we adjust for the sales tax in New Zealand, we find that prices in New Zealand are lower than in Canada. In fact they are lower than in one of Canada’s cheapest markets, Montreal (let alone Toronto or Vancouver). So the infographic they show in order to lobby governments is a fabrication.

Table 1: The real price of milk

Using Numbeo.com (regular milk)
                Unadjusted    Adjusted for taxes
  Australia     $1.59         $1.59
  New Zealand   $2.26         $1.97
  Canada        $1.99         $1.99

Using Expatisan.com (whole milk)
                Unadjusted    Adjusted for taxes
  Sydney        $1.82         $1.47
  Wellington    $2.42         $2.10
  Montreal      $2.87         $2.87

Source: Numbeo.com and Expatisan.com, consulted May 16th 2014 and the Bank of Canada’s currency converter. Note: using the Statistics Canada price would make Canada’s situation even worse by comparison.
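The New Zealand adjustment in Table 1 can be reproduced by stripping the sales tax out of the shelf price. The sketch below assumes New Zealand’s GST rate of 15%; the post itself does not state the rate used, so treat that figure as an assumption. It happens to reproduce both New Zealand rows.

```python
NZ_GST_RATE = 0.15  # assumption: New Zealand GST at 15%, not stated in the post


def price_excluding_tax(price_including_tax: float, tax_rate: float) -> float:
    """Remove a sales tax that is already included in a retail price."""
    return price_including_tax / (1 + tax_rate)


# Unadjusted prices from Table 1.
new_zealand = round(price_excluding_tax(2.26, NZ_GST_RATE), 2)
wellington = round(price_excluding_tax(2.42, NZ_GST_RATE), 2)

print(f"New Zealand (Numbeo): ${new_zealand:.2f}")
print(f"Wellington (Expatisan): ${wellington:.2f}")
```

Note that the tax is divided out rather than subtracted as 15% of the shelf price: the tax is levied on the pre-tax price, so the shelf price is 115% of it.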

This is part of a pattern of deceit since they also massage data for numerous other graphs that are presented to Canadians in efforts to convince them of the virtues of supply management. One other example is an infographic that presents a figure of nominal milk prices in Australia before and after the abolition of supply management. Given that prices seem more volatile after 2000 and that they increase more steeply, they try to make us believe that liberalization was a failure. This is not the case. Any sensible policy analyst would deflate nominal prices by the general price index to control for inflation. When one does just that using the data from the Australian Bureau of Statistics, one sees that real prices stabilized in the first ten years of deregulation after increasing roughly 15% in the decade prior. And since 2010, real prices have been falling constantly.

Other examples abound. In one instance, the Quebec union of dairy farmers circulated an infographic meant to show that nominal prices for dairy products increased faster in the United States than in Canada. Again, they omit inflation. In the United States, since 1990 (their own starting date), prices of dairy products have risen more slowly than inflation – indicating a decline in real prices. In Canada, the opposite occurred – inflation increased more slowly than dairy prices, indicating an increase in the real price.
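Deflating a nominal series, the step the unions skip, is a one-line operation. The sketch below uses invented numbers purely to show the mechanics; they are not the Australian Bureau of Statistics or Statistics Canada figures discussed above.

```python
def real_prices(nominal: list[float], cpi: list[float],
                base_cpi: float = 100.0) -> list[float]:
    """Express nominal prices in base-period dollars using a price index."""
    return [price / index * base_cpi for price, index in zip(nominal, cpi)]


# Hypothetical data: the shelf price rises 50%, the general price level 60%.
nominal = [1.00, 1.20, 1.50]
cpi = [100.0, 125.0, 160.0]

real = real_prices(nominal, cpi)
for n, r in zip(nominal, real):
    print(f"nominal ${n:.2f} -> real ${r:.4f}")
```

A chart of the nominal column would show a steep climb, while the real column falls in every period. Which story the reader sees depends on whether the denominator is shown.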

The debate around supply management is complicated. The policy course to adopt in order to improve agricultural productivity and lower prices for Canadians is hard to pinpoint. But whatever position one may hold, no one is well-served by statistical manipulations offered by the unions representing dairy farmers.

On 7 million deaths from air pollution

ATTN published a video of An-huld (the really cool guy who made my childhood by being in all my favorite action movies like Predator* and who ended up being the governor of California). In that short clip, Schwarzenegger starts by saying that 7 million individuals die from pollution-related illnesses.

That number is correct. But it is misleading.

People see pollution as “all one and the same”. But some forms of pollution increase with development (sulfur emissions, and some would argue that excessive CO2 emissions count as pollution since they cause climate change). However, others drop dramatically – especially heavy particulates (PM10), which are a major cause of smog. Julian Simon (the late cornucopian economist who is one of my greatest intellectual influences) pointed out this issue and noted that the deadliest forms of pollution are those that relate to underdevelopment.

Back in 2003, Jack Hollander published The Real Environmental Crisis: Why Poverty, Not Affluence, Is the Environment’s Number One Enemy. Hollander pointed out that the indoor combustion of organic matter (read: firewood and animal manure – literally burning fecal matter) for heating, cooking and lighting was alone responsible for close to 2 million deaths.

Since then, the WHO came out with a study pointing out that around 3 billion people cook and heat their homes with open fires and stoves that rely on biomass or anthracite coal. They put the number of premature deaths directly resulting from this at over 4 million people. This is close to 60% of the figure cited by the former President of California (yes, I know he was governor – see here). In other words, close to 60% of the pollution-related premature deaths (from strokes, ischaemic heart disease, chronic obstructive pulmonary disease and lung cancer) can be attributed to indoor air pollution. That means pollution resulting from the fact that you are so poor that you have to burn anything at hand at the cost of your health.
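The share quoted above is a simple division of the two figures in the text. A minimal sketch: since the WHO figure is given as “over 4 million”, the 4 000 000 used here is a lower bound, and the resulting share is therefore a floor.

```python
total_pollution_deaths = 7_000_000   # figure cited in the video
indoor_air_deaths = 4_000_000        # "over 4 million" per the WHO: a lower bound

share = indoor_air_deaths / total_pollution_deaths
print(f"Indoor air pollution: at least {share:.0%} of the total")
```

That gives roughly 57%, consistent with the “close to 60%” in the text.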

True, richer countries pollute and there are policy solutions (I have often argued that governments are better at polluting than at reducing pollution, but that is another debate) that should be adopted. But, these forms of pollution do not harm human life as much as those that come with poverty.

* By the way, when you watch Predator, do you realize that there are two future American governors in that movie? I mean, imagine that when Predator came out, some dude from the future told you that two of the main actors would end up governing American states. Pretty freaky!