On Borjas, Data and More Data

I see my craft as an economic historian as a dual mission. The first is to answer historical questions by using economic theory (and in the process enliven economic theory through the use of history). The second relates to my obsessive-compulsive nature, which can be observed in how much attention and care I give to getting the data right. My co-authors have often observed me “freaking out” over a possible improvement in data quality or being plagued by doubts over whether or not I had gone “one assumption too far” (pun on a bridge too far). Sometimes, I wish more economists would follow my historian-like freakouts over data quality. Why?

Because of this!

In that paper, Michael Clemens (whom I secretly admire – not so secretly now that I have written it on a blog) criticizes the recent paper produced by George Borjas showing the negative effect of immigration on wages for workers without a high school degree. Revisiting the famous Mariel boatlift of 1980, Clemens basically shows that there were pressures on the US Census Bureau at the same time as the boatlift to add more black workers without high school degrees. This previously underrepresented group surged in importance within the survey data. However, since that underrepresented group had lower wages than the average of the wider group of workers without high school degrees, there was a composition effect at play that caused wages to fall (in appearance). This composition effect is a bias that causes an artificial drop in wages, and it drove the results produced by Borjas (and led to an underestimate of the conclusion made by David Card in his original paper, to which Borjas was replying).
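To make the composition effect concrete, here is a minimal sketch with made-up numbers (the group sizes and wages below are hypothetical and are not taken from Clemens, Borjas or Card). Nobody's wage changes between the two samples; only the weight of the lower-wage group in the survey does.

```python
# Hypothetical numbers, chosen only to illustrate a composition effect.

def mean_wage(groups):
    """Weighted average wage; groups is a list of (number_of_workers, wage) pairs."""
    total_workers = sum(n for n, _ in groups)
    return sum(n * w for n, w in groups) / total_workers

# Survey sample before the change: 90 workers paid 10.0 and 10 workers paid 8.0.
before = [(90, 10.0), (10, 8.0)]

# Same wages for everyone, but the lower-wage group is now sampled more heavily.
after = [(90, 10.0), (30, 8.0)]

mean_before = mean_wage(before)
mean_after = mean_wage(after)

# The measured average falls even though no individual wage has moved.
print(mean_before, mean_after)
```

In this toy case the average drops from 9.8 to 9.5 purely because the sample changed, which is the kind of artificial decline the paragraph above describes.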

This is a cautionary tale about the limits of econometrics. After all, a regression is only as good as the data it uses and only as good as its fit to the question it seeks to answer. Sometimes, simple Ordinary Least Squares regressions are excellent tools. When the question is broad and/or the data is excellent, an OLS regression can be a sufficient and necessary condition for a viable answer. However, the narrower the question (e.g. is there an effect of immigration only on unskilled and low-education workers), the better the method has to be. The problem is that the better methods often require better data as well. To obtain the latter, one must know the details of a data source. This is why I am nuts over data accuracy. Even small things matter in these cases – like a shift in the representation of blacks in survey data. Otherwise, you end up with your results being reversed by very minor changes (see this paper in Journal of Economic Methodology for examples).

This is why I freak out over data. Maybe I can make two suggestions about sharing my freak-outs.

The first is to prefer a ratio skewed towards data quality over advanced methods (i.e. simple methods with crazy-good data). This reduces the chances of being criticized for relying on weak assumptions. The second is to take a leaf out of the book of the historians. While historians are often averse to advanced data techniques (I remember a case when I had to explain panel data regressions to historians, which ended terribly for me), they are very respectful of data sources. I have seen historians nurture datasets for years before being willing to present them. When published, these datasets generally stand up to scrutiny because of the extensive wealth of details compiled.

That’s it folks.

 

Can we trust US interwar inequality figures?

This question is the one that Phil Magness and I have been asking for some time, and we have now assembled our thoughts and measures in the first of a series of papers. In this paper, we take issue with the quality of the measurements extracted from tax records during the interwar years (1918 to 1941).

More precisely, we point out that tax rates at the federal level fluctuated wildly and were at relatively high levels. Since most of our inequality measures are drawn from the federal tax data contained in the Statistics of Income, this is problematic. Indeed, high tax rates might deter honest reporting while rapidly changing rates will affect reporting behavior (causing artificial variations in the measure of market income). As such, both the level and the trend of inequality might be off.  That is our concern in very simple words.
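A toy sketch of that concern, under a purely hypothetical reporting rule (the linear rule, the `sensitivity` parameter, the income figures and the two tax rates are all assumptions made for illustration, not estimates from our paper). True incomes are held fixed, so any movement in the measured top share comes from reporting behavior alone.

```python
def measured_top_share(top_income, rest_income, tax_rate, sensitivity=0.5):
    """Top group's share of *reported* income.

    Hypothetical rule: the top group reports a fraction
    (1 - sensitivity * tax_rate) of its true income, while
    everyone else reports all of theirs.
    """
    reported_top = top_income * (1 - sensitivity * tax_rate)
    return reported_top / (reported_top + rest_income)

# True incomes never change: the top group earns 20 out of a total of 100.
true_share = 20 / (20 + 80)

# Measured share under a low and a high (hypothetical) top tax rate.
share_low_rate = measured_top_share(20, 80, tax_rate=0.07)
share_high_rate = measured_top_share(20, 80, tax_rate=0.70)

print(true_share, share_low_rate, share_high_rate)
```

Under these assumptions the measured share sits below the true share in both cases, and it moves when the rate moves, so both the level and the trend are affected.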

To assess whether or not we are worrying for nothing, we went looking for different sources to assess the robustness of the inequality estimates based on the federal tax data. We found what we were looking for in Wisconsin, whose tax rates were much lower (never above 7%) and less variable than those at the federal level. As such, we found the perfect dataset to see if there are measurement problems in the data itself (through a varying selection bias).

From the Wisconsin data, we find that there are good reasons to be skeptical of the existing inequality measures based on federal tax data. The comparison of the IRS data for Wisconsin with the data from the state income tax shows a different pattern of evolution and a different level (especially when deductions are accounted for). First of all, the level is always lower in the Wisconsin Tax Commission (WTC) data. Secondly, the trend differs for the 1930s.

Table1 for Blog

I am not sure what it means in terms of the true level of inequality for the period. However, it suggests that we ought to be careful with the existing estimates if two data sources of a similar nature (tax data) with arguably minor conceptual differences (low and stable tax rates) tell dramatically different stories. Maybe it's time to try to further improve the pre-1945 series on inequality.

On the paradox of poverty and good health in Cuba

One of the most interesting paradoxes (in my opinion) in modern policy debates relates to how Cuba, a very poor country, has been able to generate health outcomes close to the levels observed in rich countries. To be fair, academics have long known that there is only an imperfect relation between material living standards and biological living standards (full disclosure: I am inclined to agree, but with important caveats better discussed in a future post or article, but there is an example). The problem is that Cuba is really an outlier. I mean, according to the WHO statistics, it's pretty close to the United States in spite of being far poorer.

In the wake of Castro’s death, I believed it necessary to assess why Cuba is an outlier and creates this apparent paradox. As such, I decided to move some other projects aside for the purposes of understanding Cuban economic history and I have recently finalized the working paper (which I am about to submit) on this paradox (paper here at SSRN).

The working paper, written with physician Gilbert Berdine (a pneumologist from Texas Tech University), makes four key arguments to explain why Cuba is an outlier (that we ought not try to replicate).

The level of health outcomes is overestimated, but the improvements are real

Incentives matter, even in the construction of statistics, and this is why we should be skeptical. Indeed, doctors work under centrally designed infant mortality targets that they must achieve, and there are penalties if the targets are not reached. As such, physicians respond rationally and use complex stratagems to reduce their reported levels. These include the re-categorization of early neonatal deaths as late fetal deaths, which deflates the infant mortality rate, and the pressuring (sometimes coercing) of mothers with risky pregnancies to abort in order to avoid missing their targets. This overstates the level of health outcomes in Cuba, since accounting for the reclassification of deaths and a hypothetically low proportion of pressured/coerced abortions reduces Cuban life expectancy by close to two years (see figure below). Nonetheless, the improvements in Cuba since 1959 are real and impressive – this cannot be denied.
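The arithmetic of reclassification can be sketched with hypothetical counts (the 10,000 births, 60 deaths and 15 reclassified cases below are invented for illustration and are not the figures from our paper). The infant mortality rate is infant deaths per 1,000 live births, and a fetal death is not counted as a live birth, so each reclassified case leaves both the numerator and the denominator.

```python
def infant_mortality_rate(infant_deaths, live_births):
    """Infant deaths per 1,000 live births."""
    return 1000 * infant_deaths / live_births

def reclassify(infant_deaths, live_births, moved):
    """Record `moved` early neonatal deaths as late fetal deaths instead.

    A fetal death is not a live birth, so both counts fall by `moved`.
    """
    return infant_deaths - moved, live_births - moved

# Hypothetical cohort: 10,000 live births and 60 infant deaths.
true_rate = infant_mortality_rate(60, 10_000)

# Same cohort after 15 early neonatal deaths are recorded as late fetal deaths.
reported_rate = infant_mortality_rate(*reclassify(60, 10_000, 15))

print(true_rate, round(reported_rate, 2))
```

With these made-up counts the reported rate falls from 6.0 to roughly 4.5 per 1,000 without a single additional child surviving.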

Cuba1.png

 

Health Outcomes Result from Coercive Policy 

Many experts believe that we ought to try to achieve the levels of health outcomes generated by Cuba and resist the violations of human rights that are associated with the ruling regime. The problem is that they cannot be separated. It is through the use of coercive policy that the regime is able to allocate more than 10% of its tiny GDP to health care and close to 1% of its population to the task of being a physician. It ought also to be mentioned that physicians in Cuba are mandated to violate patient privacy and report information to the regime. Consequently, Cuban physicians (who are also members of the military) are the first line of internal defense of the regime. The use of extreme coercive measures has the effect of improving health outcomes, but it comes at the price of economic growth. As documented by Werner Troesken, there are always institutional trade-offs in terms of health care. Either you adopt policies that promote growth but may hinder the adoption of certain public health measures, or you adopt these measures at the price of growth. The difference between the two choices is that economic growth bears fruit in the distant future (i.e. there are palliative health effects of economic growth that take more time to materialize).

Health Outcomes are Accidents of Non-Health Related Policies

As part of the institutional trade-offs that make Cubans poorer, there might be some unintended positive health effects. Indeed, the rationing of some items does limit the ability of the population to consume items deleterious to their health. The restrictions on car ownership and imports (which have made Cuba one of the Latin American countries with the lowest rate of car ownership) also reduce mortality from road accidents which, in countries like Brazil, knock off 0.8 years of life expectancy at birth for men and 0.2 years for women. The policies that generate these outcomes are macroeconomic policies (which impose strict controls on the economy) unrelated to the Cuban health care system. As such, the poverty caused by Cuban institutions may also be helping Cubans live longer.

Human Development is not a Basic Needs Measure

The last point in the paper is that human development requires agency. Since life expectancy at birth is one of the components of the Human Development Index (HDI), Cuba fares very well on that front. The problem is that the philosophy behind the HDI is that individuals must have the ability to exercise agency. It is not a measure of poverty nor a measure of basic needs; it is a measure meant to capture how well individuals can exercise free will: higher incomes buy you some abilities, health provides you the ability to achieve them and education empowers you.

You cannot judge a country with “unfree” institutions with such a measure. You need to compare it with other countries, especially countries where there are fewer legal barriers to human agency. The problem is that within Latin America, it is hard to find such countries. But what happens when we compare Cuba with the four leading countries in terms of economic freedom? Well, not only do they often beat Cuba, but they have actually come from further back and as such they have seen much larger improvements than Cuba did.

This is not to say that these countries are to be imitated, but they are marginal improvements relative to Cuba and because they have freer institutions than Cuba, they have been able to generate more “human development” than Cuba did.

Cuba2.png

Our Conclusion

Our interpretation of Cuban health care provision and health outcomes can be illustrated by an analogy with an orchard. The fruit of positive health outcomes from the “coercive institutional tree” that Cuba has planted can only be picked once, and the tree depletes the soil significantly in terms of human agency and personal freedom. The “human development tree” nurtured in other countries yields more fruit, and it promises to keep yielding fruit in the future. Any praise of Cuba’s health policy should be examined within this broader institutional perspective.

On British Public Debt, the American Revolution and the Acadian Expulsion of 1755

I have a new working paper out there on the role of the Acadian expulsion of 1755 in fostering the American Revolution. Most Americans will not know about the expulsion of a large share of the French-speaking population (known as the Acadians) of the Maritime provinces of Canada during the French and Indian War.

Basically, I argue that the policy of deportation was pushed by New England and Nova Scotia settlers who wanted the well-irrigated (thanks to an incredibly sophisticated – given the context of a capital-scarce frontier economy – dyking system) farms of the Acadians. Arguing that the French population under nominal British rule, having only sworn an oath of neutrality, represented a threat to British security, the settlers pushed hard for the expulsion. However, the deportation was not approved by London and was largely the result of colonial decisions rather than Imperial decisions. The problem was that the financial burden of the operation (equal to between 32% and 38% of the expenditures on North America – and that’s a conservative estimate) was borne by England, not the colonies.

This fits well, I argue, into a public choice framework. Rent-seeking settlers pushed for the adoption of a policy whose costs were spread over a large population (that of Britain) but whose benefits they alone reaped.

The problem is that this, as I have argued elsewhere, was a key moment in British Imperial history as it contributed to the idea that London had to end the era of “salutary neglect” in favor of a more active management of its colonies. The attempt to centralize management of the British Empire, in order to best prioritize resources in a time of rising public debt and high expenditure levels in the wars against the French, was a key factor in the initiation of the American Revolution.

Moreover, the response from Britain was itself a rent-seeking solution. As David Stasavage has documented, government creditors in England became well-embedded inside the British governmental structure in order to minimize default risks and better control expenses. These creditors were a crucial part of the coalition structure that led to the long Whig Supremacy over British politics (more than half a century). In that coalition, they lobbied for policies that advantaged them as creditors. The response to the Acadian expulsion debacle (for which London paid even though it did not approve it and considered the Acadian theatre of operation to be minor and inconsequential) should thus be seen also as a rent-seeking process.

As such, it means that there is a series of factors, well embedded inside broader public choice theory, that can contribute to an explanation of the initiation of the American Revolution. It is not by any means a complete explanation, but it offers a strong partial contribution that considers the incentives behind the ideas.

Again, the paper can be consulted here or here.

On doing economic history

I admit to being a happy man. While I am in general a smiling sort of fellow, I was delightfully giggling with joy upon hearing that another economic historian (and a fellow  Canadian from the LSE to boot), Dave Donaldson, won the John Bates Clark medal. I dare say that it was about time. Nonetheless I think it is time to talk to economists about how to do economic history (and why more should do it). Basically, I argue that the necessities of the trade require a longer period of maturation and a considerable amount of hard work. Yet, once the economic historian arrives at maturity, he produces long-lasting research which (in the words of Douglass North) uses history to bring theory to life.

Economic History is the Application of all Fields of Economics

Economics is a deductive science through which axiomatic statements about human behavior are derived. For example, stating that the demand curve is downward-sloping is an axiomatic statement. No economist ever needed to measure quantities and prices to say that if the price increases, all else being equal, the quantity will drop. As such, economic theory needs to be internally consistent (i.e. not argue that higher prices mean both smaller and greater quantities of goods consumed all else being equal).

However, the application of these axiomatic statements depends largely on the question asked. For example, I am currently doing work on the 19th century Canadian institution of seigneurial tenure. In that work, I question the role that seigneurial tenure played in hindering economic development. In the existing literature, the general argument is that the seigneurs (i.e. the landlords) hindered development by taxing (as per their legal rights) a large share of net agricultural output. This prevented the accumulation of savings which – in times of imperfect capital markets – were needed to finance investments in capital-intensive agriculture. That literature invoked one corpus of axiomatic statements that relate to capital theory. For my part, I argue that the system – because of a series of monopoly rights – was actually a monopsony system through which the landlords restrained their demand for labor on the non-farm labor market and depressed wages. My argument invokes the corpus of axioms related to industrial organization and monopsony theory. Both explanations are internally consistent (there are no self-contradictions). Yet, one must be more relevant to the question of whether or not the institution hindered growth and one must square better with the observed facts.

And there is economic history properly done. It tries to answer which theory is relevant to the question asked. The purpose of economic history is thus to find which theories matter the most.

Take the case, again, of asymmetric information. The seminal work of Akerlof on the market for lemons made a consistent theory, but subsequent waves of research (notably my favorite here by Eric Bond) have shown that the stylized predictions of this theory rarely materialize. Why? Because the theory of signaling suggests that individuals will find ways to invest in a “signal” to solve the problem. These are two competing theories (signaling versus asymmetric information) and one seems to win over the other. An economic historian tries to sort out what mattered to a particular event.

Now, take these last few paragraphs and drop the words “economic historians” and replace them by “economists”. I believe that no economist would disagree with the definition of the tasks of the economist that I offered. So why would an economic historian be different? Everything that has happened is history and every question with regard to it must be answered by sifting for the theories that are relevant to the event studied (under the constraint that the theory be consistent). Every economist is an economic historian.

As such, the economic historian/economist must use advanced tools related to econometrics: synthetic controls, instrumental variables, proper identification strategies, vector auto-regressions, cointegration, variance analysis and everything you can think of. He needs to do so in order to answer the question he tries to answer. The only difference with the economic historian is that he looks further back in the past.

The problem with this systematic approach is the effort needed from practitioners. There is a need to understand – intuitively – a wide body of literature on price theory, statistical theories and tools, accounting (for understanding national accounts) and political economy. This takes many years of training and I can take my case as an example. I force myself to read one scientific article that is outside my main fields of interest every week in order to create a mental repository of theoretical insights I can exploit. Since I entered university in 2006, I have been forcing myself to read theoretical books that were on the margin of my comfort zone. For example, University Economics by Alchian and Allen was one of my favorite discoveries as it introduced me to the UCLA approach to price theory. It changed my way of understanding firms and the decisions they made. Then I read some works on Keynesian theory (I will confess that I have never been able to finish the General Theory), which made me more respectful of some core insights of that body of literature. In the process of reading those, I created lists of theoretical key points like one would accumulate kitchen equipment.

This takes a lot of time, patience and modesty towards one’s accumulated stock of knowledge. But these theories never meant anything to me without any application to deeper questions. After all, debating about the theory of price stickiness without actually asking if it mattered is akin to debating with theologians about the gender of angels (I vote that they are angels and since these are fictitious, I don’t give a flying hoot’nanny). This is because I really buy into the claim made by Douglass North that theory is brought to life by history (and that history is explained by theory).

On the Practice of Economic History

So, how do we practice economic history? The first thing is to find questions that matter.  The second is to invest time in collecting inputs for production.

While accumulating theoretical insights, I also made lists of historical questions that were still debated. Basically, I have been making lists of research questions since I was an undergraduate student (not kidding here) and I keep everything on the list until I have been satisfied by my answer and/or the subject has been convincingly resolved.

One of my criteria for selecting a question is that it must relate to an issue that is relevant to understanding why certain societies are where they are now. For example, I have been delving into the issue of the agricultural crisis in Canada during the early decades of the 19th century. Why? Because most historians attribute (wrongly in my opinion) a key role to this crisis in the creation of the Canadian confederation, the migration of the French-Canadians to the United States and the politics of Canada until today. Another debate that I have been involved in relates to the Quiet Revolution in Québec (see my book here) which is argued to be a watershed moment in the history of the province. According to many, it marked a breaking point when Quebec caught up dramatically with the rest of Canada (I disagreed and proposed that it actually slowed down a rapid convergence in the decade and a half that preceded it). I picked the question because the moment is central to all political narratives presently existing in Quebec and every politician utters the words “Quiet Revolution” when given the chance.

In both cases, they mattered to understanding what Canada was and what it has become. I used theory to sort out what mattered and what did not matter. As such, I used theory to explain history and in the process I brought theory to life in a way that was relevant to readers (I hope).  The key point is to use theory and history together to bring both to life! That is the craft of the economic historian.

The other difficulty (on top of selecting questions and understanding theories that may be relevant) for the economic historian is the time-consuming nature of data collection. Economic historians are basically monks (and in my case, I have both the shape and the haircut of Friar Tuck) who patiently collect and assemble new data for research. This is a high fixed cost of entering the trade. In my case, I spent two years in a religious congregation (literally with religious officials) collecting prices, wages, piece rates and farm data to create a wide empirical portrait of the Canadian economy. This was a long and arduous process.

However, thanks to the lists of questions I had assembled by reading theory and history, I saw the many steps of research I could generate by assembling data. Armed with some knowledge of what I could do, the data I collected told me of other questions that I could ask. Once I had finished my data collection (18 months), I had assembled a roadmap of twenty-something papers in order to answer a wide array of questions on Canadian economic history: was there an agricultural crisis; were French-Canadians the inefficient farmers they were portrayed to be; why did the British tolerate Catholic and French institutions when they conquered French Canada; did seigneurial tenure explain the poverty of French Canada; did the conquest of Canada matter to future growth; what was the role of free banking in stimulating growth in Canada etc.

It is necessary for the economic historian to collect a ton of data and assemble a large base of theoretical knowledge to guide the data towards relevant questions. For those reasons, the economic historian takes a longer time to mature. It simply takes more time. Yet, once the maturation is over (I feel that mine is far from being over to be honest), you get scholars like Joel Mokyr, Deirdre McCloskey, Robert Fogel, Douglass North, Barry Weingast, Sheilagh Ogilvie and Ronald Coase (yes, I consider Coase to be an economic historian but that is for another post) who are able to produce on a wide-ranging set of topics with great depth and understanding.

Conclusion

The craft of the economic historian is one that requires a long period of apprenticeship (there is an inside joke here, sorry about that). It requires heavy investment in theoretical understanding beyond the main field of interest that must be complemented with a diligent accumulation of potential research questions to guide the efforts at data collection. Yet, in the end, it generates research that is likely to resonate with the wider public and impact our understanding of theory. History brings theory to life indeed!

The GDP, real wages and working hours of France since the 13th century

Every few years, an economic historian in training spends thousands of hours in archives assembling a long quantitative essay. It’s the work of monks (in fact, when you go far back in history, you also end up working with monks and nuns – which was my case on Canadian economic history). It’s the kind of work that requires patience, attention to detail and (did I say it already?) patience.

I did that for my own work on Canadian economic history. For two years, I locked myself in the archives of two religious congregations to collect and transcribe close to a million price and wage observations. For these two years, I did not write one single paper. I just collected the data and constituted a list of the papers I could write. However, once it’s finished, you may party like a sailor fresh off the boat because you end up with a wealth of data to answer hundreds of questions. When I finished my own thing on Canada, I was thrilled as I thought it constituted a great advance in quantitative knowledge (which I could use to assess tougher historical questions).

However, compared to the work of Leonardo Ridolfi, my own work looks like a dwarf (I confess envy here). Ridolfi spent hundreds of hours assembling a quantitative essay on France’s economy since 1250. This is monumental! France has generally been a statistical abyss (except for demography and some price series) especially when compared to England. Yet, the country is highly relevant to western economic history. After all, the question of why the Industrial Revolution took place in Britain is the mirror of asking why it did not happen in France. As a result, Ridolfi’s work fills one of the largest voids in the field of economic history and will end up being one of the most cited dissertations for the next ten years, I expect.

He constructed estimates of real wages, prices, incomes and working hours. As such, he provided the widest possible statistical portrait which (I won’t get into details here) circumvents tons of empirical complications that may limit the quality of each variable taken separately (see for example the manner in which GDP is calculated and the role that estimating working hours plays).

I invite anyone interested in economic history to read his work. But, I will give you the main conclusion I gathered: France was not as poor as many believed. I recently pointed this out in an article which I am trying to get published, but Ridolfi’s work proves my point beyond my wildest expectations. I assembled the most relevant figures below.

Ridolfi.png

On the reversal of fortune, urbanization and Canada

One of the more famous articles of economist Daron Acemoglu is his 2002 article (with Simon Johnson and James Robinson) on the reversal of fortune, in which they point out that countries colonized by Europeans that were relatively rich in 1500 are relatively poor now. In the paper, they use urbanization as a proxy for economic development at that point in time.

I was not particularly convinced by this because of the issue of ruralization in colonial economies. I am still not convinced in fact. As many scholars interested in American colonial history point out, the country de-urbanized (ruralized) during the colonial era as cities grew at a slower pace than the general population. As such, the share of the US population in rural areas increased. But Jeffrey Williamson and Peter Lindert documented that in 1774, the United States was the richest place in the world (beating England on top of being more egalitarian).
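The mechanics of ruralization are simple compounding, and a small sketch with hypothetical growth rates shows them (the starting populations, the 1% and 3% annual rates and the 50-year horizon are invented for illustration, not colonial American estimates). Cities can keep growing in absolute terms while their share of the population shrinks.

```python
def grow(level, annual_rate, years):
    """Compound a starting level at a constant annual growth rate."""
    return level * (1 + annual_rate) ** years

# Hypothetical starting point: 10 urban residents out of a population of 100.
urban_start, total_start = 10.0, 100.0
years = 50

# Assumption: cities grow at 1% a year while the whole population grows at 3%.
urban_end = grow(urban_start, 0.01, years)
total_end = grow(total_start, 0.03, years)

share_start = urban_start / total_start
share_end = urban_end / total_end

print(round(share_start, 3), round(share_end, 3))
```

In this sketch the urban population rises while the urbanization rate falls, so a falling urban share by itself says nothing about whether the place is getting poorer.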

This is normal. Economies on the frontier had land to labor ratios that were the exact opposite of those in Europe. The opportunity cost of congregating in one area was high given the abundance of land that could be brought under cultivation. This is why the Americas (North America at least) was the Best Poor Man’s Country. As such, areas with low population density are not necessarily poor (even if urbanization is a pretty strong predictor of wealth).

This is where Canada comes in. Today, the country easily fits in the “relatively rich” group. According to figures 1 and 2 in the work of Acemoglu, Johnson and Robinson, it would have been in the “relatively poor” group well behind countries in Latin America. However, I recently finished compiling the Canadian GDP figures between 1688 and 1790 which I can now compare with those of Arroyo Abad and Van Zanden for Peru and Mexico. With my Canadian data (see the figure below), we can see that Canada was as poor as Latin America around 1680 (the start date of my data).

GelosoGDP.png

So, Canada was a relatively poor country back then, as poor as (or moderately richer than) Latin American countries. Why does that matter to the reversal of fortune story? Well, with the urbanization data, one shows that the non-urbanized of 1500 are the rich of today. With the GDP data for the 1680s, we see that the more urbanized countries were also poorer than the less urbanized countries.

Now, my argument is limited by the fact that I am using 1680s GDP rather than 1500 GDP. But, one should simply extend the urbanization series to circa 1700 and the issue is resolved.  In any case, this should fuel the skepticism towards the strength of the reversal of fortune argument.