Interwar US inequality data are deeply flawed

For some years now, Phil Magness and I have been working on improving the existing income inequality estimates for the United States prior to World War II. One of the most important points we make concerns why we, as economists, ought to take data assumptions seriously. One of the most tenacious stylized facts (one that we do not exactly dispute) is that income inequality in the United States followed a U-curve trajectory over the 20th century. Income inequality was high in the early 1920s, descended gradually until the 1960s, and then started to pick up again. That stylized fact comes from the data work of Thomas Piketty and Emmanuel Saez (first image illustrated below). However, from the work of Auten and Splinter and Mechling et al., we know that the post-1960 increase as measured by Piketty is somewhat overstated (see second image illustrated below). While these criticisms suggest a milder post-1960 increase, Phil Magness and I believe that the real action is on the left side of the U-curve (pre-1960).

Inequality

NOL1

Why? Here is our case made simple: the IRS data used to measure inequality up to at least 1943 are deeply flawed. In another paper recently submitted, I made the argument that some of the assumptions made by Piketty and Saez had flaws. That paper did not question the validity of the data themselves. This time, we decided to use state-level income tax data from the IRS to compute state-level inequality and compare the results with the states’ own income tax data (e.g. the IRS data for Wisconsin versus Wisconsin’s own personal income tax data). What we found is that the IRS data overstate the level of inequality by appreciable proportions.

Why is that? There are two reasons. The first is that the federal tax system had wide fluctuations in tax rates between 1917 and 1943, which means wide fluctuations in tax compliance. Previous scholars such as Gene Smiley pointed out that when tax rates fell, compliance went up so that measured inequality went up. But measured inequality is not true inequality because “off-the-books” income (which was unmeasured) divorced true inequality from measured inequality. This is bound to generate false fluctuations in measurement as long as tax compliance was voluntary (which was true until 1943). State income taxes did not face that problem, as their tax systems tended to be more stable throughout the period. The same is true of personal exemptions.

The second reason speaks to the manner in which the federal data are presented. The IRS created wide categories and reported the number of taxpayers in each category according to net taxable income (rather than gross income). For example, the categories go from $0 to $1,000 per filer, then increase by slices of $1,000 until $10,000, and then by slices of $5,000, etc. This makes it hard to pinpoint where to start the calculations for each of the fractiles of top earners. This is not true of all state income tax systems. For example, Delaware sliced the data by categories of $100 and $500 instead. Thus, we can more easily pinpoint where each fractile starts. More importantly, most state income tax systems reported the breakdown both for net taxable and gross income. This is crucial because Piketty and Saez need to adjust the pre-1943 IRS data – which are in net income – so that they tie in properly with the post-1943 IRS data – which are in adjusted gross income. Absent this correction, they would get an artificial increase in inequality in 1943. The problem is that the data for this adjustment are scant and their proposed solution has not been subjected to validation.
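To see why bracket width matters, here is a minimal sketch of Pareto interpolation, a standard way of locating a top-fractile threshold inside a tax bracket. The bracket bounds and filer counts are made-up illustrative numbers, not figures from the IRS or Delaware tabulations, and the function name is hypothetical rather than anything from our paper.

```python
import math

def pareto_threshold(lower, upper, n_above_lower, n_above_upper, target_n):
    """Income level above which `target_n` filers sit, assuming a Pareto
    tail between two bracket bounds.

    n_above_lower / n_above_upper are the counts of filers with income at
    or above each bound (hypothetical inputs for illustration).
    """
    # Pareto coefficient implied by the two bracket bounds
    alpha = math.log(n_above_lower / n_above_upper) / math.log(upper / lower)
    # Invert N(s) = n_above_lower * (lower / s) ** alpha to solve for s
    return lower * (n_above_lower / target_n) ** (1.0 / alpha)

# Made-up example: 1,000 filers at or above $5,000 and 250 at or above $10,000.
# At what income does the group of the 500 richest filers start?
threshold = pareto_threshold(5000, 10000, 1000, 250, 500)
print(round(threshold, 2))
```

With these invented numbers the threshold lands at roughly $7,071. The wider the bracket, the more the answer leans on the assumed Pareto shape rather than on observed data, which is why narrower state-level brackets are helpful.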

What do our data say? We compared them to the work of Mark Frank et al., who used the same methodology as Piketty and Saez, but at the state level and using the same sources. The image below pretty much sums it up! If the points are above the red line, the IRS data overestimate inequality. If below, the IRS data underestimate it. Overall, the bias tends towards overestimation. In fact, when we investigated all of the points separately, we found that those below the red line result merely from the way that Delaware’s (DE) data were adjusted to convert net income into gross income. When we compared only net income-based measures of inequality, none are below the red line except Delaware from 1929 to 1931 (and by much smaller margins than shown in the figure below).

IRS.png

In our paper, we highlight how the state-level data are conceptually superior to the federal-level data. The problem that we face is that we cannot convert those measures into adjustments for the national level of inequality. All that our data do is suggest which way the bias cuts. While we find this unfortunate, we highlight that this would unavoidably alter the left side of the curve in the first graph of this blog post. The initial level of inequality would be lower than it is now. Thus, combining this with the criticisms made for the post-1960 era, we may be in the presence of a U-curve that looks more like a shallow tea saucer than the pronounced U-curve generally highlighted. The U-curve form is not invalidated (i.e. whether or not it is a quadratic-looking function of time), but the shape of the curve’s tails is dramatically changed.

Shares of Income – Common Left Delusions

Two big conceptual mistakes are hidden in one small graph that help the leftist delusion.

1. I do not contest the data. I have not checked them. They may be correct. I don’t know; I have another purpose.

2. People who use this graph (though not the makers of the graph, maybe) implicitly assume that those who were in the top income 1% in 1980 are the same as those who are in the top 1% in 2016, or their parents. The graph says nothing about this. One thing is clear: neither Steve Jobs nor his parents would have been in the top 1% in 1980; Steve Jobs would have been, for sure, in 2010, and his estate in 2016. The graph does not show the perpetuation of privilege and of inequality, as users almost always imply. Suppose that 100% of those who were in the 1% in 2016 were not there in 1980 (nor were their parents or grandparents). This would show a fast change of economic elites. It might pose a problem but not the problem the envious imply when they display the graph.

noldelacroixsharesofincome

The problem here is intellectual passivity.

3. The percentage of income that accrues to a given fraction of the population – including the top 1% – tells you nothing about how well anyone has fared economically, whether anybody is richer or poorer than he was at the beginning. Here is an example: Suppose you and I both earn $1,000 at the beginning of the period of observation. Thus, we each get 50% of our joint income (1000/2000). Suppose further that during the period of observation, my income doubles while yours quadruples. I am now getting only 33% while you are getting 67% (2000/(2000+4000) vs 4000/(2000+4000)). My share in percentage terms has declined while yours has ballooned. Question: Am I now poorer than I was at the beginning of the period? That’s a “Yes/No” question; don’t equivocate. The problem here is failure to understand elementary school math.
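The arithmetic in point 3 can be checked in a few lines. This is only a sketch of the two-person example above; the function and variable names are mine.

```python
def income_shares(incomes):
    """Return each person's share of the group's total income."""
    total = sum(incomes)
    return [income / total for income in incomes]

# Both of us start at $1,000; my income doubles, yours quadruples.
start = [1000, 1000]            # [me, you]
end = [start[0] * 2, start[1] * 4]

shares_start = income_shares(start)
shares_end = income_shares(end)

my_share_fell = shares_end[0] < shares_start[0]
my_income_rose = end[0] > start[0]

print(shares_start, shares_end)
print("share fell:", my_share_fell, "| income rose:", my_income_rose)
```

Both flags come out true: my share drops from one half to one third while my income doubles, which is the whole point of the example.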

The chart is produced by the World Inequality Organization, a single purpose outfit not dedicated to the possibility that inequality may be decreasing. The data it offers have not been certified by the usual scholarly processes. This organization’s executive committee includes Thomas Piketty, who could not get his data straight in his best-selling book. He had to refer critics to a website to get his story down. The earlier edition of the same book became famous for not including in US calculations: food stamps, rent support, free medical care, and more, in US welfare recipients’ incomes. I don’t know the others, which may or may not matter. Too many Europeans for my taste. I don’t like it, from 40 years of observation. That last remark is somewhat subjective, of course.

Together these simple comments add up to this critical judgment of the relevant chart: Either, those who use it normally don’t know what they are talking about or, they are not saying anything that matters.

On “strawmanning” some people and inequality

For some years now, I have been interested in the topic of inequality. One of the angles that I have pursued is a purely empirical one in which I attempt to improve measurements. This angle has yielded two papers (one of which is still in progress while the other is still in want of a home) that reconsider the shape of the U-curve of income inequality in the United States since circa 1900.

The other angle that I have pursued is more theoretical and is a spawn of the work of Gordon Tullock on income redistribution. That line of research makes a simple point: there are some inequalities that are, in normative terms, worrisome while others are not. The income inequality stemming from the career choices of a Benedictine monk and a hedge fund banker is not worrisome. The income inequality stemming from being a prisoner of one’s birth or from rent-seekers shaping rules in their favor is worrisome. Moreover, some interventions meant to remedy inequalities might actually make things worse in the long run (some articles even find that taxing income for the sake of redistribution may increase inequality if certain conditions are present – see here). I have two articles on this (one forthcoming, the other already published) and a paper still in progress (with Rosolino Candela), but they are merely an extension of the work of the aforementioned Gordon Tullock and some other economists like Randall Holcombe, William Watson and Vito Tanzi. After all, the point that a “first, do no harm” policy towards inequality might be more productive is not novel (all that it needs is a deep exploration and a robust exposition).

Notice that there is an implicit assumption in this line of research: inequality is a topic worth studying. This is why I am annoyed by statements like those that Gabriel Zucman made to ProMarket. When asked if he was getting pushback for his research on inequality (which is novel and very important), Zucman answers the following:

Of course, yes. I get pushback, let’s say not as much on the substance oftentimes as on the approach. Some people in economics feel that economics should be only about efficiency, and that talking about distributional issues and inequality is not what economists should be doing, that it’s something that politicians should be doing.

This is “strawmanning”. There is no economist who thinks inequality is not a worthwhile topic. Literally none. True, economists’ interest in the topic may have waned for some years, but it never became a secondary topic. Major articles were published in major journals throughout the 1990s (which is often identified as a low point in the literature) – most of them groundbreaking enough to propel the topic forward a mere decade later. This should not be surprising given the heavy ideological and normative ramifications of studying inequality. The topic is so important to all social sciences that no one disregards it. As such, who are these “some people” that Zucman alludes to?

I assume that “some people” are strawmen substitutes for those who, while agreeing that inequality is an important topic, disagree with the policy prescriptions and the normative implications that Zucman draws from his work. The group most “hostile” to the arguments of Zucman (and others such as Piketty, Saez, Atkinson and Stiglitz) is the one that stems from the public choice tradition. Yet, economists in the public-choice tradition probably give distributional issues a more central role in their research than Zucman does. They care about institutional arrangements and the rules of the game in determining outcomes. The very concept of rent-seeking, so essential to public choice theory, relates to how distributional coalitions can emerge to shape the rules of the game in a way that redistributes wealth from X to Y in ways that are socially counterproductive. As such, rent-seeking is essentially a concept that relates to distributional issues in a way that is intimately related to efficiency.

The argument by Zucman to bolster his own claim is one of the reasons why I am cynical towards the times we live in. It denotes a certain tribalism that demonizes the “other side” in order to avoid engaging with them. That tribalism, I believe (but I may be wrong), is more prevalent than in the not-so-distant past. Strawmanning only makes the problem worse.

On the “tea saucer” of income inequality since 1917

I disagree often with the many details that underlie the arguments of Thomas Piketty and Emmanuel Saez. That being said, I am also a great fan of their work and of them in general. In fact, I think that both have made contributions to economics that I am envious to equal. To be fair, their U-curve of inequality is pretty much a well-confirmed fact by now: everyone agrees that the period from 1890-1929 was a high-point of inequality which leveled off until the 1970s and then picked up again.

Nevertheless, while I am convinced of the curvilinear aspect of the evolution of income inequality in the United States as depicted by Piketty and Saez, I am not convinced by the amplitudes. In their 2003 article, the U-curve of inequality really looks like a “U” (see image below). Since that article, many scholars have investigated the extent of the increase in inequality post-1980 (circa). Many have attenuated the increase, but they still find an increase (see here here here here here here here here here). The problem is that everyone has been considering the increase – i.e. the right side of the U-curve. Little attention has been devoted to the left side of the U-curve even though that is where data problems should be considered more carefully for the generation of a stylized fact. This is the contribution I have been coordinating and working on for the last few months alongside John Moore, Phil Magness and Phil Schlosser.

Blog Figure

To arrive at their proposed series of inequality, Piketty and Saez used the IRS Statistics of Income (SOI) to derive top income fractiles. However, the IRS SOI have many problems. The first is that between 1917 and 1943, there are many years in which fewer than 10% of the potential tax population filed a tax return. This prohibits the use of a top 10% income share in many years unless an adjustment is made. The second is that prior to 1943 the IRS reports net income, while after 1943 it reports adjusted gross income. As such, to link post-1943 with pre-1943, there needs to be an additional adjustment. Piketty and Saez made some seemingly reasonable assumptions, but these have never been put to the test regarding sensitivity and robustness. This is leaving aside issues of data quality (I am not convinced IRS data are very good, as most of the data were self-reported pre-1943, which is a period with wildly varying tax rates). The question here is: “how good” are the assumptions?

What we did is verify each assumption to test its validity. The first one we tackle is the adjustment for the low number of returns. To make their adjustments, Piketty and Saez used the fact that single households and married households filed in different quantities relative to their total population. Their idea is that, if one takes a year in which there was a large number of returns, the ratio of single to married filers in that year can be used to adjust the series. The year they used is 1942. This is problematic, as 1942 is a war year with self-reporting when large quantities of young American males were abroad fighting. Using 1941, the last US peace year, instead shows dramatically different ratios. Using these ratios knocks a few points off the top 10% income share. Why did they use 1942? Their argument was that there was simply not enough data to make the correction in 1941. They point to a special tabulation in the 1941 IRS-SOI of 112,472 1040A forms from six states, which was not deemed sufficient to make the corrections. However, later in the same document, there is a larger and sufficient sample of 516,000 returns from all 64 IRS collection districts (roughly 5% of all forms). By comparison, the 1942 sample Piketty and Saez used for the correction only had 455,000 returns. Given the war year and the sample size, we believe that 1941 is a better year to make the adjustment.

Second, we also questioned the smoothing method used to link the net income-based series with the adjusted gross income-based series (i.e. the pre-1943 and post-1943 series). The reason for this is that the implied adjustment for deductions made by Piketty and Saez is actually larger than all the deductions claimed that were eligible under the definition of Adjusted Gross Income – which is a sign of overshooting on their part. Using the limited data available for deductions by income groups and making some assumptions (very conservative ones) to move further back in time, we found that adjusting for “actual deductions” yields a lower level of inequality. This is in contrast with the fixed multipliers which Piketty and Saez used pre-1943.

Third, we question their justification for not using the Kuznets income denominator. They argued that Kuznets’ series yielded an implausible figure because, in 1948, its use yielded a greater income for non-filers than for filers. However, this is not true of all years. In fact, it is only true after 1943. Before 1943, the income of non-filers is equal in proportion to the one they use post-1944 to impute the income of non-filers. This is largely the result of an error in accounting definitions: incomes before 1943 were reported as net income and as gross income after that point. This is important because the stylized fact of a pronounced U-curve is heavily sensitive to the assumption made regarding the denominator.
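The sensitivity to the denominator is just division, but it is worth seeing once. The sketch below uses invented numbers (they are not the Kuznets or Piketty-Saez totals) and a hypothetical helper name.

```python
def top_income_share(top_group_income, total_income):
    """Top income share = income of the top group / income denominator."""
    return top_group_income / total_income

# Invented figures, in arbitrary units, purely for illustration.
top_group_income = 40.0
low_denominator = 100.0    # a smaller estimate of total personal income
high_denominator = 110.0   # a 10% larger estimate of total personal income

share_low_denominator = top_income_share(top_group_income, low_denominator)
share_high_denominator = top_income_share(top_group_income, high_denominator)

print(f"{share_low_denominator:.1%} vs {share_high_denominator:.1%}")
```

Holding the numerator fixed, a denominator that is 10% larger lowers the measured top share from 40% to a little over 36%, so the choice of income total matters a great deal for the level of the curve.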

These three adjustments are pretty important in terms of overall results (see image below). The pale blue line is that of Piketty and Saez as depicted in their 2003 paper in the Quarterly Journal of Economics. The other blue line just below it is the effect of deductions only (the adjustment for missing returns affects only the top 10% income share). All the other lines that mirror these two just below (with the exception of the darkest blue line, which is the original Kuznets inequality estimates) compound our corrections with three potential corrections for the denominators. The U-curve still exists, but it is not as pronounced. When you look at the adjustments made by Mechling et al. (2017) and Auten and Splinter (2017) for the post-1960 period (green and red lines) and link them with ours, you can still see the curvilinear shape but it looks more like a “tea saucer” than a pronounced U-curve.

In a way, I see this as a simultaneous complement to the work of Richard Sutch and to the work of Piketty and Saez: the U-curve still exists, but the timing and pattern are slightly more representative of history. This was a long paper to write (and it is a dry read given the amount of methodological discussion), but it was worth it in order to improve upon the state of our knowledge.

FigureInequality

Is the U-curve of US income inequality that pronounced?

For some time now, I have been skeptical of the narrative that has emerged regarding income inequality in the West in general and in the US in particular. That narrative, which I label UCN for U-Curve Narrative, simply asserts that inequality fell from a high level in the 1910s down to a trough in the 1970s and then back up to levels comparable to those in the 1910s.

To be sure, I do believe that inequality fell and rose over the 20th century.  Very few people will disagree with this contention. Like many others I question how “big” is the increase since the 1970s (the low point of the U-Curve). However, unlike many others, I also question how big the fall actually was. Basically, I do think that there is a sound case for saying that inequality rose modestly since the 1970s for reasons that are a mixed bag of good and bad (see here and here), but I also think that the case that inequality did not fall as much as believed up to the 1970s is a strong one.

The reasons for this position of mine relate to my passion for cliometrics. The quantitative illustration of the past is a crucial task. However, data are only as good as the questions they seek to answer. If I wonder whether or not feudal institutions (like seigneurial tenure in Canada) hindered economic development and I only look at farm incomes, then I might be capturing a good part of the story, but since farm income is not total income, I am missing a part of it. Had I asked whether or not feudal institutions hindered farm productivity, then the data would have been more relevant.

The same goes for income inequality, I argue in this new working paper (with Phil Magness, John Moore and Phil Schlosser), which is basically a list of criticisms of the Piketty-Saez income inequality series.

For the United States, income inequality measures pre-1960s generally rely on tax-reporting data. From the get-go, one has to recognize that this sort of system (since it is about taxes) does not promote “honest” reporting. What is less well known is that tax compliance enforcement was very lax pre-1943 and highly sensitive to the wide variations in tax rates and personal exemptions during the period. Basically, the chance that you will honestly report your income at a top marginal rate of 79% is lower than it would be at a rate of 25%. Since the rates did vary from the high 70s at the end of the Great War to the mid-20s in the 1920s and back up during the Depression, that implies a lot of volatility in the quality of reporting. As such, the evolution measured by tax data will capture tax-rate-induced variations in reported income (especially in the pre-withholding era, when there existed numerous large loopholes and tax-sheltered income vehicles). The shift from high to low taxes in the 1910s and 1920s would have implied a larger than actual change in inequality, while the shift from low to high taxes in the 1930s would have implied the reverse. Correcting for the artificial changes caused by tax rate changes would, by definition, flatten the evolution of inequality – which is what we find in our paper.

However, we go further than that. Using the state of Wisconsin, which had more stringent compliance rules for the state income tax while also having lower and much more stable tax rates, we find different levels and trends of income inequality than with the IRS data (a point which Phil Magness and I expanded on here). This alone should fuel skepticism.

Nonetheless, this is not the sum of our criticisms. We also find that the denominator frequently used to arrive at the share of income going to top earners is too low and that the justification used for that denominator is the result of a mathematical error (see pages 10-12 in our paper).

Finally, we point out that there is a large accounting problem. Before 1943, the IRS provided the Statistics of Income based on net income. After 1943, there is a shift to definitions based on adjusted gross income. As such, the two series are not comparable and need to be adjusted to be linked. Piketty and Saez, when they calculated their own adjustment methods, made seemingly reasonable assumptions (mostly that the rich took the lion’s share of deductions). However, when we searched for and found evidence of how deductions were distributed, it did not match the assumptions of Piketty and Saez. The actual evidence regarding deductions suggests that lower income brackets had large deductions, and this diminishes the adjustment needed to harmonize the two series.

Taken together, our corrections yield systematically lower and flatter estimates of inequality, which do not contradict the idea that inequality fell during the first half of the 20th century (see image below). However, our corrections suggest that the UCN is incorrect and that there might be more of a small bowl (I call it the Paella-bowl curve of inequality, but my co-authors prefer the J-curve idea).

InequalityPikettySaez.png

Did Inequality Fall During the Great Depression?

Inequality

The graph above is taken from Piketty and Saez in their seminal 2003 article in the Quarterly Journal of Economics. It shows that inequality fell during the Great Depression. This is a contention that I have always been very skeptical of for many reasons and which has been – since 2012 – the reason why I view the IRS-data derived measure of inequality through a very skeptical lens (disclaimer: I think that it gives us an idea of inequality but I am not sure how accurate it is).

Here is why.

During the Great Depression, unemployment was never below 15% (see Romer here for a comparison prior to 1930 and this image derived from Timothy Hatton’s work). In some years, it was close to 25%. When such a large share of the population is earning near zero in terms of income, it is hard to imagine that inequality did not increase. Secondly, real wages were up during the Depression. Workers who still had a job were not worse off, they were better off. This means that you had a large share of the population who saw income reductions close to 100% and the remaining share saw actual increases in real wages. This would push up inequality no questions asked. This could be offset by a fall in the incomes from profits of the top income shares, but you would need a pretty big drop (which is what Piketty and Saez argue for).
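A toy calculation makes the mechanism concrete. Everything here is hypothetical: a 100-person economy with invented incomes, roughly a quarter of workers losing all income, and a 5% real raise for those who keep their jobs. It also uses a Gini coefficient, which is a different measure from the top income shares in the Piketty-Saez series, so treat it as an illustration of the argument rather than a test of their numbers.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes (0 = equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum of incomes, with ranks starting at 1
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def depression_scenario(top_drop=0.0):
    """Hypothetical economy after the shock: 24 of 95 workers earn nothing,
    the other 71 get a 5% real raise, and 5 top earners lose `top_drop`
    (a fraction between 0 and 1) of their income."""
    return [0.0] * 24 + [105.0] * 71 + [2000.0 * (1.0 - top_drop)] * 5

# Hypothetical pre-shock economy: 95 workers at 100, 5 top earners at 2000.
before = [100.0] * 95 + [2000.0] * 5

gini_before = gini(before)
gini_after = gini(depression_scenario())
gini_after_top_halved = gini(depression_scenario(top_drop=0.5))

print(gini_before, gini_after, gini_after_top_halved)
```

Varying `top_drop` shows how large the fall in top incomes has to be, in this invented economy, before it offsets the effect of mass unemployment on the Gini.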

Some research has tried to focus only on the Great Depression. The first was a rarely cited NBER paper by Horst Mendershausen from 1946, which found modest increases in inequality from 1929 to 1933. The data were largely centered on urban areas, but this flaw works in favor of my skepticism, as farm incomes (i.e. rural incomes) fell more during the Depression than average incomes. There is also more recent evidence regarding other countries during the Great Depression. For example, Hungary saw an increase in inequality during the era from 1928 to 1941, with most of the increase in the early 1930s. A similar development was observed in Canada as well (a slight increase based on the Veall dataset).

Had Piketty and Saez shown an increase in inequality during the Depression, I would have been more willing to accept their series with fewer questions and doubts. However, they do not discuss these points in great detail and, as such, we should be skeptical.

Inequality and Regional Prices in the US, 2012

I have just completed a short piece on the impact of regional prices on the measurement and geographic distribution of low-income individuals. Basically, Youcef Msaid and I* used the March 2012 CPS data combined with the BEA’s regional purchasing power parities database to correct incomes.

What we found is that the level of inequality is very modestly overestimated (0.5%). Now, this is a conservative estimate since we used state-level corrections for price differences. This means that we took price corrections for New York state as a whole even if there are wide differences within New York state. Obviously, with more fine-grained price-level adjustments we would find a bigger correction, but it is hard to imagine that it could surpass 1-3%.
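For readers unfamiliar with this kind of correction, here is a minimal sketch of deflating a nominal income by a regional price parity indexed so that the US average equals 100. The two states, incomes and index values are invented for illustration; they are not the BEA figures or the CPS records we used.

```python
def adjust_for_prices(nominal_income, price_parity):
    """Deflate a nominal income by a regional price index (US average = 100)."""
    return nominal_income * 100.0 / price_parity

# Hypothetical individuals: (state label, nominal income, assumed price index)
people = [
    ("High-cost state", 22000.0, 115.0),
    ("Low-cost state", 20000.0, 88.0),
]

adjusted = {
    state: adjust_for_prices(income, parity)
    for state, income, parity in people
}

for state, value in adjusted.items():
    print(state, round(value, 2))
```

In this made-up case the person in the high-cost state has the higher nominal income but the lower price-adjusted income, which is the kind of re-ranking that moves people across deciles.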

That was not our most important result. Our most important result relates to where the bottom decile of the income distribution is geographically located. We find that instead of being found disproportionately (relative to their share of the total US population) in poorer states, the bottom decile is disproportionately found in rich states. The dotted black line in the figure below illustrates the change in the number of individuals who are, nationally, in the bottom 10%. New York and California have significant increases while West Virginia has a large decrease. The dark black line shows the same for the top 10%.

fig2

Another way to grasp the magnitude of this change is to relate the change to the population shares of each decile by state. For example, New York had 6.29% of the US population in 2012 and 6.61% of all Americans in the bottom 10% of the income distribution before adjusting for regional purchasing parities. After adjusting however, New York’s share of the bottom 10% surges to 7.88%.
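One way to read those New York figures is as a representation ratio: the state's share of the national bottom decile divided by its share of the population. The three percentages come from the paragraph above; the ratio itself and the function name are just my shorthand here.

```python
def representation_ratio(share_of_group, share_of_population):
    """Values above 1 mean the state is over-represented in the group."""
    return share_of_group / share_of_population

ny_population_share = 6.29          # % of US population, 2012
ny_bottom_decile_unadjusted = 6.61  # % of national bottom 10%, before adjustment
ny_bottom_decile_adjusted = 7.88    # % of national bottom 10%, after adjustment

ratio_before = representation_ratio(ny_bottom_decile_unadjusted, ny_population_share)
ratio_after = representation_ratio(ny_bottom_decile_adjusted, ny_population_share)

print(round(ratio_before, 2), round(ratio_after, 2))
```

Before the adjustment New York is only slightly over-represented in the bottom decile (a ratio of about 1.05); after it, the ratio is about 1.25.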

Why does it matter? Because most of the cost difference adjustments come from differences in housing costs. The first obvious point is that housing is a crucial aspect of any discussion of inequality. The second, but less obvious, point is that these differences are massive barriers to migration within the United States, and the poorest are those for whom these barriers are the heaviest. Unfortunately, the high-cost areas are also high-productivity areas (New York and San Francisco, for example) whose high costs are largely the result of restrictions on the supply of housing. This means that high-productivity areas – which would raise the wages of low-skilled and low-income workers – are inaccessible to them. It also means that those who were present before the increase in productivity of these areas capitalized the gains in more valuable real estate (even if this means lower real incomes).

In this light, the geographic reallocation of the bottom 10% is consistent with an emerging literature that argues that inequality is in great part a result of housing policy (see notably Rognlie’s reply to Piketty in the Brookings Papers). This small modification (I consider it small) that Youcef and I made has important logical ramifications.

* Thank you to my friends Rick Weber (who blogs here at NOL and whose research can be seen here) and Ryan Murphy (whose research can be found here) who provided good comments to bring the paper to the stage where we are ready to submit.