Chapter 3
DATA AND METHODOLOGY

This chapter describes 1) the set of panel data, 2) the methodology applied, and 3) the inequality indicators that result from the calculations.

3.1. Data: the IRS Publications Statistics of Income

The Internal Revenue Service (IRS) publications, Statistics of Income, Individual Income Tax Returns, are the primary source of the data used for the 50 states and the District of Columbia, from 1913 to 2003. Two variables are extracted from the IRS tables: the number of individual returns and the total income expressed in current dollars. Both variables are broken down by size of income (i.e. for a certain number of income brackets) and by state. For instance, the first page of the 1917 IRS table looks as follows:

Table 3.1. An example of the IRS Tables

Typically, the state income tables released annually by the IRS display, for each state and for a series of income brackets, the number of tax returns, the dollar amount of adjusted gross income, and some income sources. While the IRS tables have been available in electronic format since 1997, the tables published earlier are paper-based. All paper-based tables were scanned and tabulated in Excel worksheets, which record the first two variables (number of returns and income amounts).

Arraying these income classes from the lowest income to the highest, it is possible to derive cumulative totals of household population and income, and to draw partition lines splitting the distribution evenly into a series of fractiles. Fractiles divide a given set of observations into equal portions, e.g. into two halves with the median, into four quarters with quartiles (0-25 percent, 25-50 percent, 50-75 percent, and 75-100 percent), into ten equal shares with deciles (0-10 percent, 10-20 percent, etc.), into one hundred shares with percentiles, etc.
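The cumulation step can be illustrated with a minimal Python sketch. It is not the procedure actually used on the Excel worksheets. The bracket figures are invented for illustration and are not taken from the IRS tables, and the names `brackets`, `cumulative_shares`, and `bracket_containing` are hypothetical. The sketch orders brackets from lowest to highest income, accumulates returns and income, and locates the bracket in which a given partition line (here the 90 percent line, the lower bound of the top decile) falls.

```python
from bisect import bisect_left
from itertools import accumulate

# Hypothetical data for one state and year (illustrative numbers only):
# (number of returns, total income in current dollars) per income bracket,
# ordered from the lowest bracket to the highest.
brackets = [
    (500, 1_000_000),
    (300, 1_500_000),
    (150, 1_500_000),
    (50, 1_000_000),
]


def cumulative_shares(brackets):
    """Return one (cumulative population share, cumulative income share)
    pair per bracket, both expressed as fractions between 0 and 1."""
    total_returns = sum(n for n, _ in brackets)
    total_income = sum(y for _, y in brackets)
    cum_returns = accumulate(n for n, _ in brackets)
    cum_income = accumulate(y for _, y in brackets)
    return [
        (r / total_returns, y / total_income)
        for r, y in zip(cum_returns, cum_income)
    ]


def bracket_containing(shares, fractile):
    """Return the index of the first bracket whose cumulative population
    share reaches the given fractile (e.g. 0.9 for the top-decile line)."""
    population_shares = [p for p, _ in shares]
    return bisect_left(population_shares, fractile)


shares = cumulative_shares(brackets)
top_decile_bracket = bracket_containing(shares, 0.9)
```

With these invented figures, the 90 percent partition line falls inside the third bracket (index 2), whose cumulative population share runs from 80 to 95 percent. Placing the line within a bracket requires an interpolation assumption, which this sketch deliberately leaves out.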

The ‘top decile’ refers to the income group that represents the richest 10 percent of the population. Only this group can be considered consistently from 1913 to 2003. Why reduce the whole distribution to the top decile only? Before 1944, large exemption levels reduced the proportion of individuals filing returns to about 10-15 percent (versus the vast majority after WWII). Between 1913 and 1916, this share drops as low as 1 percent.