A.2.2. Classes Grouped for Disclosure Purposes, 1917-1937

The IRS tables report income by income bracket. For 1917-1937, several income classes, typically those at the very top of the distribution, were reported only as a single combined income figure, with a note stating that the classes had been grouped to conceal the identity of the taxpayers. The number of returns in each income class remained visible, however. To recover the income suppressed for disclosure purposes, we estimated it with a midpoint approach. For each income class subject to disclosure, the method takes the midpoint of the class, expressed in dollars, and weights it by the number of returns in that class; the weighted midpoints are then added together. The estimate for a class is the classes-grouped total (which is given in the tables) multiplied by the share of that class's weighted midpoint in the sum of weighted midpoints.

Consider the following example, which uses hypothetical numbers. The table below shows an income of $20,000,000 as the combined total of two income classes grouped together.

Table A2.1. An Example of Classes Grouped
| Net income classes, in current $ | Number of returns | Net income, in $ | Midpoint of the income class weighted by number of returns, in $ | Estimated net income, in $ |
|---|---|---|---|---|
| 2,000,000 under 3,000,000 | 5 | D | 12,500,000 | 11,627,907 |
| 3,000,000 under 4,000,000 | 3 | 13,000,000 | No need of estimation | … |
| 4,000,000 under 5,000,000 | 2 | D | 9,000,000 | 8,372,093 |
| 5,000,000 and over | 1 | 7,000,000 | No need of estimation | … |
| Total classes grouped | | 20,000,000 | 21,500,000 | 20,000,000 |

The income class [$2,000,000 - $3,000,000) records five returns, while the class [$4,000,000 - $5,000,000) records only two. The income of both classes is withheld to protect taxpayers' identities. For those two income classes, we weighted the midpoint income by the number of returns ($12,500,000 = $2,500,000 * 5, and $9,000,000 = $4,500,000 * 2). The net income estimates should represent the same proportions of $20,000,000, the classes-grouped total displayed, as the weighted midpoints represent of their own total of $21,500,000 (obtained by adding $12,500,000 to $9,000,000). In other words, if the weighted midpoint of the [$2,000,000 - $3,000,000) income class represents about 58.1% of the midpoints total (100 * 12,500,000 / 21,500,000), then the net income estimate should also represent about 58.1% of the classes-grouped total, namely $20,000,000 * $12,500,000 / $21,500,000 = $11,627,907. The same holds for the [$4,000,000 - $5,000,000) income class ($20,000,000 * $9,000,000 / $21,500,000 = $8,372,093). Because this ‘relative midpoint’ method allocates the given total proportionally, the sum of all estimates always equals the classes-grouped total.
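The arithmetic of the worked example can be reproduced with a short sketch. The function name `relative_midpoint_estimates` and the tuple layout `(lower, upper, n_returns)` are illustrative assumptions, not part of the original procedure; the numbers are the hypothetical ones from Table A2.1.

```python
def relative_midpoint_estimates(classes, grouped_total):
    """Allocate a classes-grouped total across suppressed income classes.

    classes: list of (lower, upper, n_returns) tuples (hypothetical layout).
    Each class receives the share of grouped_total that its weighted
    midpoint represents in the sum of weighted midpoints.
    """
    weighted = [(lower + upper) / 2 * n for lower, upper, n in classes]
    total_weighted = sum(weighted)
    return [grouped_total * w / total_weighted for w in weighted]


# The two suppressed classes of Table A2.1 and their grouped total.
suppressed = [(2_000_000, 3_000_000, 5), (4_000_000, 5_000_000, 2)]
estimates = relative_midpoint_estimates(suppressed, 20_000_000)
print([round(e) for e in estimates])  # [11627907, 8372093]
```

By construction the estimates add up to the grouped total, up to floating-point rounding.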

In a minority of cases 48, the income estimate calculated with the ‘relative midpoint’ approach did not lie within the interval of the corresponding income class. Whenever an estimate fell below the lower bound of the income class, it was replaced with the lower bound weighted by the number of returns. Whenever the estimate exceeded the upper bound, it was replaced with the upper bound weighted by the number of returns. Either replacement causes the sum of the estimates to deviate from the given classes-grouped total. One could instead have used the ‘absolute midpoint’ approach. However, the sum of the income estimates would then have deviated even further from the classes-grouped total.
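The replacement rule described above amounts to clamping an estimate to the range implied by the class bounds. The following is a minimal sketch; the function name and the input values are hypothetical and chosen only to trigger each branch.

```python
def clamp_to_class_bounds(estimate, lower, upper, n_returns):
    """Keep an estimate within [lower * n_returns, upper * n_returns]."""
    floor = lower * n_returns      # lower bound weighted by number of returns
    ceiling = upper * n_returns    # upper bound weighted by number of returns
    return min(max(estimate, floor), ceiling)


# Hypothetical estimates for a [$4,000,000 - $5,000,000) class with 2 returns.
too_low = clamp_to_class_bounds(7_500_000, 4_000_000, 5_000_000, 2)
too_high = clamp_to_class_bounds(10_400_000, 4_000_000, 5_000_000, 2)
in_range = clamp_to_class_bounds(8_372_093, 4_000_000, 5_000_000, 2)
print(too_low, too_high, in_range)  # 8000000 10000000 8372093
```

An estimate that is already inside the range is returned unchanged, so the rule only affects the minority of cases mentioned in the text.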

The highest income class has no upper bound, e.g. [$5,000,000 and over). In that case, we applied the ‘relative midpoint’ approach under an assumed upper bound, then replaced the estimate for the highest income class with the residual, that is, the classes-grouped total minus the sum of the other estimates. Doing so eliminated the deviation of the sum of the estimates from the classes-grouped total.
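One possible reading of this step is sketched below. The assumed upper bound (`assumed_top_upper`), the function name, and all input figures are hypothetical. The sketch also assumes that the bounded classes are clamped to their class ranges before the residual is taken, which the text does not state explicitly.

```python
def estimate_with_open_top(bounded, top_lower, top_returns,
                           assumed_top_upper, grouped_total):
    """Relative-midpoint estimates when the top class is open-ended.

    bounded: list of (lower, upper, n_returns) for the suppressed classes
    that have an upper bound. The open-ended top class takes the residual.
    """
    weighted = [(lo + hi) / 2 * n for lo, hi, n in bounded]
    # Weighted midpoint of the top class under the assumed upper bound.
    top_weighted = (top_lower + assumed_top_upper) / 2 * top_returns
    denominator = sum(weighted) + top_weighted

    estimates = []
    for (lo, hi, n), w in zip(bounded, weighted):
        estimate = grouped_total * w / denominator
        # Keep the estimate inside the range implied by the class bounds.
        estimates.append(min(max(estimate, lo * n), hi * n))

    # The top class absorbs whatever is left of the grouped total.
    top_estimate = grouped_total - sum(estimates)
    return estimates, top_estimate


bounded_est, top_est = estimate_with_open_top(
    bounded=[(3_000_000, 4_000_000, 3)],
    top_lower=5_000_000, top_returns=1,
    assumed_top_upper=10_000_000,   # hypothetical assumption
    grouped_total=24_000_000,
)
print(bounded_est, top_est)  # [12000000] 12000000
```

In this hypothetical case the bounded class is capped at its upper limit ($4,000,000 * 3), and the residual assigned to the top class restores the grouped total exactly.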

No estimation method can correct errors originating in the IRS tables themselves. For instance, the 1918 table for New Hampshire reports a classes-grouped total that is inconsistent with the income classes it covers, as shown in the table below.

Table A2.2. A Typo in New Hampshire Table, 1918
New Hampshire, 1918

| Net income classes, $ | Number of returns | Net income, in $ |
|---|---|---|
| 150,000 under 200,000 | 1 | D |
| 200,000 under 250,000 | 2 | D |
| Classes grouped | --- | 739,319 |

Indeed, the classes-grouped total ($739,319) exceeds the sum of the upper bounds weighted by the number of returns ($700,000 = $200,000 * 1 + $250,000 * 2). The same problem arises for Nevada in 1927 (Table A2.3):

Table A2.3. Another Typo in IRS Tables
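Nevada, 1927

| Net income classes, in thousands of $ | Number of returns | Net income, in $ |
|---|---|---|
| 60 under 70 | 1 | D |
| 70 under 80 | | |
| 80 under 90 | 2 | D |
| 90 under 100 | 1 | D |
| 100 under 150 | 1 | D |
| 150 under 200 | | |
| 200 under 250 | 1 | D |
| Classes grouped | --- | 820,937 |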

Here too, the highest possible estimate, that is, the sum of the upper bounds weighted by the number of returns ($750,000 = $70,000 + $180,000 + $100,000 + $150,000 + $250,000), falls short of the classes-grouped total ($820,937).
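The inconsistency in Tables A2.2 and A2.3 can be checked mechanically: a grouped total is only feasible if it lies between the sum of the lower bounds and the sum of the upper bounds, each weighted by the number of returns. The sketch below uses hypothetical function names and the figures from the two tables; it treats the upper bound as exclusive, following the "under" wording of the class labels.

```python
def grouped_total_range(classes):
    """Return (minimum, maximum) feasible totals for suppressed classes.

    classes: list of (lower, upper, n_returns) tuples (hypothetical layout).
    """
    minimum = sum(lower * n for lower, upper, n in classes)
    maximum = sum(upper * n for lower, upper, n in classes)
    return minimum, maximum


def is_consistent(classes, grouped_total):
    """True if the grouped total is attainable given the class bounds."""
    minimum, maximum = grouped_total_range(classes)
    return minimum <= grouped_total < maximum


new_hampshire_1918 = [(150_000, 200_000, 1), (200_000, 250_000, 2)]
nevada_1927 = [(60_000, 70_000, 1), (80_000, 90_000, 2),
               (90_000, 100_000, 1), (100_000, 150_000, 1),
               (200_000, 250_000, 1)]

print(grouped_total_range(new_hampshire_1918))  # (550000, 700000)
print(is_consistent(new_hampshire_1918, 739_319))  # False
print(grouped_total_range(nevada_1927))  # (610000, 750000)
print(is_consistent(nevada_1927, 820_937))  # False
```

Both reported totals exceed the maximum feasible value, which matches the comparison made in the text.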

Overall, 92 out of 1,071 classes grouped (51 states times 21 years between 1917 and 1937) had to be ‘midpoint’ adjusted. This represents around 8% of all income classes during this period.

No net income data by size of net income and by state are available for 1913, 1914, and 1915; only the number of returns is available. The net income estimates for those years were derived using the same estimation approach as for the classes grouped.

Notes
48. The minority of cases amounts to around 8% of all income classes displayed in the IRS tables from 1917 to 1937.