Publication Date
14 May 2024

A Consistent Dataset of Net Income Distribution for 190 Countries from 1958 to 2015

Subtitle
Detailed data on income distributions can now be used to calibrate multisector, economic, and other models.
Image Caption
Detailed data on income distributions can now be used to calibrate multisector, economic, and other models.

Image Credit
Image courtesy of Pexels

Science

Analysis of income distribution is of growing importance for a variety of purposes. Income distribution is a determinant of future consumption and environmental impacts, because demand for most commodities is non-linear in income, and of future vulnerability to social, economic, and environmental stressors. While datasets on income distribution collected from household surveys are available for multiple countries, these datasets often do not represent the same income concept (net income vs. consumption), which makes comparisons across countries, over time, and across datasets difficult. Sometimes the same dataset mixes net income and consumption observations for the same country in different years. Such inconsistencies can occur because the underlying surveys in different years might have been conducted to measure different income concepts. Previous studies that have used these datasets for analysis or for modeling income distributions have treated these income concepts as interchangeable, potentially leading to misinterpretation of results based on these data.

Impact

In this paper, we develop and present a consistent dataset of income distributions across 190 countries from 1958 to 2015, measured in terms of a single income concept, namely net income. The dataset is constructed by first selecting net income decile observations from all available sources for all available countries. For countries that have only consumption distribution data, we impute the net income distribution using a regression-based approach. For countries and years where no income distribution data are available, we impute income deciles using the Gini coefficient combined with a principal component analysis (PCA) based method that provides a better fit to data than existing methods. This PCA-based method was recently developed as a non-parametric approach to projecting income distribution. We also present income distributions for 32 aggregated regions in addition to the 190 countries. Our aggregation method takes into account cross-country variations in income distribution within a region in addition to within-country variations. Similarly, we aggregate the country-level income distributions for the world as a whole and show their temporal trends.
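To illustrate what aggregating country deciles to a region involves, here is a minimal Python sketch. It is not the pridr implementation: the function name `aggregate_deciles`, the input layout, and all numbers are assumptions made for illustration. The sketch treats each country decile as a group of people with the same per-capita income, pools the groups of all member countries, sorts them by income, and cuts the pooled population into ten equal-population regional deciles.

```python
def aggregate_deciles(countries):
    """Pool country deciles into regional decile income shares.

    countries: list of dicts with hypothetical keys
      'pop'         - population
      'mean_income' - mean net income per capita
      'shares'      - 10 decile income shares summing to 1
    Returns a list of 10 regional income shares (poorest decile first).
    """
    # One homogeneous group per country decile: (per-capita income, population).
    groups = []
    for c in countries:
        group_pop = c["pop"] / 10.0
        country_income = c["pop"] * c["mean_income"]
        for share in c["shares"]:
            groups.append((share * country_income / group_pop, group_pop))
    groups.sort()  # poorest groups first, regardless of country

    total_pop = sum(p for _, p in groups)
    total_income = sum(y * p for y, p in groups)
    bin_pop = total_pop / 10.0  # people per regional decile

    income_in_bin = [0.0] * 10
    k, room = 0, bin_pop
    for income, pop in groups:
        remaining = pop
        while remaining > 0:
            # The last bin absorbs any floating-point leftovers.
            take = remaining if k == 9 else min(remaining, room)
            income_in_bin[k] += take * income
            remaining -= take
            if k < 9:
                room -= take
                if room <= 0:  # regional decile is full, move to the next one
                    k, room = k + 1, bin_pop
    return [x / total_income for x in income_in_bin]


# Made-up example: two countries with the same internal distribution
# but a tenfold difference in mean income.
SHARES = [0.03, 0.05, 0.06, 0.07, 0.08, 0.09, 0.11, 0.13, 0.16, 0.22]
region = aggregate_deciles([
    {"pop": 100.0, "mean_income": 1.0, "shares": SHARES},
    {"pop": 100.0, "mean_income": 10.0, "shares": SHARES},
])
print([round(s, 4) for s in region])
```

Because the pooled groups are sorted across countries, the regional deciles reflect differences in mean income between countries as well as the spread within each country. A simple average of the country decile shares would miss the between-country part.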

Summary

To our knowledge, no other dataset documented in a peer-reviewed article presents consistent data at multiple geographical scales. This complete and harmonized dataset may be useful for efforts to model the net income distribution. The aggregation method we developed takes into account both within-country and across-country variations when aggregating income distributions to regional boundaries. This distinction is important for regions where income distributions differ substantially across member countries, such as Central Asia, where the aggregated income distribution is significantly more unequal than that of any member country. Finally, the data generation described above is documented as an open-source workflow in a software package called pridr (https://github.com/JGCRI/pridr), which can be used to generate and re-aggregate these data.
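The point that a region can be more unequal than any of its member countries can be checked with a small Gini calculation. The Python sketch below is illustrative only: the helper names `gini` and `decile_groups` and all numbers are assumptions, not taken from the paper or from pridr. It computes the Gini coefficient from grouped data using the trapezoidal area under the Lorenz curve, first for each of two hypothetical countries and then for their pooled population.

```python
def gini(groups):
    """Gini coefficient for grouped data.

    groups: list of (per-capita income, population) tuples, where everyone
    in a group is assumed to have the same income.
    """
    groups = sorted(groups)  # poorest first
    total_pop = sum(p for _, p in groups)
    total_income = sum(y * p for y, p in groups)
    area = 0.0        # twice the area under the Lorenz curve
    cum_income = 0.0  # cumulative income share
    for y, p in groups:
        prev = cum_income
        cum_income += y * p / total_income
        area += (p / total_pop) * (prev + cum_income)
    return 1.0 - area


def decile_groups(pop, mean_income, shares):
    """Turn 10 decile income shares into (per-capita income, population) groups."""
    group_pop = pop / 10.0
    return [(s * pop * mean_income / group_pop, group_pop) for s in shares]


# Made-up example: identical decile shares, tenfold difference in mean income.
SHARES = [0.03, 0.05, 0.06, 0.07, 0.08, 0.09, 0.11, 0.13, 0.16, 0.22]
country_a = decile_groups(100.0, 1.0, SHARES)
country_b = decile_groups(100.0, 10.0, SHARES)

gini_a = gini(country_a)
gini_b = gini(country_b)
gini_region = gini(country_a + country_b)  # pooled population of both countries
print(gini_a, gini_b, gini_region)
```

In this constructed case both countries have the same Gini coefficient, yet the pooled Gini is larger because the gap in mean income between the two countries adds inequality that neither country shows on its own.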

Point of Contact
Marshall Wise
Institution(s)
Pacific Northwest National Laboratory
Funding Program Area(s)
Publication