National and local government administrative data, including income and other information used to compute taxes and records of the use of various administrative services, cover all individuals and are accurate. For this reason, the use of administrative data for empirical analysis in economics and other social sciences has expanded steadily around the world over the past couple of decades, in line with the spread of information and communications technologies. In Japan, however, academic use of administrative data was extremely limited until recently, contributing to a relative decline in research levels compared with Europe and North America. In recent years, growing interest in Evidence-Based Policy Making (EBPM) has finally accelerated the use of administrative data for research in Japan.
For instance, in 2021 the University of Tokyo Center for Research and Education in Program Evaluation (CREPE) launched a project to promote EBPM through the use of local government tax data and developed a system that enables researchers to use tax data provided by local governments. My RIETI project “Evaluation of the Effects of Institutional and Environmental Factors on Family Formation, Parental Labor Market Performance and Children's Academic Performance” plans to use resident tax data from the CREPE project to examine changes in women's income around childbirth and work adjustments by married women.
In the health economics field, several studies using health insurance receipt data from Japan have attracted international attention (including Iizuka and Shigeoka, forthcoming). In addition, a growing number of education economists are working with local governments to develop comprehensive databases of administrative data on education. In cooperation with public schools, student, class, and teacher data such as academic achievement test scores and school attendance are collected and used for research (relevant papers include Bessho et al. 2019 and Oikawa et al., forthcoming).
Findings through use of administrative data
My expertise is labor economics. In this field, as in other fields of empirical microeconomics, administrative data from local governments are expected to be quite useful for investigating topics such as the labor supply of women with children and the effect of individual income on family formation. Changes in family composition such as marriage, childbirth, and divorce can be tracked accurately through the basic resident register. Labor supply can be estimated to some extent from wage income data. These data can be combined with records of the use of various institutions and policies implemented by national and local governments to examine how these institutions and policies contribute to promoting women’s social participation and mitigating the fall in the birthrate. This is a typical example of policy evaluation for EBPM.
For instance, the abovementioned RIETI project, which aims to evaluate the impacts of various institutions and external environmental factors on child-raising generations and children, uses tax data provided by local governments to accurately measure income changes for women around childbirth (the “child penalty”). The project then explores factors that influence the size of the child penalty.
In fact, the accurate measurement of the child penalty had been difficult in Japan due to constraints on available data. The fairly common opinion that Japan has no panel surveys is no longer true: the amount of panel survey data made available by universities for research has steadily increased over the last two decades. However, the sample sizes of university panel surveys are limited. It is also difficult to measure income accurately from surveys in which respondents are asked to report their own income. Furthermore, the dropout rate from panel surveys rises when family composition changes. Given these limitations of panel survey data, administrative data that track childbirths for all registered households are indispensable for the accurate measurement of the child penalty.
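To make the idea of measuring the child penalty concrete, the following is a minimal descriptive sketch, not the project's actual method. It assumes a hypothetical record layout of (person ID, calendar year, income, year of first birth) and made-up income figures, and it simply compares average income in each year relative to childbirth with average income in the year before birth. Regression-based event studies would typically also control for age and calendar-year effects, which this sketch omits.

```python
from collections import defaultdict


def child_penalty_by_event_time(records, base_event_time=-1):
    """Relative change in mean income by year relative to first birth.

    records: iterable of (person_id, year, income, first_birth_year).
    The layout is a hypothetical illustration, not an actual tax-data schema.
    Returns {event_time: (mean income - base mean) / base mean}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _person_id, year, income, birth_year in records:
        event_time = year - birth_year  # 0 = year of first birth
        totals[event_time] += income
        counts[event_time] += 1

    means = {t: totals[t] / counts[t] for t in totals}
    base = means[base_event_time]  # reference: the year before birth
    return {t: (means[t] - base) / base for t in sorted(means)}


# Made-up toy records for two women (income in arbitrary units).
records = [
    ("a", 2018, 300, 2019), ("a", 2019, 150, 2019), ("a", 2020, 120, 2019),
    ("b", 2019, 500, 2020), ("b", 2020, 250, 2020), ("b", 2021, 280, 2020),
]
penalty = child_penalty_by_event_time(records)
for event_time, change in penalty.items():
    print(event_time, round(change, 2))
```

With full-population administrative data, the same calculation would cover every registered household, which is where the advantages in sample size, income accuracy, and attrition described above come in.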
Factors that may conceivably influence the size of the child penalty include personal factors such as family composition and the husband’s income, as well as regional factors such as the availability of childcare centers and labor market characteristics including the industrial structure and the supply-demand balance. If information on childcare support policies is also made available, it becomes possible to evaluate how specific policies mitigate the child penalty. For instance, differences between households that used childcare centers or after-school childcare facilities and those that did not may provide useful evidence for designing better institutions.
Both administrative and survey data have advantages
The use of administrative data makes many new empirical studies possible. On the other hand, survey data from national censuses, labor force surveys, and other direct surveys have advantages that administrative data lack. In closing, I would like to point out that however far the use of administrative data for research progresses, the development and maintenance of survey data will remain very important.
The largest disadvantage of administrative data is that they do not include information that is not required for administrative operations. For instance, educational background, which is included in most surveys, is not necessary for tax and other administrative operations and is therefore usually absent from administrative data.
In Japan, it is still extremely difficult to combine data from different data holders. For instance, data held by local governments are easily linked to the basic resident register and cover all residents, but apart from income they include little job information, such as industry and job categories. This problem may be resolved if resident data are linked to data on employees covered by employment insurance, employee pension, and other social insurance systems. In fact, such linked data have been used in studies from Northern Europe. In Japan, however, there are many technical and institutional obstacles to developing such data linkages.
Survey data have the advantage that questionnaires can be designed freely. In addition, since local governments are at present the main sources of administrative data provided to researchers in Japan, large-scale nationwide government statistics are indispensable for identifying trends for Japan as a whole. However, the procedures for the secondary use of government statistics in Japan are complicated, and many other problems remain regarding the use of government survey data. It is therefore my hope that access to both administrative and survey data for research purposes will be improved further.