Using Big Data to see the changes in our daily lives
On May 5, 2023, the World Health Organization ended its global health emergency declaration for the COVID-19 pandemic. During the 1,191 days that followed the declaration on January 30, 2020, our eating, learning, working and leisure practices were transformed. In the first half of 2020 in particular, panic buying occurred, centered on COVID-19 infection prevention goods and paper products. We witnessed such changes in consumer behavior every day through news and video coverage.
At the time, I had a gut feeling that these unprecedented changes would not be detected through official statistical surveys alone. During the COVID-19 pandemic, purchases of goods and services fluctuated in volume on a daily or weekly basis depending on the infection status and government announcements. Such fluctuations could not be expected to show up in official statistical surveys, which mainly consist of monthly and annual aggregations. In the initial phase of the pandemic especially, data had to be reported quickly in order to detect panic buying and to make prompt decisions on securing supply routes and inventories. In addition, detailed sales trends by product were required for infection prevention goods, food products, day-to-day necessities and drugs, for which demand fluctuated greatly during the pandemic.
Therefore, we analyzed point of sale and household account book application data and continued to record and disseminate consumption behavior information in Japan throughout the pandemic.
Impetus provided by the METI Big Data Project
From 2016 to 2020, I participated in the Project for the Development of New Indicators Utilizing Big Data at the Research and Statistics Department of the Ministry of Economy, Trade and Industry. The main objective of the project was to reduce the burden on survey targets, cut survey costs and improve the speed and level of detail of survey reports by utilizing private sector big data for official statistical surveys. At the same time, we proactively released new indicators. In November 2019, we developed a METI POS retail sales index (micro) using POS data from INTAGE Inc. (hereinafter referred to as INTAGE) and GFK Marketing Service Japan, in order to ascertain the impact of the consumption tax hike that occurred at that time. The "BigData-STATS Dashboard (β version)" was launched on the METI website (Note 1) to publish the previous week’s sales trend indicators every Friday, allowing viewers to capture the actual status of consumption in near real time (updates were discontinued on April 2, 2022).
During the chaotic period between the beginning of the COVID-19 outbreak and Japan’s first state of emergency declaration, this project became a platform for public-private cooperation allowing government and other public organizations to share daily sales trends for infection prevention products and day-to-day necessities. In addition, the dashboard was used as a source of information on consumption trends during the COVID-19 pandemic (Note 2), displaying conference documents produced by government and other public organizations, reports by private think tanks, and material from media outlets and researchers.
At an international symposium (Note 3) on March 23, 2020, I reported (1) a dramatic increase in sales of infection prevention products, such as masks, and the resulting shortages, (2) hoarding of food in preparation for school closures and requests to work from home, (3) panic buying of paper products triggered by misinformation spreading through social media and (4) a fall in cosmetics sales because people covered their faces with masks and stayed home more. These findings were made possible by the swift collection and detailed, quick reporting of big data. I continued to report consumption trends throughout the pandemic (Note 4).
Decomposing periods of the COVID-19 pandemic based on changes in sales trends for different goods
INTAGE's SRI+ nationwide retail store panel survey collects POS data from about 6,000 stores in Japan, including supermarkets, convenience stores, home centers, discount stores, drug stores and specialty shops. The data were classified into 344 product items, including food, beverages, day-to-day miscellaneous goods, cosmetics and pharmaceuticals, and ranked by sales value for each year.
Figure 1 is a scatterplot of rankings for a given year (vertical axis) against rankings for the previous year (horizontal axis). The size of each bubble is the square of the ranking difference. Product items whose rankings rose or fell at least 30 places are named. The upper right end of the 45-degree line represents the top-ranked product item in both years. The lower left end indicates 344th place in both years. Product items on the 45-degree line kept the same ranking as in the previous year. Items located above the line rose in the rankings from the previous year, and items located below the line fell.
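The quantities plotted in Figure 1 can be illustrated with a minimal sketch. It assumes that annual sales values per product item are available as simple dictionaries; the function names, the toy figures and the tie-breaking rule are hypothetical and are not taken from the SRI+ data or from the author's actual procedure.

```python
def rank_by_sales(sales):
    """Map each item to its rank by sales value (1 = highest).

    Ties are broken alphabetically, which is an assumption of this sketch.
    """
    ordered = sorted(sales, key=lambda item: (-sales[item], item))
    return {item: position for position, item in enumerate(ordered, start=1)}


def rank_changes(prev_sales, curr_sales, threshold=30):
    """Compare rankings for two years, item by item.

    change      > 0 means the item rose in the ranking, < 0 means it fell.
    bubble_size is the square of the ranking difference.
    labelled    is True when the item moved at least `threshold` places.
    """
    prev_rank = rank_by_sales(prev_sales)
    curr_rank = rank_by_sales(curr_sales)
    rows = []
    for item in sorted(prev_rank.keys() & curr_rank.keys()):
        change = prev_rank[item] - curr_rank[item]
        rows.append({
            "item": item,
            "prev_rank": prev_rank[item],
            "curr_rank": curr_rank[item],
            "change": change,
            "bubble_size": change ** 2,
            "labelled": abs(change) >= threshold,
        })
    return rows


# Hypothetical toy figures (not from the article's data).
prev_year = {"tea": 500, "beer": 400, "lipstick": 300, "milk": 200, "masks": 100}
curr_year = {"tea": 500, "masks": 450, "beer": 400, "milk": 200, "lipstick": 50}

# With only five items, a threshold of 2 places stands in for the 30 used in Figure 1.
rows = rank_changes(prev_year, curr_year, threshold=2)
for row in rows:
    if row["labelled"]:
        print(row["item"], row["change"], row["bubble_size"])
```

Squaring the difference makes large moves stand out visually while items that stay on the 45-degree line shrink to a point.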
In 2019, many product items were located on the 45-degree line. No category rose or fell 30 or more places. Many among the 344 product items are day-to-day necessities whose rankings were stable before the pandemic.
In 2020 when the pandemic started, however, rankings rose 30 places or more for infection prevention goods, such as masks, hand sanitizers, disinfectants, gargle medicines and thermometers, as well as soaps and wet tissue that were consumed as substitutes for infection prevention goods when such goods were in short supply. On the other hand, rankings fell 30 places or more for lipsticks, blushes, other lip makeup products, wet paper facial masks, sunscreen and gifts (mid-year, year-end, and other gifts). The ranking also dropped steeply for cardiotonic drugs that had sold well to inbound foreign tourists before the pandemic.
In 2021, no products rose 30 or more places. However, soaps, bactericidal disinfectants, hand sanitizers and gargle medicines, all of which had sold well in 2020, fell 30 or more places.
In 2022, almost all items were located close to the 45-degree line. The concentration around the line was even stronger than in 2019 before the pandemic. For most of the product items, rankings in 2022 were similar to those in 2021. In 2022, test reagents emerged as a distinctive product item.
Figure 1 shows that the consumption trend under the pandemic changed from the panic consumption period of 2020 to the new normal period of 2021-2022.
Graphing our consumption behavior, a flower emerged
Next, let's visually examine weekly ranking changes with a rank clock. Since it is difficult for human eyes to identify changes for all 344 product items over 209 weeks, I selected items that ranked in the top 20 in terms of sales in at least one of the 209 weeks since 2019 and compiled the data in Table 1.
Figure 2 graphs the weekly rankings of the product items in Table 1, with the top-ranked item at the center and lower-ranked items farther from the center. Table 1 shows that most of the product items in the top 20 rankings are food and beverage products. In particular, tobacco, beer, pastries, liquid tea, milk, ice cream and coffee drinks have stayed at or around the center.
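The selection rule behind Table 1 and the layout of the rank clock can be sketched as follows. This is an illustration only: the column describes rank 1 as sitting at the center, but the mapping of weeks to angles (one clockwise turn per 52 weeks, starting at 12 o'clock) is an assumption of this sketch, as are the function names and the toy rankings.

```python
import math


def ever_in_top(weekly_ranks, top_n=20):
    """Return items that reached the top `top_n` in at least one week."""
    return sorted(
        item for item, ranks in weekly_ranks.items() if min(ranks) <= top_n
    )


def rank_clock_points(ranks, weeks_per_turn=52):
    """Convert a weekly rank series into (x, y) points on a rank clock.

    Rank 1 is placed at the center and lower ranks lie farther out.
    The angle runs clockwise from 12 o'clock, one turn per
    `weeks_per_turn` weeks (an assumption of this sketch).
    """
    points = []
    for week, rank in enumerate(ranks):
        angle = 2 * math.pi * (week % weeks_per_turn) / weeks_per_turn
        radius = rank - 1
        points.append((radius * math.sin(angle), radius * math.cos(angle)))
    return points


# Hypothetical weekly rankings for three items (not the article's data).
weekly = {
    "masks": [200, 2, 15],
    "gifts": [300, 250, 40],
    "beer": [3, 4, 3],
}
selected = ever_in_top(weekly, top_n=20)
points = rank_clock_points(weekly["masks"])
```

Because each year is wrapped around the same circle in this layout, an item that sells well in the same weeks every year traces a petal-like loop toward the center.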
As data are like living creatures, the graph shape changes depending on the product items, locations and times. Figure 2 forms a unique flower shape, indicating that some items sell well during the year-end and New Year holidays, in particular seasons or around particular events, and poorly at other times. For example, demand increases for kamaboko and other fish paste products in January and other winter months, for chocolate towards Valentine's Day, for rhinitis treatments during spring and autumn hay fever periods, for batteries just before typhoons, and for insecticides in summer. Gifts are ranked low except during mid-year and year-end gift seasons. Using weekly data on detailed items, we can see clear seasonal changes.
As Table 1 shows, the items that ranked high because of the pandemic were masks and paper products. Masks ranked 200th in the summer before the pandemic and peaked in second place in the fourth week of January 2020, in the early days of the pandemic. Figure 1 shows that masks also ranked high in annual sales in 2020. By contrast, personal care products that sold well just before the consumption tax hike in 2019 and paper products that were panic-bought at the beginning of the pandemic ranked high in Table 1 but did not sustain those sales and do not stand out in the annual rankings in Figure 1. Weekly data thus reveal short-term shocks in addition to seasonal changes.
What will happen to mask consumption? - What will day-to-day life be like after the COVID-19 pandemic?
Finally, let's look at the impact of the pandemic on the seasonality of consumer behavior. Figure 3 shows the cumulative distribution function of mask sales volume, obtained by accumulating each week's share of annual sales. The 45-degree line represents a constant weekly sales volume, in which each week adds approximately 1.92% of annual sales (1/52 ≈ 0.0192).
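The curve in Figure 3 and its 45-degree reference line can be reproduced in a few lines. The sketch below assumes a list of 52 weekly sales volumes for one year; the function names and the toy series are hypothetical and only illustrate the calculation.

```python
from itertools import accumulate


def cumulative_shares(weekly_sales):
    """Cumulative share of annual sales reached by the end of each week."""
    total = sum(weekly_sales)
    if total <= 0:
        raise ValueError("annual sales must be positive")
    return [running / total for running in accumulate(weekly_sales)]


def uniform_reference(n_weeks):
    """The 45-degree line: every week contributes 1/n_weeks of annual sales."""
    return [(week + 1) / n_weeks for week in range(n_weeks)]


def max_gap(weekly_sales):
    """Largest absolute distance between the curve and the 45-degree line."""
    shares = cumulative_shares(weekly_sales)
    reference = uniform_reference(len(weekly_sales))
    return max(abs(s - r) for s, r in zip(shares, reference))


# Hypothetical toy series (not the article's data).
flat_year = [10] * 52                    # constant weekly sales
seasonal_year = [30] * 16 + [10] * 36    # sales concentrated early in the year

flat_curve = cumulative_shares(flat_year)
seasonal_curve = cumulative_shares(seasonal_year)
```

A year with constant weekly sales lies on the reference line, so `max_gap(flat_year)` is essentially zero, while a year with sales concentrated in certain weeks bows away from the line.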
In 2018 and 2019, before the pandemic, sales were concentrated in winter and in the spring hay fever season. Cumulative sales reached half of annual sales by the second or third week of April. Sales then stagnated in summer before some 30% of annual sales occurred in autumn and winter. Mask sales in those two years thus followed a clear seasonal pattern.
In 2020, cumulative sales topped 30% of annual sales in the week including the January 30 WHO global health emergency declaration and then stagnated amid mask shortages. Cumulative sales exceeded 50% at the end of the first state of emergency and increased at a constant pace thereafter as masks returned to the market.
Pay attention to the trends in 2021 and 2022. In both years, the curves mostly overlapped the 45-degree line, meaning that weekly sales stayed roughly constant throughout the year. The panic buying and mask shortages caused by the sudden increase in demand in 2020 disappeared in those two years. At the same time, the seasonal changes seen before the pandemic also disappeared, resulting in a new mask consumption pattern.
On May 8, 2023, COVID-19 was downgraded to a Category 5 infectious disease. Will mask consumption return to the pre-COVID pattern or create a new pattern? Either way, we can say that masks are the most representative goods of the pandemic, as the mask market was affected first and to the largest extent under the pandemic in Japan.
Towards a society where official statistical surveys and data platforms are both active
The use of consumption-related big data, including credit card information, home scan data, household account book application information and electronic money information, as well as POS data, progressed under the pandemic. I used POS data, which for example demonstrate a decrease in makeup product sales and an increase in mask sales, to analyze voluntary restrictions on going out and mask-wearing. More directly, however, human traffic data can be used for such purposes. Artificial intelligence image recognition is capable of measuring mask-wearing rates in public.
In the current age, where it is possible to learn anything in detail, anywhere and at any time, I feel that the most difficult thing is to decide what to measure and to continue measuring those things. The existence and use of official statistical surveys that continue to measure the status of a country based on long-term hypotheses is very encouraging, because findings from big data can be checked against the survey data that government agencies release one or two months, or even a year, after the big data become available, so that the trends can be verified. This intellectual foundation has allowed me to use big data freely and without hesitation.
Constraints on human resources and budgets have long limited the production of statistics, but I hope that official statistical surveys will continue to develop by mobilizing digital technologies, private sector data and administrative records, and by employing statistics experts. Because the government has the means to learn about business conditions and people’s lives and to produce statistics, its policies can be more persuasive and can rest on an intellectual basis. With this in mind, it is both beneficial and important to collect multiple types of data from multiple companies, rather than data from one company in one field, and to prepare, during normal times, platforms that support quick policy decisions.
May 16, 2023
>> Original text in Japanese
The translated version of this column has also been reposted on the CEPR VoxEU website.