
Understanding the Societal Impact of COVID-19 during its Early Days

Introduction

COVID-19 (also known as the novel coronavirus) is a global pandemic that has affected people in all countries of the world. While humanity has seen numerous epidemics over the last two decades, including a number of deadly ones (e.g., SARS, MERS, Ebola), the grief and disruption that COVID-19 will inflict are incomparable (and perhaps unimaginable). At the time of writing, COVID-19 is still spreading rapidly around the world, and projections for the next few months are grim and extremely disconcerting.

With no cure in sight, and with the possibility that COVID-19 will reemerge one or more times even after the world manages to contain this first outbreak, it is critical that we understand and analyze the socio-economic disruptions of the first outbreak so that we are better prepared to handle later ones. Additionally, with the ever-increasing mobility of humans and goods, it is only prudent to assume that such epidemics are likely to occur in the future. The learnings from COVID-19 will also help humankind prevent such epidemics from turning into global pandemics and minimize the socio-economic disruption they cause.

Key Findings

In this article, our goal is to analyze the socio-economic disruption caused by COVID-19 in the United States of America, understand the chain of events that occurred during the spread of the infection, and draw meaningful conclusions so that similar mistakes can be avoided in the future.

We collect 530,206 tweets from Twitter between March 14 and March 24, a period when the virus spread significantly in the US, and quantitatively demonstrate the socio-economic disruption and distress experienced by people. Calls for closures started with schools (e.g., #closenycschools), then moved on to bars and restaurants (e.g., #barsshut), and finally to entire cities and states (e.g., #lockdownusa). While these calls were initially confined mainly to the Seattle, Bay Area, and NY regions (e.g., #seattleshutdown, #shutdownnyc), they later expanded to other parts of the country (e.g., #shutdownflorida, #vegasshutdown). At the same time, panic buying and hoarding escalated, with essential items, particularly toilet paper, becoming unavailable in stores (e.g., #panicbuying, #toiletpapercrisis).

We observe increased calls for social distancing, quarantining, and working from home to limit the spread of the disease (e.g., #socialdistancingnow, #workfromhome). To slow the exponential increase in the number of infections, people also rallied for flattening the curve and staying at home for extended periods (e.g., #flattenthecurve). The challenges of working from home also surface in communications (e.g., #stayhomechallenge). With the passage of time, we see increasing fluctuation in emotions, with some people expressing anger at individuals flouting social distancing calls (e.g., #covidiots), while others rally people to fight the disease (e.g., #fightback) and to save workers (e.g., #saveworkers).

Figure 1: Top 20 COVID Hashtags
Figure 2: Most trending hashtags by day

We group the hashtags into six main categories, namely 1) General COVID, 2) Quarantine, 3) School Closures, 4) Panic Buying, 5) Lockdowns, and 6) Frustration and Hope, to quantitatively and qualitatively understand the chain of events. Figure 1 shows the top 20 hashtags observed in our data. As expected, hashtags corresponding directly to COVID or coronavirus are the most popular, as most communications are centered around them. Hashtags around social isolation, staying at home, and quarantining are also popular. Figure 2 shows the most popular hashtags by date. Similar to Figure 1, we observe that hashtags related directly to COVID and social distancing trend most on Twitter. The figures and the number of tweets highlight how quickly the pandemic gripped the United States.

Figure 3a) General COVID and Quarantine
Figure 3b) School Closures and Panic Buying
Figure 3c) Lockdowns and Frustration and Hope

We investigate how the number of tweets in each hashtag group evolves over time. To calculate the number of tweets in a hashtag group, we count the mentions of hashtags in that group across all tweets. If a tweet contains more than one hashtag, it is counted once for each hashtag mentioned in it. As the number of tweets varies significantly across hashtag groups, we plot groups with similar numbers of tweets together. Consistent with Figure 1, we observe from Figure 3a that the total number of tweets in the General COVID and Quarantine categories is relatively high throughout the study period.
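The counting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the study. The HASHTAG_GROUPS mapping, the assignment of individual hashtags to groups, the (date, text) input format, and the choice to count a repeated hashtag only once per tweet are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical hashtag-to-group mapping. The article does not publish its
# full lists, so these assignments are illustrative assumptions.
HASHTAG_GROUPS = {
    "#flattenthecurve": "Quarantine",
    "#workfromhome": "Quarantine",
    "#closenycschools": "School Closures",
    "#panicbuying": "Panic Buying",
    "#toiletpapercrisis": "Panic Buying",
    "#lockdownusa": "Lockdowns",
    "#covidiots": "Frustration and Hope",
}

HASHTAG_RE = re.compile(r"#\w+")


def count_group_mentions(tweets):
    """Count hashtag mentions per group per day.

    tweets: iterable of (day, text) pairs.
    Returns {day: Counter({group: mentions})}.
    """
    counts = defaultdict(Counter)
    for day, text in tweets:
        # Assumption: a hashtag repeated inside one tweet counts once.
        tags = set(HASHTAG_RE.findall(text.lower()))
        for tag in tags:
            group = HASHTAG_GROUPS.get(tag)
            if group is not None:
                # A tweet with several hashtags contributes to each of them.
                counts[day][group] += 1
    return counts
```

Plotting the per-day counters for groups of similar volume on shared axes would then give curves of the kind shown in Figures 3a to 3c.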

Interestingly, from Figure 3b, we observe that panic buying and calls for school closures peak around the middle of March and then decrease as school closures and rationing of many essential goods such as toilet paper, cereal, and milk take effect. From Figure 3c, we see that calls for lockdowns related to schools, bars, and cities peak in the middle of March. With the virus spreading unabated, we observe intense calls for lockdowns of cities and entire states around the beginning of the fourth week of March, resulting in an increased number of tweets in this category. With the passage of time, we observe people increasingly expressing their frustration and distress in communications, while some hashtags attempt to inject a more positive outlook.

Figure 4: General COVID word frequency
Figure 5: Panic buying word frequency

We next present a linguistic analysis of the tweets in the different hashtag groups and identify the words that are representative of each group. We observe that words such as family, life, health, and death are common across hashtag groups. We observe mentions of mental health, a possible consequence of social isolation. We also observe solidarity with essential workers and gratitude towards them (#saveworkers). We then present the most semantically meaningful and uniquely identifying words in each hashtag group. To do this, we remove the common words identified in the step above from each group and select the top 10 words from the filtered list. Figures 4 and 5 give the uniquely identifying and semantically meaningful words in the General COVID and Panic Buying hashtag groups. In the General COVID group, we find words such as impact, response, resource, and doctor. The top Panic Buying words mostly reflect the shortages people experienced during this time, such as roll and tissue (referring to toilet paper), hoard, bidet (as an alternative to toilet paper), wipe, and water. Similarly, for School Closures, we find words such as teacher, schedule, educator, book, and class. Top words in the Lockdowns group include immigration, shelter, safety, court, and petition, signifying the different issues surrounding lockdowns.
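The filtering step can be sketched as follows. This is a hedged illustration under stated assumptions, not the study's implementation. The article does not say how common words are determined, so the sketch assumes a word is common when it ranks among the common_top_k most frequent words of every group. The function name, both parameters, and the already-tokenized input are hypothetical.

```python
from collections import Counter


def distinctive_words(group_tokens, common_top_k=50, top_n=10):
    """Find words common to all groups and the top distinctive words per group.

    group_tokens: {group: list of already-cleaned tokens}.
    Returns (common_words, {group: top_n words not in common_words}).
    """
    freqs = {group: Counter(tokens) for group, tokens in group_tokens.items()}

    # Assumption: "common" means present in the top common_top_k of every group.
    top_sets = [
        {word for word, _ in counter.most_common(common_top_k)}
        for counter in freqs.values()
    ]
    common = set.intersection(*top_sets) if top_sets else set()

    distinctive = {}
    for group, counter in freqs.items():
        # Walk words from most to least frequent, skipping the common ones.
        ranked = [word for word, _ in counter.most_common() if word not in common]
        distinctive[group] = ranked[:top_n]
    return common, distinctive
```

With real tweets, the tokens would first need lowercasing, stop-word removal, and probably lemmatization. Those steps are left out here because the article does not describe its preprocessing.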

Conclusion

Our preliminary study unearths and summarizes the critical public responses surrounding COVID-19, paving the way for more insightful fine-grained linguistic and graph analysis in the future. 
