Coronavirus Disease Symptom Search Query Correlation with Confirmed Cases

Coronavirus disease (COVID-19), declared an international pandemic by the World Health Organization, is known for its broad spectrum of symptoms, which affect individuals in a variety of ways ranging from mild sickness to severe illness [1]. The rapid, uncontrolled spread of COVID-19 is a source of major concern for the public; nevertheless, Americans have taken the initiative to learn more about the disease by searching for information online.

The internet is increasingly used as a source of health care information because it is widely accessible to the United States population. Internet-based research on user-contributed health content is known as “infodemiology”, and its purpose is to improve overall public health. Infodemiology analyzes the public’s online search behavior to contribute to research informatics in the public health domain. The analysis in this paper relies on two datasets: the Google COVID-19 Search Trends symptoms dataset and the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University.

Without much federal guidance, individual state governments had to take charge of managing the spread of COVID-19. Because COVID-19 was known early on to be an airborne disease, reducing human interaction was identified at the start of the pandemic as a key way to slow the rate of infections [5]. COVID-19 is transmitted through the coughing or sneezing of infected individuals, so the virus can either settle on uncleaned surfaces or pass directly into another person’s respiratory system [4]. Surgical face masks were found to be one of the most effective methods of reducing the rate of COVID-19 infections because they limit how far respiratory particles travel when the wearer breathes or coughs [6]. On average, people infected with COVID-19 do not feel symptoms until 5 to 6 days after infection, and in some cases symptoms take up to 14 days to appear. During this period, infected people are contagious and pose a particular risk because they are unaware that they may have the disease. This paper focuses on Orange County, California, and asks: how did search queries for the common COVID-19 symptoms cough, fever, and pneumonia correlate with confirmed coronavirus infections?


The study population is Google users: about 220 million Americans (60% of the total United States population) use Google search, which means this study does not reflect the entirety of the United States. Google Trends draws on the massive volume of search data processed every day to let users identify search volume for specific topics in regions around the world. In light of the COVID-19 pandemic, Google released a public dataset of COVID-19 symptom search trends to help researchers and the general public study the United States. Google Trends normalizes search data per state within the United States. Each data point is divided by the total search volume for the location and date recorded, so that values can be compared in terms of relative popularity. Results are reported on a scale from 0 to 100 based on the search topic’s ratio to all other searches on Google. The time frame of January 2020 to June 2020 was selected to examine the overall impact of COVID-19 on Google search results; six months was enough time to see how dramatically the data changed before and during the global pandemic. Moreover, the Google COVID-19 Search Trends Symptoms Dataset was cross-referenced with the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) COVID-19 dataset to track how search query trends correlate with confirmed COVID-19 cases. While correlation does not imply causation, connecting the two datasets allows for a more holistic understanding of how symptom-related search trends are reflected in confirmed cases of COVID-19.
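The normalization described above can be illustrated with a minimal sketch. It is not Google's actual pipeline: the function name `normalize_search_counts` and the raw counts in the usage note are hypothetical, and rescaling so that the largest share equals 100 is an assumption about how the 0 to 100 scale is anchored.

```python
def normalize_search_counts(symptom_counts, total_counts):
    """Turn raw symptom search counts into relative popularity on a 0-100 scale.

    symptom_counts[i] is the (hypothetical) number of searches for one symptom
    in period i, and total_counts[i] is the total search volume in that period.
    """
    if len(symptom_counts) != len(total_counts):
        raise ValueError("both series must cover the same periods")

    # Step 1: divide each data point by the total search volume for the
    # same location and date, giving the symptom's share of all searches.
    shares = [
        symptom / total if total else 0.0
        for symptom, total in zip(symptom_counts, total_counts)
    ]

    # Step 2 (assumption): rescale so the largest share maps to 100.
    peak = max(shares, default=0.0)
    if peak == 0.0:
        return [0.0 for _ in shares]
    return [share / peak * 100.0 for share in shares]
```

For example, calling `normalize_search_counts([10, 20, 5], [1000, 1000, 1000])` rescales the three periods relative to the busiest one, so a period with half the peak share receives half the peak score. Dividing by total volume first is what makes periods with different overall search activity comparable.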

The goal of the current analysis was to examine search trends and confirmed COVID-19 cases over a 6-month period. Three key search terms were used to investigate confirmed COVID-19 cases: “cough”, “fever”, and “pneumonia”. These were found to be the most common symptoms related to the virus. By evaluating these 3 key search terms, confirmed COVID-19 cases can be connected to symptom trends in Google’s public dataset.


The following results focus on the Google dataset’s search prevalence values; the cross-referencing with the Johns Hopkins dataset is presented in the Discussion.

The results began with a minor peak in symptom search trends (Cough: 10.66, Fever: 5.96, Pneumonia: 2.73) around the end of January, when worldwide interest in the coronavirus was sparked after an outbreak was declared in Wuhan, China. At the start of January, the values for the 3 COVID-19 symptoms were Cough: 12.11, Fever: 5.52, Pneumonia: 2.01; the month ended with Cough: 10.02, Fever: 5.76, Pneumonia: 2.12. Search prevalence for cough and fever stayed relatively high until a drop-off at the end of the month, which was followed by an increase in both at the start of February. February saw highs of 10.3 for Cough, 6.81 for Fever, and 2.54 for Pneumonia. These results were relatively modest in comparison to the spike seen in March, when search prevalence values hit their highest levels of the study period.

In March, Cough spiked to 16.42, Fever to a record high of 19.97, and Pneumonia to 4.63, all around the time the global pandemic was officially announced. April began a downward trend for all 3 symptoms as the common symptoms became better known to the general public. At the start of April, the values for the 3 COVID-19 symptoms were Cough: 7.75, Fever: 9.31, Pneumonia: 2.19; the month ended with Cough: 3.78, Fever: 4.61, Pneumonia: 1.05. In May, there was an unusual peak in the search prevalence for Fever at 6.83 in the first week of the month. The results generally began to stabilize in May, with the month concluding at Cough: 2.71, Fever: 3.77, Pneumonia: 0.66. June saw a hint of another upward trend near the end of the month, with search prevalence values of 4.85 for Cough, 6.36 for Fever, and 1.12 for Pneumonia.


In this study, a significant increase in confirmed COVID-19 cases coincided with the peaks in COVID-19 symptom search queries in March. Google Trends has shown that it can be used to identify early trending topics in the case of the novel coronavirus; however, the correlation fell off when interest in the pandemic diminished. Had the relationship remained consistent throughout the entire 6 months, the results would have been compelling enough to justify further research over longer periods. The search data correlated poorly with the confirmed COVID-19 cases dataset from Johns Hopkins University in April, when search prevalence started to trend downward and confirmed cases started to trend upward. This doesn’t necessarily show that the search query data is unconnected to confirmed cases, though, because people typically search for symptoms when they start experiencing them. The United States’ inability to provide free COVID-19 testing with quick processing times to the general public caused many discrepancies in the data that make the connection between search queries and confirmed cases difficult to see. Most of the symptoms appear to have been queried a couple of weeks before the confirmed case spike seen in the Johns Hopkins dataset, which suggests that infected people tried to self-diagnose before being confirmed positive for COVID-19.
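One way to probe the suggestion that searches lead confirmed cases by a couple of weeks is a lagged correlation: pair search prevalence at time t with case counts at time t + lag and see which lag gives the strongest Pearson correlation. The sketch below is illustrative only and is not the method used in this paper; the function names and any series passed to them are hypothetical, and both series are assumed to be equally spaced and aligned by date.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(searches, cases, lag):
    """Correlate searches at time t with cases at time t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if len(searches) != len(cases):
        raise ValueError("series must be aligned and of equal length")
    if lag == 0:
        return pearson(searches, cases)
    # Drop the last `lag` search points and the first `lag` case points
    # so that each search value is paired with the case value `lag` steps later.
    return pearson(searches[:-lag], cases[lag:])


def best_lag(searches, cases, max_lag):
    """Return the lag in 0..max_lag with the highest lagged correlation."""
    return max(
        range(max_lag + 1),
        key=lambda lag: lagged_correlation(searches, cases, lag),
    )
```

With weekly data, a `best_lag` of 2 would be consistent with searches preceding confirmed cases by about two weeks. As with any correlation, a high value at some lag does not establish that the searches came from people who later tested positive.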

The Orange County public’s interest in the 3 major symptoms of COVID-19 (cough, fever, and pneumonia) peaked simultaneously in search query trends on March 12th, 2020. This closely follows the World Health Organization’s (WHO) official declaration on March 11th that the world was in a global pandemic. Shortly after that declaration, confirmed cases of COVID-19 in Orange County began to rise rapidly in late March and early April. Confirmed cases increased by 419, from 187 to 606, between March 25th and April 1st, a span of just a single week.
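The one-week jump in confirmed cases can be put in relative terms with a short calculation. The sketch below uses only the two counts reported above (187 and 606); the function name `case_change` and the 7-day window are illustrative assumptions.

```python
def case_change(start_cases, end_cases, days=7):
    """Summarize the change between two cumulative case counts."""
    if start_cases <= 0 or days <= 0:
        raise ValueError("start_cases and days must be positive")
    added = end_cases - start_cases
    return {
        "added": added,                                   # new cases in the window
        "percent_increase": added / start_cases * 100.0,  # growth relative to the start
        "per_day": added / days,                          # average new cases per day
    }


# Orange County, March 25th to April 1st (counts taken from the text).
summary = case_change(187, 606)
```

Here `summary["added"]` is 419, which corresponds to an increase of roughly 224% over the starting count and an average of about 60 new confirmed cases per day.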

Orange County wasn’t able to provide free tests until the summer of 2020. This lack of testing directly limited the confirmed COVID-19 case count and helps explain why the two graphs (Figure 1 and Figure 2) don’t directly correlate with one another. The novel coronavirus has disproportionately affected certain age groups and people with underlying health conditions, and the data reflect a hospitalization rate of about 4.6 per 100,000 population during March 1-28, 2020, when cases were just beginning to show up [2]. In the general population of Orange County, there are 98 men for every 100 women (98:100), which means there are slightly more women in the community. To gain insight into those who are most at risk, additional research from peer-reviewed journals found that adult men with chronic comorbidities are more likely to be infected due to their compromised immune systems [3]. Even taken together, these results and analyses cannot account for all of the asymptomatic cases that could have been transmitted to adults, young adults, or even children. Overall, the results showed a weak correlation between COVID-19 symptom searches and confirmed COVID-19 cases after the initial announcement of a global pandemic, which means the search trends were not viable for long-term research practices.


After evaluating both datasets, the final conclusion is that COVID-19 symptom query trends and confirmed cases of COVID-19 were initially strongly correlated, but the correlation weakened as time passed after the initial declaration of the global pandemic. Search prevalence for the top 3 coronavirus symptoms did not grow concurrently with the confirmed cases, although results were initially promising, as mentioned in the Discussion section. The end of June saw a slight upward trend in search prevalence, but not nearly enough to match the continuously growing number of confirmed COVID-19 cases. Overall, the data suggest that search query trends are only temporarily valuable, since general interest in any topic dies down as time passes. Despite the temporary value of this data, the field of infodemiology is still growing within public health. Ultimately, this information can still prove valuable when it comes to predicting future global health crises.


  1. Sanyaolu, Adekunle, et al. “Comorbidity and Its Impact on Patients with COVID-19.” SN Comprehensive Clinical Medicine, Springer International Publishing, 25 June 2020.
  2. Garg, S., Kim, L., Whitaker, M., et al. “Hospitalization Rates and Characteristics of Patients Hospitalized with Laboratory-Confirmed Coronavirus Disease 2019 - COVID-NET, 14 States, March 1–30, 2020.” Centers for Disease Control and Prevention, 16 Apr. 2020.
  3. Liu, Kai, et al. “Clinical Features of COVID-19 in Elderly Patients: A Comparison with Young and Middle-Aged Patients.” The Journal of Infection, The British Infection Association. Published by Elsevier Ltd., June 2020.
  4. Peeri, Noah C., et al. “The SARS, MERS and Novel Coronavirus (COVID-19) Epidemics, the Newest and Biggest Global Health Threats: What Lessons Have We Learned?” International Journal of Epidemiology, Oxford University Press, 1 June 2020.
  5. Zhang, Renyi, et al. “Identifying Airborne Transmission as the Dominant Route for the Spread of COVID-19.” PNAS, National Academy of Sciences, 30 June 2020.
  6. Leung, Nancy H. L., et al. “Respiratory Virus Shedding in Exhaled Breath and Efficacy of Face Masks.” Nature News, Nature Publishing Group, 3 Apr. 2020.

Cite this paper

Coronavirus Disease Symptom Search Query Correlation with Confirmed Cases. (2022, July 08). Edubirdie. Retrieved April 20, 2024, from