2.3 Visual Data Analysis
We also explore the time-series data using visual data analysis to give a clear and understandable picture of this extreme outbreak of COVID‐19. This section analyzes various time-series data using several visual data analysis approaches with the R programming language. We have created graphs that show how SARS‐CoV‐2 spread around the globe from 22 January 2020 to 17 April 2020, allowing readers to grasp the epidemiological essence of COVID‐19.
Figure 1. Global confirmed infected, recovered, deaths and active cases as at 17-04-2020
Figure 1 indicates that confirmed infected cases have surpassed 2,000,000. Deaths, recoveries and active cases are also shown. The new cases reported on a single day do not actually represent the new cases occurring on that day: the number of confirmed infected cases or deaths announced by any organization (including WHO, ECDC, Johns Hopkins University and others) does not reflect the total number of new cases or deaths on that day. This is due to the long chain of reporting between a new case or death and its inclusion in national or international statistics.
The steps in this chain vary among countries, but for many countries the reporting chain contains several of the following steps:
1. A doctor or laboratory diagnoses a case of COVID-19 on the basis of a test or a combination of symptoms and epidemiological likelihood (such as a family member having tested positive).
2. The doctor or laboratory sends a report to the health department of the city or district.
3. The health department receives the report and records each individual case, including patient details, in the reporting system.
4. The ministry or another government agency gathers these data and releases the latest figures.
5. International data organizations such as the WHO or the ECDC can then compile statistics from hundreds of these national reports.
This reporting chain can take several days to complete, which is why the numbers published on any given date do not generally reflect the number of new cases or deaths on that particular date. For confirmed deaths, what we know is the total number of confirmed deaths due to COVID-19 to date. Limited research and difficulties in classifying the cause of death mean that the number of confirmed deaths might not be an accurate count of the actual total number of deaths from COVID-19. In an ongoing epidemic, the final outcome (death or recovery) is not yet known for all cases. The time from symptom onset to death for COVID-19 varies from 2 to 8 weeks (https://github.com/CSSEGISandData/COVID-19, 2020). This means that some people who are currently infected with COVID-19 will die at a later date. As discussed below, this needs to be kept in mind when comparing the current number of deaths with the current number of cases.
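To illustrate why the delay between infection and death matters when deaths are compared with cases, the following minimal Python sketch contrasts a naive death-to-case ratio with one that divides today's deaths by the case count from some days earlier. The function name, the `lag_days` parameter and the counts are all hypothetical and chosen only for illustration; they are not taken from the data set analyzed here.

```python
def naive_and_lagged_ratio(cum_cases, cum_deaths, lag_days):
    """Return (naive, lagged) death-to-case ratios for the latest day.

    naive  = deaths today / cases today
    lagged = deaths today / cases `lag_days` days ago
    """
    if len(cum_cases) != len(cum_deaths):
        raise ValueError("series must have the same length")
    if not 0 <= lag_days < len(cum_cases):
        raise ValueError("lag_days must be smaller than the series length")
    deaths_today = cum_deaths[-1]
    naive = deaths_today / cum_cases[-1]
    lagged = deaths_today / cum_cases[-1 - lag_days]
    return naive, lagged


# Made-up cumulative counts, for illustration only (not real data).
cases = [100, 200, 400, 800, 1600, 3200]
deaths = [0, 1, 2, 4, 8, 16]

naive, lagged = naive_and_lagged_ratio(cases, deaths, lag_days=3)
print(f"naive ratio:  {naive:.3f}")
print(f"lagged ratio: {lagged:.3f}")
```

While cases are still rising quickly, the naive ratio is lower than the lagged one, because many of the people counted as cases today have not yet reached a final outcome.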
Regression and generalized linear models (GLMs) are fitted to the COVID-19 time-series data to analyze confirmed infected, death and recovered cases. The fitted models fit the data well statistically; the figures below show all three cases for the USA. For confirmed cases, the exponential model coefficients are -0.807 and 0.17, the GLM Poisson model coefficients are 3.469 and 0.119, and the GLM Gamma model coefficients are -0.433 and 0.17, all of which are statistically significant, as shown in Figure 2. For deaths, the exponential model coefficients are -2.774 and 0.144 and the GLM Poisson model coefficients are -2.424 and 0.151, both of which are statistically significant, as shown in Figure 3. For recovered cases, the exponential model coefficients are -2.204 and 0.137 and the GLM Poisson model coefficients are -2.864 and 0.163, both of which are statistically significant, as shown in Figure 4.
Figures 2, 3 and 4 below display the confirmed infected, death and recovered case counts in the United States.
Figure 2. US – Confirmed infected case
Figure 3. US – Death case
Figure 4. US – Recovered case
From Figures 2, 3 and 4, we can see that confirmed infected, death and recovered cases all increase exponentially. The same trend is reflected in the upper part of each graph, which shows the output of the linear and generalized linear models.
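As a rough illustration of the two main model types reported above, the sketch below fits an exponential model by ordinary least squares on the log scale and a Poisson GLM with a log link by iteratively reweighted least squares. It is written in plain Python rather than R, is not the code used for this analysis, and assumes that each reported coefficient pair is an intercept and a per-day slope on the log scale. The counts are synthetic, and the Gamma GLM is not covered.

```python
import math


def fit_log_linear(t, y):
    """OLS fit of log(y) = a + b*t; returns (a, b). Requires all y > 0."""
    n = len(t)
    logs = [math.log(v) for v in y]
    t_mean = sum(t) / n
    l_mean = sum(logs) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - l_mean) for ti, li in zip(t, logs))
    b = sxy / sxx
    a = l_mean - b * t_mean
    return a, b


def fit_poisson_glm(t, y, iterations=25):
    """Poisson GLM with log link, E[y] = exp(a + b*t), fitted by IRLS."""
    # Starting values from the log-linear fit (zeros are lifted to 0.5).
    a, b = fit_log_linear(t, [max(v, 0.5) for v in y])
    for _ in range(iterations):
        mu = [math.exp(a + b * ti) for ti in t]
        # Working response; the IRLS weights equal mu for this model.
        z = [a + b * ti + (yi - mi) / mi for ti, yi, mi in zip(t, y, mu)]
        sw = sum(mu)
        swt = sum(w * ti for w, ti in zip(mu, t))
        swtt = sum(w * ti * ti for w, ti in zip(mu, t))
        swz = sum(w * zi for w, zi in zip(mu, z))
        swtz = sum(w * ti * zi for w, ti, zi in zip(mu, t, z))
        # Solve the 2x2 weighted least-squares normal equations.
        det = sw * swtt - swt ** 2
        a = (swtt * swz - swt * swtz) / det
        b = (sw * swtz - swt * swz) / det
    return a, b


# Synthetic daily counts (illustrative only, not the COVID-19 data).
days = list(range(30))
counts = [round(math.exp(1.0 + 0.15 * d)) for d in days]

print("exponential model:", fit_log_linear(days, counts))
print("Poisson GLM:      ", fit_poisson_glm(days, counts))
```

The two fits generally give similar slopes but can differ in their intercepts, because the log-linear fit weights all days equally on the log scale while the Poisson fit gives more weight to days with larger counts.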
Figures 5, 6 and 7 show confirmed, death and recovered cases in Spain. Here, too, the counts have risen exponentially; the same trend is reflected statistically in the upper part of each chart. For confirmed cases, the exponential model coefficients are -2.278 and 0.185 and the GLM Poisson model coefficients are 4.159 and 0.093, both of which are statistically significant, as shown in Figure 5. For deaths, the exponential model coefficients are -2.919 and 0.152 and the GLM Poisson model coefficients are 1.329 and 0.104, both of which are statistically significant, as shown in Figure 6.
Figure 5. Spain – Confirmed case
Figure 6. Spain – Deaths case
For recovered cases, the exponential model coefficients are -2.876 and 0.165 and the GLM Poisson model coefficients are 0.914 and 0.124, both of which are statistically significant, as shown in Figure 7. During an outbreak of an infectious disease, it is important to monitor not only the number of deaths but also the rate at which that number is growing. If the number of deaths rises by a fixed amount over a fixed period, we call that "linear" growth. If instead the number keeps doubling within a fixed time span, we call that "exponential" growth.
Figure 7. Spain – Recovered case
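The distinction between linear and exponential growth drawn above can be made concrete with a short Python sketch. The function names and the starting values are illustrative assumptions, not part of the analysis; `doubling_time` further assumes a count that grows as exp(rate × t), so that the doubling time is ln(2) divided by the rate.

```python
import math


def linear_growth(start, step, days):
    """Count that rises by a fixed `step` each day."""
    return [start + step * d for d in range(days)]


def exponential_growth(start, doubling_days, days):
    """Count that doubles every `doubling_days` days."""
    return [start * 2 ** (d / doubling_days) for d in range(days)]


def doubling_time(rate_per_day):
    """Days needed to double when the count grows as exp(rate_per_day * t)."""
    return math.log(2) / rate_per_day


# Both series start at 100; values are illustrative only.
linear = linear_growth(100, 100, 10)
exponential = exponential_growth(100, 3, 10)
for day, (lin, exp_value) in enumerate(zip(linear, exponential)):
    print(day, lin, round(exp_value))
```

In this example the linear series is ahead for the first several days, but the exponential series keeps doubling every three days and must eventually overtake any linear series, however steep.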
Figures 8, 9 and 10 show the rate of growth of confirmed infected, death and recovered cases in the US. The rate of growth in deaths indicates exponential growth in the US.
Figure 8. US – Rate of growth in confirmed case
Figure 9. US – Rate of growth in deaths case
Figure 10. US – Rate of growth in recovered case
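The text does not state exactly how the rate of growth in Figures 8 to 10 is computed. One common definition, assumed here purely for illustration, is the day-over-day relative change in the cumulative count; the function name and sample values below are hypothetical.

```python
def growth_rates(cumulative):
    """Day-over-day relative change of a cumulative series.

    Returns one value per pair of consecutive days; the entry is None
    when the previous day's count is zero (the ratio is undefined).
    """
    rates = []
    for prev, cur in zip(cumulative, cumulative[1:]):
        rates.append((cur - prev) / prev if prev else None)
    return rates


# Made-up cumulative counts, for illustration only.
print(growth_rates([100, 150, 300, 450]))
```

Under this definition, a roughly constant positive growth rate corresponds to exponential growth, while a rate that shrinks towards zero corresponds to linear or slower growth.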
Figures 11, 12 and 13 show that the growth rate of confirmed infected, death and recovered cases has risen in Spain. Looking at the rate of growth in deaths, we can see exponential growth in the US, where the last daily change is 31,451 as of 17 April 2020.
Figure 11. Spain – Rate of growth in confirmed case
Figure 12. Spain – Rate of growth in deaths case
Figure 13. Spain – Rate of growth in recovered case
Figure 14 shows the daily changes in confirmed cases between 23 January 2020 and 15 April 2020 for the USA and Spain. From this we conclude that reported cases in the US increased exponentially from 20 March 2020, with a last daily change of 31,451. In Spain, confirmed cases rose linearly from 03 March 2020 to 15 April 2020, with a last daily change of 7,304.
Figure 14. US vs Spain - Changes per day
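The changes per day plotted in Figure 14 can be derived from a cumulative series by differencing consecutive days. The sketch below shows one way to do this in Python; the function name and the sample counts are illustrative assumptions and not the reported data.

```python
def daily_changes(cumulative):
    """New cases per day, computed as differences of a cumulative series."""
    return [cur - prev for prev, cur in zip(cumulative, cumulative[1:])]


# Made-up cumulative confirmed counts, for illustration only.
cumulative_confirmed = [1, 4, 9, 16, 30]
changes = daily_changes(cumulative_confirmed)
print("daily changes:", changes)
print("last daily change:", changes[-1])
```

The last element of the result corresponds to what the text calls the last day of change.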
Figure 15. Confirmed, deaths, recovered and active cases in Globe, US and Spain
Figure 15 shows confirmed, death and recovered cases for the United States and Spain, along with the global time-series data for these cases, between 22 January 2020 and 17 April 2020. The chart adds clarity to the findings above, especially with regard to the United States and Spain. In the same way, our proposed approach can estimate the number of cases for any given country and can also compare the different case types between countries. For example, Figure 16 shows confirmed infected, death, recovered and active cases across the globe and in the United States, France, China, Spain, Germany, Italy and India. From this figure, all case types (confirmed, deaths, recovered and active) can be seen to be minimal in India at present (17 April 2020).
Figure 16. Confirmed, deaths, recovered and active cases in Globe, US, France, China, Spain, Germany, Italy, and India
It is clear that real-time analysis of these data is extremely useful in documenting the epidemiological behavior of this severe disease. We believe that this method of data analysis will improve understanding of the situation and help inform behavior.