2.3 Visual Data Analysis
We also explore the time-series data through visual data analysis to
give a clear and understandable picture of the COVID-19 outbreak. This
section analyzes several time series using visual data analysis
approaches implemented in the R programming language. We first plot how
SARS-CoV-2 spread around the globe from 22 January 2020 to 17 April
2020, which helps readers grasp the epidemiological character of
COVID-19.
Figure 1. Global confirmed infected, recovered, death and active cases
as of 17 April 2020
Figure 1 indicates that confirmed infected cases have exceeded
2,000,000. Death, recovered and active cases are also shown. The number
of cases or deaths announced on a given day by any organization,
including WHO, ECDC, Johns Hopkins University and others, does not
necessarily equal the number of new cases or deaths that occurred on
that day. This is due to the long chain of reporting between a new case
or death and its inclusion in national or international statistics.
The steps in this chain vary among countries, but for many countries the
reporting chain contains several of the following steps:
1. A doctor or laboratory diagnoses a case of COVID-19 on the basis of a
   test or a combination of symptoms and epidemiological likelihood
   (such as a family member having tested positive).
2. The doctor or laboratory sends a report to the health department of
   the city or district.
3. The health department receives the report and records each individual
   case, including patient details, in the reporting system.
4. The ministry or another government agency gathers these data and
   releases the latest figures.
5. International data organizations such as the WHO or the ECDC can then
   compile statistics from hundreds of these national accounts.
This reporting chain can take several days to complete. This is why the
numbers published on any given date do not generally reflect the number
of new cases or deaths on that particular date. What we know is the
total number of confirmed deaths due to COVID-19 to date. Limited
research and difficulties in classifying the cause of death mean that
the number of confirmed deaths might not be an accurate count of the
actual total number of deaths from COVID-19. In an ongoing epidemic, the
final outcome (death or recovery) is not yet known for all cases. The
time from symptom onset to death for COVID-19 varies from 2 to 8 weeks
(https://github.com/CSSEGISandData/COVID-19, 2020). This means that some
people who are currently infected with COVID-19 will die at a later
date. As discussed below, this needs to be kept in mind when comparing
the current number of deaths with the current number of cases.
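The effect of this delay on a deaths-to-cases comparison can be
illustrated with a short sketch. The paper's analysis was carried out in
R; the Python code below is an independent illustration, not the
authors' method. The function names, the 14-day lag (the lower end of
the 2 to 8 week range quoted above), the 2% fatality share and the
synthetic series are all assumptions made for the example.

```python
def naive_ratio(cum_cases, cum_deaths):
    """Deaths to date divided by cases to date (ignores the delay to death)."""
    return cum_deaths[-1] / cum_cases[-1]


def lagged_ratio(cum_cases, cum_deaths, lag_days):
    """Deaths to date divided by the cases already reported lag_days ago."""
    if not 0 <= lag_days < len(cum_cases):
        raise ValueError("lag_days must be within the length of the series")
    return cum_deaths[-1] / cum_cases[-1 - lag_days]


# Synthetic, illustrative series (NOT real data): cases double every 5 days,
# and 2% of cases are assumed to die exactly 14 days after being reported.
LAG = 14  # assumed lag in days
cum_cases = [100 * 2 ** (t / 5) for t in range(40)]
cum_deaths = [0.02 * cum_cases[t - LAG] if t >= LAG else 0.0 for t in range(40)]

naive = naive_ratio(cum_cases, cum_deaths)
lagged = lagged_ratio(cum_cases, cum_deaths, LAG)
print(f"naive: {naive:.4f}, lag-adjusted: {lagged:.4f}")
```

On this synthetic series the naive ratio falls well below the assumed 2%,
because today's deaths are being divided by a case count that has kept
growing during the delay.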
Regression and generalized linear models (GLMs) are fitted to the
COVID-19 time series to analyze confirmed infected, death and recovered
cases. The fitted models yielded statistically significant coefficients;
the graphical findings shown below cover all three case types in the
USA. For confirmed cases, the exponential model coefficients are -0.807
and 0.17, the GLM Poisson model coefficients are 3.469 and 0.119, and
the GLM Gamma model coefficients are -0.433 and 0.17, all of which are
statistically significant, as shown in Figure 2. For deaths, the
exponential model coefficients are -2.774 and 0.144 and the GLM Poisson
model coefficients are -2.424 and 0.151, all of which are statistically
significant, as shown in Figure 3. For recovered cases, the exponential
model coefficients are -2.204 and 0.137 and the GLM Poisson model
coefficients are -2.864 and 0.163, all of which are statistically
significant, as shown in Figure 4.
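A minimal sketch of how an exponential model of this kind can be fitted
is given here in Python (the paper itself used R and does not list its
code). It assumes, without confirmation from the source, that the model
has the form log(count) = a + b * day and that the two reported
coefficients are the intercept a and the daily slope b. The function
names are hypothetical, and the input is a noise-free synthetic series
generated from the reported US confirmed-case coefficients, not the real
data.

```python
import math


def fit_exponential(counts):
    """Ordinary least squares fit of log(count) = a + b * day; returns (a, b)."""
    # Days with a zero count are skipped because log(0) is undefined.
    pts = [(t, math.log(y)) for t, y in enumerate(counts) if y > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxz = sum((t - mean_t) * (z - mean_z) for t, z in pts)
    b = sxz / sxx
    a = mean_z - b * mean_t
    return a, b


def doubling_time(b):
    """Days needed for the count to double under growth rate b per day."""
    return math.log(2) / b


# Synthetic series (NOT real data) built from the coefficients quoted in the text.
counts = [math.exp(-0.807 + 0.17 * t) for t in range(30)]
a, b = fit_exponential(counts)
print(f"a = {a:.3f}, b = {b:.3f}, doubling time = {doubling_time(b):.2f} days")
```

Under these assumptions, a slope of 0.17 per day would correspond to a
doubling time of about 4.1 days.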
Figures 2, 3 and 4 below display the confirmed infected, death and
recovered case counts in the United States.
Figure 2. US – Confirmed infected case
Figure 3. US – Death case
Figure 4. US – Recovered case
Figures 2, 3 and 4 show that confirmed infected, death and recovered
cases all increase exponentially; the same trend is reflected in the
upper part of each graph, i.e. the output of the linear and generalized
linear models.
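For readers unfamiliar with the GLM Poisson model, the following Python
sketch fits log(E[count]) = a + b * day by iteratively reweighted least
squares, the standard fitting scheme for GLMs. The source names only the
model family, so the log link, the parameterization and the function
names are assumptions, and this is not the authors' R code. The input is
a noise-free synthetic series generated from the US confirmed-case
Poisson coefficients quoted earlier, so the values are not integers and
not real data.

```python
import math


def weighted_line(t, z, w):
    """Weighted least squares fit of z = a + b * t; returns (a, b)."""
    sw = sum(w)
    mean_t = sum(wi * ti for wi, ti in zip(w, t)) / sw
    mean_z = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (ti - mean_t) ** 2 for wi, ti in zip(w, t))
    sxz = sum(wi * (ti - mean_t) * (zi - mean_z) for wi, ti, zi in zip(w, t, z))
    b = sxz / sxx
    return mean_z - b * mean_t, b


def fit_poisson_glm(counts, max_iter=25, tol=1e-10):
    """Poisson regression with log link on the day index; returns (a, b)."""
    t = list(range(len(counts)))
    # Start from the data themselves (shifted so that log is defined at zero).
    eta = [math.log(y + 0.1) for y in counts]
    a = b = 0.0
    for _ in range(max_iter):
        mu = [math.exp(e) for e in eta]
        # Working response; for the log link the working weights equal mu.
        z = [e + (y - m) / m for e, y, m in zip(eta, counts, mu)]
        a_new, b_new = weighted_line(t, z, mu)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        eta = [a + b * ti for ti in t]
        if converged:
            break
    return a, b


# Synthetic series (NOT real data) built from the coefficients quoted in the text.
counts = [math.exp(3.469 + 0.119 * t) for t in range(30)]
a, b = fit_poisson_glm(counts)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Unlike a least squares fit on log counts, this approach works on the
counts directly, so days with zero cases do not have to be dropped.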
Figures 5, 6 and 7 show confirmed, death and recovered cases in Spain.
Here, too, the counts have risen exponentially; the same trend is
reflected statistically in the upper part of each chart. For confirmed
cases, the exponential model coefficients are -2.278 and 0.185 and the
GLM Poisson model coefficients are 4.159 and 0.093, all of which are
statistically significant, as shown in Figure 5. For deaths, the
exponential model coefficients are -2.919 and 0.152 and the GLM Poisson
model coefficients are 1.329 and 0.104, all of which are statistically
significant, as shown in Figure 6.
Figure 5. Spain – Confirmed case
Figure 6. Spain – Deaths case
For recovered cases, the exponential model coefficients are -2.876 and
0.165 and the GLM Poisson model coefficients are 0.914 and 0.124, all of
which are statistically significant, as shown in Figure 7. During an
outbreak of an infectious disease, it is important to monitor not only
the number of deaths but also the rate at which that number is growing.
If the number of deaths increases by a fixed amount over a fixed period,
we call that "linear" growth. But if the number keeps doubling within a
fixed time span, we call that "exponential" growth.
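The distinction between the two growth types can be made concrete with a
small Python sketch. The function names and the numbers are hypothetical
and chosen only for illustration; they are not taken from the data.

```python
def linear_series(start, step, n_days):
    """Count that grows by the same absolute amount every day."""
    return [start + step * t for t in range(n_days)]


def exponential_series(start, doubling_days, n_days):
    """Count that doubles every doubling_days days."""
    return [start * 2 ** (t / doubling_days) for t in range(n_days)]


lin = linear_series(100, 50, 10)        # +50 per day
expo = exponential_series(100, 3, 10)   # doubles every 3 days

# Linear growth: the day-to-day difference is constant.
lin_steps = [cur - prev for prev, cur in zip(lin, lin[1:])]
# Exponential growth: the ratio over one doubling period is constant (2).
expo_ratios = [expo[t + 3] / expo[t] for t in range(len(expo) - 3)]

print(lin_steps)
print([round(r, 6) for r in expo_ratios])
```

The linear series is characterized by a constant difference between
days, while the exponential series is characterized by a constant ratio
over a fixed period.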
Figure 7. Spain – Recovered case
Figures 8, 9 and 10 show the rate of growth in confirmed infected, death
and recovered cases in the US. The rate of growth in deaths indicates
that deaths in the US are growing exponentially.
Figure 8. US – Rate of growth in confirmed case
Figure 9. US – Rate of growth in deaths case
Figure 10. US – Rate of growth in recovered case
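The source does not state how the rate of growth plotted in these
figures was computed. One common definition, assumed here purely for
illustration, is the relative day-over-day change in the cumulative
count. The Python function below is a hypothetical sketch of that
definition, not the authors' R code.

```python
def daily_growth_rate(cumulative):
    """Relative change from each day to the next: (today - yesterday) / yesterday.

    Returns None for days where the previous count is zero, because the
    relative change is undefined there.
    """
    rates = []
    for prev, cur in zip(cumulative, cumulative[1:]):
        rates.append((cur - prev) / prev if prev > 0 else None)
    return rates


# Toy cumulative series (NOT real data) that grows by 10% per day.
toy = [100, 110, 121]
print(daily_growth_rate(toy))
```

Under this definition, a series that grows as exp(b * day) has a
constant rate of exp(b) - 1, whereas for a linearly growing series the
rate declines over time.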
Figures 11, 12 and 13 show that the growth rate of confirmed infected,
death and recovered cases has risen in Spain. Looking at the rate of
growth in deaths, we can see that growth is exponential in the US, where
the last daily change is 31,451 as of 17 April 2020.
Figure 11. Spain – Rate of growth in confirmed case
Figure 12. Spain – Rate of growth in deaths case
Figure 13. Spain – Rate of growth in recovered case
Figure 14 shows the daily changes in confirmed cases between 23 January
2020 and 15 April 2020 for the USA and Spain. From this we conclude that
reported cases in the US increase exponentially from 20 March 2020, and
that the last daily change in the US is 31,451. In Spain, confirmed
cases rise linearly from 03 March 2020 to 15 April 2020, and the last
daily change is 7,304.
Figure 14. US vs Spain - Changes per day
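Per-day changes of the kind plotted in Figure 14 can be derived from a
cumulative series by taking first differences. The Python sketch below
shows the idea; the function name and the toy numbers are hypothetical,
and the paper's own computation in R is not shown in the source.

```python
def daily_changes(cumulative):
    """First differences of a cumulative series: change from each day to the next."""
    return [cur - prev for prev, cur in zip(cumulative, cumulative[1:])]


# Toy cumulative confirmed counts (NOT real data).
toy_cumulative = [1, 1, 2, 5, 5, 11]
changes = daily_changes(toy_cumulative)
last_change = changes[-1]  # the "last daily change" in the sense used above
print(changes, last_change)
```

A negative value in the result would indicate that the cumulative count
was revised downward on that day.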
Figure 15. Confirmed, death, recovered and active cases for the globe,
the US and Spain
Figure 15 shows confirmed, death and recovered cases for the United
States and Spain, along with the global time series for these cases,
between 22 January 2020 and 17 April 2020. The chart clarifies the
findings above, especially with regard to the United States and Spain.
In the same way, our approach can estimate the number of cases for any
given country and can compare the different case types between
countries. For example, Figure 16 shows confirmed infected, death,
recovered and active cases across the globe and in the United States,
France, China, Spain, Germany, Italy and India. This figure shows that
confirmed, death, recovered and active cases are all comparatively low
in India as of 17 April 2020.
Figure 16. Confirmed, death, recovered and active cases for the globe,
the US, France, China, Spain, Germany, Italy and India
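The source does not define how active cases are obtained. A common
convention, assumed here, is to subtract deaths and recoveries from the
confirmed count on each day. The Python sketch below illustrates that
convention with hypothetical names and toy numbers; it is not the
authors' R code.

```python
def active_cases(confirmed, deaths, recovered):
    """Active cases per day, assuming active = confirmed - deaths - recovered."""
    if not (len(confirmed) == len(deaths) == len(recovered)):
        raise ValueError("all three series must cover the same days")
    return [c - d - r for c, d, r in zip(confirmed, deaths, recovered)]


# Toy cumulative series for three consecutive days (NOT real data).
confirmed = [100, 150, 220]
deaths = [2, 4, 7]
recovered = [10, 25, 60]
print(active_cases(confirmed, deaths, recovered))
```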
It is clear that the real-time analysis of these data is extremely
useful in documenting the epidemiological behavior of this severe
disease. We believe that this method of data analysis will certainly
boost understanding of the situation and inform behavior.