Research Methodology Assignment


Pollution, economic growth, and COVID-19 deaths in India: a machine learning evidence

Assignment

Submitted to
Mr. Usman Ahmad Raza
Submitted by

Azka Basheer
{F17-1082}

Jalal khan mullah
{F17-1118}

Tauseef Ahmad
{F17-1071}

Department of Software Engineering


Lahore Leads University

[18-october-2020]
Pollution, economic growth, and COVID-19 deaths in India: a machine learning evidence

Abstract

This study explores the relationship between pollution emissions, economic growth, and COVID-19
deaths in India at two different levels. First, using a time series approach and annual data for the
years from 1980 to 2018, stationarity and Toda-Yamamoto causality tests were performed. The results
highlight a unidirectional causality from economic growth to pollution emissions. Then, a D2C
algorithm on proportion-based causality is applied, implementing the Oryx 2.0.8 protocol in Apache.
The underlying concept is that a predetermined pollution concentration, caused by economic growth,
could promote COVID-19 by making the respiratory system more susceptible to infection. We use data
(from January 29 to May 18, 2020) on confirmed deaths (total and daily) and air pollution
concentration levels for 25 major Indian cities. We verify an ML causal link between PM2.5, CO2, NO2,
and COVID-19 deaths. These results require careful policy design.

Keywords Economic growth. Pollution. COVID-19. Time series. Machine learning. India.

Introduction:

In recent years, India's economic growth has accelerated rapidly; the country shows one of the
highest growth rates among developing countries. Numerous production sectors have grown and, in
addition to the steel and metallurgical sectors, the textile and oil refining industries have also
expanded. This economic growth has increased the number of jobs in cities and the populations of
large urban areas, such as Delhi and Mumbai. However, the growth of towns and the consequent need
for more supplies have damaged the fragile environment of India, where there are high levels of smog,
fine dust, and water pollution.

Air pollution in India has worsened quickly with the increasing population, increasing number of
vehicles, increasing energy use, poor transport infrastructure, poor land use, industrialization, and
especially insufficient environmental regulations. According to Conibear et al. (2018), sulfur
dioxide (SO2), nitrogen dioxide (NO2), and particulate matter (PM) contribute in part to the toxins
causing environmental pollution. Many Indian cities, including Mumbai, Kolkata, and Pune, are at risk
from air pollutants. India's air pollution crisis is generally attributed to the poor winter air
quality in Delhi and in a few urban areas in north and central India (Awasthi et al. 2016).

According to Gurjar et al. (2016), air quality in India is so poor that about 1.2 million deaths can
be directly attributed to it. Gurjar et al. (2016) claim that one out of eight (about 12.5%) deaths
in 2017 were attributable to high rates of respiratory disease, stroke, heart disease, diabetes, and
lung cancer, all conditions for which a certain percentage of cases result from severe air pollution.
Reportedly, out of the 1.2 million deaths, about 51.4% were people below 70 years of age (Solgi and
Keramaty 2016). More than three quarters of the population in India are exposed to air pollution
levels higher than the minimum standard set by the Indian government, which is four times higher than
the limit set by the World Health Organization (WHO).

Table 1 List of variables

Variable   Description                                                    Source
PCGDP      GDP per capita in 2000 US$ (converted at Geary Khamis PPPs)    FRED
CO2        CO2 emissions (metric tons per capita)                         OGD
NO2        NO2 emissions (metric tons per capita)                         World Bank
PM2.5      Primary particulate matter (metric tons per capita)            OGD

Since March 2020, India has been affected, like much of the world, by the COVID-19 outbreak. In no
time, India has experienced an unprecedented increase in infections. The toll of the coronavirus
emergency in India has passed the threshold of 100 thousand cases. According to reports from Johns
Hopkins University, the number of cases in India has reached 101,139, while the deaths caused to
date by the virus total 3164. The country's Ministry of Health has confirmed the accuracy of these
data. The Indian Medical Research Council has announced that 2,404,267 tests have been carried out
nationwide. The rapid spread of the virus despite the lockdown measures enacted raises many questions.

In fact, many scholars are looking for a relationship between pollution and the spread of COVID-19.
The literature on this topic is very limited and the few existing studies are very recent. However,
the relevant scientific literature highlights that exposure to air pollution may be relevant to viral
infection more widely (Chen et al. 2010; Ye et al. 2016; Chen et al. 2017; Peng et al. 2020), and
more recent literature focuses specifically on COVID-19 (Conticini et al. 2020; Setti et al. 2020;
Wu et al. 2020). These latter studies concluded that air pollution is an important determinant of the
spread of COVID-19 infection. However, the feedback channel remains less investigated. Mitra et al.
(2020) studied the atmospheric carbon dioxide (CO2) levels for the city of Kolkata (India), comparing
April 2019 with April 2020. Focusing on Chinese provinces, Huang et al. (2020) analyzed the changes in
primary and secondary pollution emissions during the COVID-19 lockdown. Using a different method,
Wang et al. (2020) examined the influence on air pollution of the emission reductions caused by
reduced human activity during the COVID-19 outbreak in China.

Becchetti et al. (2020) analyzed the data of all the municipalities and all the Italian provinces,
both in terms of deaths and daily infections, in relation to pollution levels. In their study, the
significant causal variables for contagion and death with COVID-19 are represented by the combination
of three factors: the lockdown measures; the level of local pollution (especially fine dust, but also
NO2); and the types of local production structures, in particular activities that cannot be
digitalized and were therefore more resistant to closing down during the worst period of the epidemic
crisis. The study calculates the difference between the provinces most exposed to fine dust
(Lombardy) and least exposed (Sardinia) to be around 1200 cases and 600 deaths per month, a figure
that shows a doubling in mortality for the most exposed province. According to the research,
coronavirus infections were higher where air pollution was higher; however, the authors specify that
a causal link was not established. Becchetti et al. (2020) nevertheless discuss statistical
relevance, which suggests a strong correlation between pollution and infection/mortality.

Studies on the relationship between COVID-19 and pollution present statistical analyses but fail to
take into account the relationship with economic growth. Furthermore, these studies do not use the
most modern techniques, which are based on machine learning (ML) approaches. This paper, starting
from the underlying assumption that economic growth in developing countries creates pollution, first
verifies the causal link at an econometric level. It tests for the presence of causality between
economic growth and PM2.5, NO2, and CO2 with the Toda-Yamamoto test. Subsequently, the (short-term)
causal link between PM2.5, NO2, CO2 (resulting from unsustainable economic growth) and COVID-19
deaths is verified through a complex causality algorithm (D2C). The rest of this paper is organized
as follows: “Methods” presents the time series and ML methods; “Results” analyzes in detail the
results obtained by our algorithm; “Discussion” presents a discussion of the results; and
“Conclusions” reports our conclusions.
Table 2 Exploratory data analysis

Variable   Mean     SD       Minimum   Maximum   Ex. kurtosis   10-Trim   IQR
PCGDP      3.1942   1.9372   2.1931    8.1946     0.8745        3.95      1.6910
CO2        0.9164   0.1911   0.1975    1.9782     0.1794        0.86      0.4937
NO2        5.1946   0.1943   5.1416    6.5946    −0.9634        5.10      0.5795
PM2.5      0.8512   0.1845   0.1867    1.9245     0.1864        0.84      0.4765

Fig. 1 The ML process [diagram panels: dataset; processing (the variables are expanded); causality analysis; causality predicted; significance test; grouped under "Causality analysis" and "Machine learning"]

ML evidence:
Following Sundararajan et al. (2017), we defined an algorithm capable of identifying causal effects
between inputs with respect to one or more targets. We used an ML algorithm that could identify
causal effects between the variables. Hu et al. (2012) defined a D2C algorithm on proportion-based
causality using the Oryx 2.0.8 protocol in Apache. However, since an ML algorithm needs many
variables (remembering that the data are not treated as a time series), we applied mathematical
transformations. So, in addition to the general logarithmic transformation, we generated the square
of the considered values, the first difference, and the first difference computed in logarithmic
terms. In this way, our model calculates a combination of 37,040-11 variables with an artificial
intelligence approach. In practice, we adopted a strategy similar to Magazzino et al. (2020a, b, c)
and Mele and Magazzino (2020). We used a dataset with daily incremental variables in time series (not
considered as such by our neural networks) from January 29 to May 18, 2020. We performed the ML
analysis following the process shown in Fig. 1. This figure shows that, starting from our dataset, we
increased the number of variables through mathematical transformations to obtain the large dataset
necessary for our D2C algorithm. The causality model was then processed and we analyzed those
variables considered significant. Once the D2C commands were imported into the Oryx software, the
analysis generated the causalities mentioned above, typical of an ML process. Finally, we completed
the predictive linear regression test and performed a training test to verify the accuracy of the
algorithm.
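
The four transformations named above (logarithm, square, first difference, and first difference of logarithms) can be illustrated with a minimal Python sketch. It is not the authors' Oryx/D2C pipeline: the function name `expand_features`, the dictionary keys, and the sample PM2.5 readings are assumptions made only for illustration.

```python
import math


def expand_features(values):
    """Return the four transformations described in the text for one series.

    Keys are illustrative labels: "ln" (logarithm), "s" (square),
    "d" (first difference), "dln" (first difference of logarithms).
    """
    logs = [math.log(v) for v in values]  # requires strictly positive values
    squares = [v * v for v in values]
    first_diff = [b - a for a, b in zip(values, values[1:])]
    log_diff = [b - a for a, b in zip(logs, logs[1:])]
    return {"ln": logs, "s": squares, "d": first_diff, "dln": log_diff}


# Hypothetical daily PM2.5 readings, not taken from the study's dataset.
pm25 = [80.0, 100.0, 125.0]
features = expand_features(pm25)
```

Each differenced series is one observation shorter than its input, which has to be handled when the expanded variables are aligned into a single dataset.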

Methods:

Data source and strategy: time series analysis:


The econometric analysis aims to establish whether there is a causal relationship between pollution
(CO2 emissions, PM2.5, and NO2) and per capita gross domestic product (GDP). In our study, we use
annual data from 1980 to 2018. Table 1 shows the sources of the data used in our empirical analyses:
CO2 is CO2 emissions (metric tons per capita); PM2.5 is primary particulate matter (PM); NO2 is NO2
concentration levels; PCGDP is GDP per capita in 2000 US$. For this work, to avoid distortions in the
analysis, the values of the variables were expressed in logarithmic terms. Table 2 presents an
exploratory data analysis. Means are positive for all variables; 10-Trim values are near the means;
the interquartile range shows the absence of outliers. The correlation analysis shows that the
variables in our dataset are strongly correlated: corr(CO2, PCGDP) = 0.9745; corr(CO2, NO2) = 0.9875;
corr(PCGDP, NO2) = 0.9754; corr(CO2, PM2.5) = 0.9912; corr(PCGDP, PM2.5) = 0.9245; corr(NO2, PM2.5) =
0.9675, all statistically significant (p-value 0.000).
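
The summary statistics discussed above can be sketched in a few lines of Python. This is an illustration only: it assumes that "10-Trim" means a mean computed after dropping 10% of the observations from each tail, the function names are hypothetical, and the sample values are invented rather than taken from Table 2.

```python
import statistics


def trimmed_mean(values, proportion=0.10):
    """Mean after dropping `proportion` of the observations from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.fmean(kept)


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Invented sample with one extreme value to show the effect of trimming.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
trimmed = trimmed_mean(sample)            # ignores the values 1 and 100
spread = iqr([1, 2, 3, 4, 5, 6, 7, 8])
corr = pearson([1, 2, 3, 4], [2, 4, 6, 8])
```

A trimmed mean close to the ordinary mean, as reported in the text, indicates that extreme observations have little influence on the average.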
Table 3 Results for unit roots and stationarity tests

Variable   ADF                  ERS                  PP                   KPSS

Level
PCGDP      −2.765 (−3.124)      −1.008 (−3.120)      −2.120 (−3.040)      0.450*** (0.145)
CO2        −2.790 (−3.005)      −1.142 (−3.150)      −2.150 (−3.680)      0.375*** (0.145)
NO2        −1.195 (−3.086)      −1.790 (−3.710)      −3.746 (−3.785)      0.150*** (0.145)
PM2.5      −2.456 (−3.142)      −1.158 (−3.145)      −2.125 (−1.580)      0.350*** (0.145)

First differences
PCGDP      −6.350*** (−2.008)   −5.522*** (−2.950)   −6.488*** (−2.960)   0.314* (0.460)
CO2        −3.052*** (−2.378)   −1.792 (−2.622)      −8.350*** (−2.644)   0.240 (0.460)
NO2        −3.480** (−1.900)    −2.378*** (−2.005)   −8.005*** (−2.916)   0.110 (0.460)
PM2.5      −3.050** (−2.250)    −1.850 (−2.422)      −8.125** (−2.486)    0.250 (0.460)

*p < 0.1; **p < 0.05; ***p < 0.01. 5% critical values are given in parentheses.

Results:
Subsequently, tests (ADF, ERS, PP, and KPSS) were performed for each time series of each
variable, first on levels and then on the first differences. The tests failed to reject the null
hypothesis for all the variables relative to the 5% significance level, except for the KPSS test.
However, this last test, using a different approach, rejected the I (0) value at the 95%
confidence level, indirectly confirming the previous tests. To verify the causal relationship
between each of PM2.5, CO2, and NO2 and per capita economic growth, we used the Toda-Yamamoto
test. This is necessary to test for Granger non-causality while allowing causal
inference on a VAR that may or may not contain cointegrated processes. Table
4 presents the result of the test carried out on our historical data series.
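
The way the unit root and stationarity tests are read against their critical values can be sketched as follows. The sketch only encodes the standard decision rules (ADF-type tests are left-tailed with a unit-root null, KPSS is right-tailed with a stationarity null) and applies them to the PCGDP figures as printed in Table 3; the function names are hypothetical and nothing here re-estimates the tests.

```python
def adf_rejects_unit_root(stat, critical):
    # ADF-type tests are left-tailed: the unit-root null is rejected when
    # the statistic is more negative than the critical value.
    return stat < critical


def kpss_rejects_stationarity(stat, critical):
    # KPSS is right-tailed: the stationarity null is rejected when the
    # statistic exceeds the critical value.
    return stat > critical


def is_stationary(row):
    # Both tests must agree: ADF rejects a unit root, KPSS keeps stationarity.
    return (adf_rejects_unit_root(*row["ADF"])
            and not kpss_rejects_stationarity(*row["KPSS"]))


# PCGDP values as printed in Table 3: (statistic, 5% critical value).
level = {"ADF": (-2.765, -3.124), "KPSS": (0.450, 0.145)}
first_difference = {"ADF": (-6.350, -2.008), "KPSS": (0.314, 0.460)}

# Non-stationary in levels but stationary after differencing: I(1).
integrated_of_order_one = (not is_stationary(level)
                           and is_stationary(first_difference))
```

Because the two families of tests have opposite null hypotheses, a failure to reject in ADF together with a rejection in KPSS points to the same conclusion, which is why the text describes KPSS as indirectly confirming the other tests.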

The peculiarity of our result is that India registers a direction of causality. In particular,
there is a unidirectional causality from economic growth to PM 2.5, CO2, and NO2. The results
obtained confirm the hypothesis that the economic growth of a developing country behaves
like a bell curve. The relationship between economic development and environmental
sustainability is best represented by the so-called environmental Kuznets curve (EKC). The basis
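of this theory is the idea that the curve represents a mechanism according to which developing
countries pollute more, and reaching a mature stage of economic growth is essential to reducing
environmental damage.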
However, for India, the downward phase of the curve has not yet been observed. Polluting
emissions, therefore, still derive from unsustainable economic growth. Polluting emissions may
have influenced the spread of COVID-19 in Indian Territory, also causing the death of many
people. This statement requires an empirical verification through the most current
methodologies in ML. Therefore, as reported in the following section, we next estimated the
D2C algorithm, aiming to verify the causal link between polluting emissions and COVID-19
deaths. Table 5 presents the results of causality and significance tests to determine the
relationship between the variables of interest in the study. In the model, n-filtered factors were
used (which do not appear in the table), which performed the task of training the classification
of our model. The self-learning machine worked as explained here. It started from a set of
commands with functionality to be preset. Subsequently, as shown in the Appendix Table 6, we
sequentially imported various features and parameterized our variables from letter a to letter
m. Hence, ten classifiers were trained and tested to achieve the predictive causal link between
our variables. These ten classifiers worked through a binary calculation sequence, alternating
the values [0] with those of [1].
Table 4 Toda-Yamamoto causality test results

Dep. variable   PCGDP               CO2             NO2             PM2.5
PCGDP           -                   2.125 (0.250)   2.265 (0.300)   2.150 (0.450)
CO2             11.460*** (0.000)   -               0.607 (0.450)   0.845 (0.350)
NO2             11.159*** (0.000)   2.305 (0.280)   -               3.402 (0.450)
PM2.5           10.260*** (0.000)   3.120 (0.380)   2.848 (0.500)   -

*p < 0.1; **p < 0.05; ***p < 0.01

Table 5 Rank of predictor and significant causality results

Notes: the D2C core commands are listed in Appendix Table 6. AC: average causality value. AUPRC: area
under the precision-recall curve. True: p-value < 0.05. False: p-value ≥ 0.05. ln is the logarithmic
transformation; (s) is the square of the considered values; there are 7 unused variables.

These tests (ADF, ERS, PP, and KPSS) were performed first on the levels and then on the first
differences of each time series for every variable (Table 3). Except for the KPSS test, the tests
did not reject the null hypothesis for any variable at the 5% significance level. The KPSS test,
which uses a different approach, rejected I(0) at the 95% confidence level, confirming the
previous tests.

We used the Toda-Yamamoto test to verify the causal relationship between each of PM2.5, CO2,
and NO2 and per capita economic growth. The test is needed to test for Granger non-causality
while allowing causal inference on a VAR that may or may not contain cointegrated processes.
The result of the test carried out on our historical data series is presented in Table 4.

The distinctive feature of our results is that India shows a direction of causality. In particular,
there is a one-way causality from economic growth to PM2.5, CO2, and NO2. The results confirm the
assumption that the economic growth of a developing country behaves like a bell curve. The
relationship between environmental sustainability and economic development is best represented by
the so-called environmental Kuznets curve. The basis of this theory is the idea that the curve
represents a mechanism according to which developing countries pollute more. Reaching a mature
stage of economic growth is essential to reducing environmental damage. However, for India, the
downward phase of the curve has not yet been observed. Polluting emissions, therefore, still derive
from unsustainable economic growth. Pollution may have affected the spread of the coronavirus in
India and caused many deaths. This statement requires empirical verification. Therefore, as reported
in the following section, we next estimated the D2C algorithm, whose purpose is to confirm the
relationship between polluting emissions and COVID-19 deaths.

Table 5 presents the results of the causality and significance tests that determine the relationship
between the variables of interest in the study. In the model, n-filtered factors were used to train
the classification of our model. The self-learning machine worked as follows. It started from a set
of commands with functionality to be preset. Then, as shown in Table 6, we sequentially imported
various features and parameterized our variables from letter a to letter m. Ten classifiers were then
trained and tested to obtain the causal link between the variables. The ten classifiers worked
through a binary calculation over the values [0] and [1]. The algorithm worked by averaging over
19,000 loops for each causal combination between our variables. The closing percentage within the
calculation of the average was higher than 70%. Our algorithm completed every cycle for each
variable, for all pairs of values for which the average value was uniform. To assess the predictive
causality results, we parameterized the AUPRC, which was classified as true or false according to
whether the p-value was lower or higher than 5%. It turned out that the same relationship was
significant in the AUPRC. This attribute was due to an indirect cause from PM2.5 up to the death
rate. We then applied the importance test with deaths as the target (Fig. 2).
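
For readers unfamiliar with the AUPRC, the following minimal Python sketch shows one common way to estimate the area under the precision-recall curve, namely the average-precision sum. It is not the D2C or Oryx implementation; the function name and the toy labels and scores are assumptions used only to illustrate the metric.

```python
def average_precision(labels, scores):
    """Estimate the AUPRC as the mean precision at each true positive.

    `labels` holds 0/1 ground truth (for example, "causal link present"),
    `scores` holds the classifier's confidence for each case.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_positives


# Toy example: two true links among four candidates.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auprc = average_precision(labels, scores)
```

A value near 1 means that true links are consistently ranked above false ones; the true/false split by p-value reported in Table 5 is a separate significance step that this sketch does not reproduce.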

As shown in Fig. 2, PM2.5 emissions have the largest influence on the target variable. This
result confirms the finding from the D2C causality model. One way to assess the shortcomings
of our model is to analyze it with a linear regression (Fig. 3). The projection line confirms the
goodness of fit of the algorithm in the final architecture. The second stage of assessing the
goodness of the algorithm begins with the investigation of training performance. The
quasi-Newton method was used for training (Fig. 4). It is based on Newton's method, but does
not need calculation of second derivatives.
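
The idea behind a quasi-Newton method can be illustrated in one dimension, where it reduces to the secant method: the second derivative in Newton's update is replaced by an estimate built from two successive gradients. The sketch below is a toy example under that assumption, not the training routine used in the study; the loss function, starting points, and names are hypothetical.

```python
def secant_minimize(grad, w_prev, w_curr, tol=1e-10, max_iter=100):
    """Minimize a 1-D function using only its gradient (secant update)."""
    g_prev, g_curr = grad(w_prev), grad(w_curr)
    for _ in range(max_iter):
        if abs(g_curr) < tol:
            break
        gradient_change = g_curr - g_prev
        if gradient_change == 0:
            break  # no curvature information available
        # Finite-difference curvature estimate; Newton's method would use
        # the exact second derivative here instead.
        curvature = gradient_change / (w_curr - w_prev)
        w_next = w_curr - g_curr / curvature
        w_prev, g_prev = w_curr, g_curr
        w_curr, g_curr = w_next, grad(w_next)
    return w_curr


def loss_gradient(w):
    # Gradient of the toy loss (w - 2)^4 + (w - 2)^2, minimized at w = 2.
    return 4 * (w - 2) ** 3 + 2 * (w - 2)


w_star = secant_minimize(loss_gradient, 0.0, 0.5)
```

In several dimensions the same idea is carried out by methods such as BFGS, which maintain an approximation of the Hessian matrix built from successive gradients.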

Fig. 2 Importance test results. Source: our elaboration with BGML

Fig. 4 shows the training and selection errors at every iteration. The blue line shows the training
error and the orange line shows the selection error. The training error has an initial value of
6.78169 and its value after 29 epochs is 0.0781. The selection error has an initial value of 5.44918
and its final value after 108 epochs is 0.0008.
Discussion:
The results obtained from our D2C model show that there is a one-way causal link from PM2.5
concentration to COVID-19 deaths. This is an important result. It indicates that a high level of
fine particulate matter contributes to an increase in pandemic deaths, as we explain here.
Atmospheric PM is composed of solid and liquid particles of microscopic dimensions suspended
in the air.

The term PM2.5 covers all particles with an aerodynamic diameter less than or equal to 2.5 μm.
Their main sources are combustion processes of all kinds, including the engines of cars and
motorcycles, plants for the production of electricity, the burning of wood for household use,
and many industrial processes. These smaller particles can be inhaled, reaching the deepest
parts of the human respiratory system and the lungs. Inhaling them at high concentrations, even
for a few days, can cause health problems. Long-term exposure can cause coughing, reduced lung
or cardiac capacity, asthma, and other conditions.

Fig. 3 Predictive linear regression test. Source: our elaboration with NN designers

Fig. 5 PM2.5 concentration (μg/m3) in Delhi. Source: our elaboration on hourly data.

According to Wu et al. (2020), a 1 μg/m3 increase in air PM2.5 corresponds to a 15% increase in
the COVID-19 mortality rate. This is why patients who have been exposed to air pollution for a
long time account for more deaths, and why fewer coronavirus deaths are predicted in clean
areas. PM2.5 can therefore greatly aggravate the symptoms of COVID-19 infection, significantly
increasing the risk of mortality in patients affected by the virus.

The relationship between PM2.5 and COVID-19 deaths stems from India's economic growth processes.
As introduced above, India has one of the highest economic growth rates. This has led to an
increase in the number of jobs in the cities and in the population of the big cities. As a result
of this development and the need for more supplies, the environment has been damaged, with
high levels of smog and water pollution.
In particular, we believe that the main issue behind the COVID-19 deaths arises from the process
of social and economic urbanization. Since 2018, most people have lived in urban areas, and the
United Nations (UN) predicts that by 2030, 60% will do so. Urbanization is one of the main future
challenges in developing countries such as India. Migration from rural areas to cities is an
important factor in terms of pollution and environmental impact. In most cities, air quality falls
short of the limits established by the WHO (2018). The WHO itself reports that many countries,
such as India, are characterized by both overall economic growth and overall urban pollution.
In fact, Delhi currently holds the worst record in terms of emissions of polluting particles into
the atmosphere. In our database, we observed very high daily and hourly levels of PM2.5 in Delhi
(Fig. 5), with values above the limit recommended by the WHO (10 μg/m3). This high PM2.5
concentration could promote the spread of COVID-19 in India. Our model excludes the other
variables from having a direct causal link with COVID-19 deaths. Therefore, policy design should
support a rapid reduction of polluting particles, because their role as a vehicle for the virus
(2020) could raise the number of deaths across the country.

Conclusions:

This paper set out to study the relationship between economic growth, polluting emissions, and
COVID-19 deaths. To do so, we used two frameworks. The econometric framework confirmed the
unidirectional causal links between economic growth and PM2.5, CO2, and NO2. Our ML approach,
with the D2C algorithm, showed a direct connection between the concentration of PM2.5 and
COVID-19 deaths. These outcomes complement those of many recent papers, which discuss the
available data on the relationship between air pollution levels and the COVID-19 epidemic
(Bianchi and Cibella 2020; Cheng et al. 2020; Conticini et al. 2020; Magazzino et al. 2020b;
Schwartz et al. 2020). The specific attention paid to fine particulate matter is particularly
important, given its potential effects on the spread of the epidemic and the prognosis of
respiratory infections. The underlying hypothesis is that a high concentration of PM (PM10, PM2.5)
makes the respiratory system more susceptible to infection and to complications of coronavirus
disease. The higher and more constant the exposure to PM over time (as with the elderly), the
higher the probability that the respiratory system is predisposed to more severe disease. This air
pollution condition is worsened by rapid economic growth, as demonstrated by the time series
framework. Such air pollution is characteristic of developing countries and their intensive
polluting processes. Therefore, it is important to put in place a range of measures to reduce the
damage caused by pollution. They can be broken down as follows.

• To manage the emissions produced by motor vehicles, measures should be put in place to adopt
vehicles that emit less. Such measures would include the use of vehicles that consume less fossil
fuel, which results in lower emissions due to less combustion. There should also be support for
clean fuels with lower toxic emission capacity, to adequately cut toxic gas emissions. In the long
run, the country should consider adopting advanced technology that other countries have already
adopted, namely electrically driven vehicles. These are pollutant-free vehicles that need to
replace the existing fuel-dependent vehicles, in order to achieve lower emissions of gases such as
carbon monoxide and sulfur dioxide, which are released into the atmosphere. In the long term, this
can help reduce respiratory diseases, which carry a high cost of treatment for most of the
community.

• The way agricultural activities are carried out should be streamlined to meet modern and
environmentally friendly standards. Farmers should be educated in good practices, such as natural
farming, so that no pollution is released into the environment. Continued use of modern fertilizers
will harm the environment by contaminating the soil with chemicals. The use of manure should be
incorporated into current farming; farmers should obtain it from animals and chickens. Twigs and
bushes should be ground up and buried in the soil to build up natural humus for the harvest.
Burning should be avoided at all costs, as it also kills the living organisms that would otherwise
aerate the soil and maximize its output. The government could set aside funds to educate farmers on
this, and the air pollution that has always been witnessed will be a story of the past in the state
of Punjab and other states.

• Dust, which regularly causes smog during winter, can be minimized by increasing green cover in
major parts of the country. Places with bare land should be covered by planting either grass or
other land cover such as trees, to remove any source of dust. Planting trees is the better option,
as trees not only create a soil cover but also serve as windbreaks, stopping dust from moving from
one point to another. If this is achieved, the formation of smog during winter will no longer be
experienced. Forest fires, on the other hand, can be contained by putting in place measures that
curb wrongdoers, such as smokers near forest areas. Clear instructions should be displayed to the
public on the effects of causing a fire in a forest. Firefighting departments should also be set up
near forest areas and major towns to curb any possible fire incidents.

• Municipalities should create rules and guidelines on emissions and on emission limits for toxic
substances. Power plants should ensure that the gases released into the atmosphere are treated, so
that no dangerous substance is emitted. Such mechanisms will only be possible if the environmental
agency sets out measures that can guide the population and if standards are established. This will
go a long way towards ensuring that most of the harmful gases that spread respiratory diseases are
under control. Such control mechanisms will lead to an increase in life expectancy in India,
because severe illnesses like cancer and asthma will be considerably reduced. Power plants should
be inspected periodically to control the rate of pollution.

• Waste management should be controlled by setting up schemes that deal with waste treatment and
recycling. The long-term goal should be to ensure that landfills no longer exist in major towns and
that all waste is treated and reused. Coordinated management should do away with landfills and
ensure that every firm, industry, and corporation, and even every individual, treats their waste
before discarding it into the environment. Such measures will ensure that no toxins are disposed
of. Local management should also ensure that wrongdoers are penalized sternly to serve as examples.
Such penalties could include closing down manufacturing companies that discharge untreated waste.
Such mechanisms will deter other firms from discarding untreated waste. The money that
municipalities would spend on transporting waste from towns to landfills should instead be used to
treat the waste at the source.
