

Faculty of economics and management

(Semestral project)

Subject: Statistics IB
Academic year: 2020/2021
Study programme: Bachelor
Author: Alen Ospanov

Descriptive statistics
Statistical method
Chi^2 testing
Polynomial regression
Air pollution is one of the main environmental issues today, and there are
many reasons why it keeps increasing. Most air pollution is caused by
automobiles and other means of transport, industrialization, growing cities,
etc. The release of harmful gases and dangerous substances from such sources
pollutes the whole atmosphere. The ozone layer is also badly affected by air
pollution, which causes serious disturbances to the environment. The
increasing needs of the ever-growing human population are the main cause of
pollution. Daily human activities release dangerous chemicals, making the
atmosphere dirtier than ever and driving negative climate change.
Industrial processes release many harmful gases and particles, paint and
batteries contain lead, cigarettes release carbon monoxide, and means of
transport release CO2 and other toxic substances into the atmosphere. All these
pollutants come into contact with the atmosphere, damaging the ozone layer and
letting harmful rays of the sun reach the earth. In order to reduce the level
of air pollution, we should make big changes to our daily habits. We should not
cut down trees, we should use public transportation and avoid spray cans, and
we should take up other activities that help to reduce the effects of air
pollution.

The purpose of my work is to analyze the Pollution Index of the cities in the
list below for the year 2021.

City, Country Pollution Index

Helsinki, Finland 13,34
Vienna, Austria 17,55
Ljubljana, Slovenia 23,21
Bishkek, Kyrgyzstan 74,54
Milan, Italy 67,21
Kiev, Ukraine 65,62
Barcelona, Spain 65,21
Paris, France 64,36
Warsaw, Poland 60,25
Minsk, Belarus 41,12
Berlin, Germany 39,15
Almaty, Kazakhstan 78,74
Tirana, Albania 88,01
Tbilisi, Georgia 74,82
Istanbul, Turkey 69,11

I took all the data from this website

Descriptive statistics

n 15
min 13,34
max 88,01
m 3,872983346
h 19,2797111

mean 33,74485555
mode 146,3742444
median 121,7246208

variance 16767,27998
standard deviation 129,4885322

n = number of observations in the dataset
min = the lowest value in the dataset
max = the highest value in the dataset
m = number of class intervals (the square root of n)
h = size of class interval, (max - min) / m
mean = the average value of the dataset
mode = the most commonly occurring value in the dataset, according to the formula
median = the value in the middle of the dataset, according to the formula
variance = the average of the squared differences from the mean
standard deviation = the square root of the variance, a measure of the typical distance of the values from the mean
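
The definitions above can be tried out directly on the Pollution Index values from the table. The sketch below is only an illustration in Python: it uses the plain (ungrouped) formulas and assumes that the number of classes is the square root of n, so its mean and median need not match values obtained with grouped-data formulas. Decimal commas are written as points in the code.

```python
import math
import statistics

# Pollution Index values copied from the table in this report
pollution_index = [
    13.34, 17.55, 23.21, 74.54, 67.21, 65.62, 65.21, 64.36,
    60.25, 41.12, 39.15, 78.74, 88.01, 74.82, 69.11,
]

n = len(pollution_index)
lowest = min(pollution_index)
highest = max(pollution_index)

# Assumption: number of class intervals m = sqrt(n), interval size h = range / m
m = math.sqrt(n)
h = (highest - lowest) / m

# Ungrouped statistics; variance and standard deviation divide by n (population form)
mean = statistics.mean(pollution_index)
median = statistics.median(pollution_index)
variance = statistics.pvariance(pollution_index)
std_dev = statistics.pstdev(pollution_index)

for label, value in [("n", n), ("min", lowest), ("max", highest), ("m", m),
                     ("h", h), ("mean", mean), ("median", median),
                     ("variance", variance), ("standard deviation", std_dev)]:
    print(f"{label}: {value}")
```

The values of n, min, max, m and h printed by this sketch can be compared with the first rows of the table in this section.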
Statistical method


Multiple R 0,997620045
R Square 0,995245754
Adjusted R Square 0,994880042
Standard Error 7,401007756
Observations 15

R Square is a statistical measure that represents the proportion of the variance
of a dependent variable that is explained by an independent variable or variables
in a regression model.

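For comparison, in investing a high R-squared, between 85% and 100%, indicates
that a stock or fund moves closely in line with its index. Here R Square is
0,995, so about 99,5% of the variance of the dependent variable is explained by
the model.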

            df   SS            MS           F            Significance F
Regression  1    149064,3261   149064,326   2721,39763   1,732E-16
Residual    13   712,0739054   54,7749158
Total       14   149776,4
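
The regression statistics can be recomputed from the sums of squares in the ANOVA table. The following Python sketch only illustrates the standard formulas (R Square = SS regression / SS total, F = MS regression / MS residual); the variable names are my own, and decimal commas are written as points.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table
ss_regression = 149064.3261
ss_residual = 712.0739054
df_regression = 1
df_residual = 13

ss_total = ss_regression + ss_residual
n_obs = df_regression + df_residual + 1      # 15 observations

# Share of the total variation explained by the regression
r_square = ss_regression / ss_total

# Adjusted R Square penalises the number of regressors
adjusted_r_square = 1 - (1 - r_square) * (n_obs - 1) / df_residual

# Mean squares, F statistic and standard error of the regression
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual
standard_error = ms_residual ** 0.5

print(f"R Square          = {r_square:.9f}")
print(f"Adjusted R Square = {adjusted_r_square:.9f}")
print(f"F                 = {f_statistic:.5f}")
print(f"Standard Error    = {standard_error:.9f}")
```

The printed values can be checked against the R Square, Adjusted R Square, Standard Error and F entries reported in this section.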

                 Coefficients   Standard Error  t Stat       P-value     Lower 95%   Upper 95%    Lower 95,0%  Upper 95,0%
Intercept        411,5561869    5,112203478     80,5046569   6,2561E-19  400,511943  422,600431   400,511943   422,600431
Pollution Index  -4,405327226   0,084446601     -52,167017   1,732E-16   -4,587763   -4,2228914   -4,587763    -4,2228914
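
The t Stat and the 95% limits in the coefficient table follow from each coefficient and its standard error. The Python sketch below shows the usual calculation; the critical value 2,160369 (two-sided 95% quantile of the t distribution with 13 degrees of freedom) is taken from standard tables and is an assumption, not a number stated in this report.

```python
# (coefficient, standard error) pairs copied from the coefficient table
coefficients = {
    "Intercept": (411.5561869, 5.112203478),
    "Pollution Index": (-4.405327226, 0.084446601),
}

# Assumption: two-sided 95% t quantile for 13 residual degrees of freedom
T_CRITICAL = 2.160369

results = {}
for name, (coef, se) in coefficients.items():
    t_stat = coef / se                 # how many standard errors from zero
    lower = coef - T_CRITICAL * se     # lower 95% confidence limit
    upper = coef + T_CRITICAL * se     # upper 95% confidence limit
    results[name] = (t_stat, lower, upper)
    print(f"{name}: t = {t_stat:.4f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

A large absolute t Stat corresponds to a small P-value, which is why both coefficients in the table have P-values far below 0,05.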

Chi^2 testing
Class  LL           UL           ni  xi          xi*ni       (xi-average)^2*ni  pi    pi*n (expect. freq.)  Chi^2
1      96,42        115,6997111  3   106,059856  318,179567  4186,1             0,11  16,4862043            11,03211526
2      115,6997111  134,9794222  3   125,339567  376,0187    980,1              0,24  34,771145             29,02998035
3      134,9794222  154,2591333  2   144,619278  289,238555  2,9                0,32  46,3718862            42,45814533
4      154,2591333  173,5388444  8   163,898989  1311,19191  3357,0             0,22  32,0247121            18,02316881
Total                            16              2294,62873  8526,0             0,90                        100,5434097 (test statistic)

average 143,4142958
variance 532,8772048
standard dev. 23,08413318
critical value (CV) 7,814727903
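
The Chi^2 column can be reproduced from the observed frequencies ni and the expected frequencies pi*n with the formula (ni - pi*n)^2 / (pi*n). The Python sketch below is an illustration under two assumptions that the report does not state explicitly: that the CV row is the critical value of the chi-square distribution (3 degrees of freedom, significance level 0,05), and that H0 says the data follow the theoretical distribution used for pi.

```python
# Observed (ni) and expected (pi*n) frequencies copied from the table
observed = [3, 3, 2, 8]
expected = [16.4862043, 34.771145, 46.3718862, 32.0247121]

# Assumption: the CV row is the chi-square critical value for 3 df at alpha = 0.05
CRITICAL_VALUE = 7.814727903

# Contribution of each class interval to the test statistic
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
test_statistic = sum(contributions)

# H0 is rejected when the test statistic exceeds the critical value
reject_h0 = test_statistic > CRITICAL_VALUE

for i, c in enumerate(contributions, start=1):
    print(f"class {i}: {c:.5f}")
print(f"test statistic = {test_statistic:.5f}, reject H0: {reject_h0}")
```

Under these assumptions the test statistic is far above the critical value, so H0 would be rejected.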

Polynomial regression

Regression Statistics
Multiple R 0,997846094
R Square 0,995696828
Adjusted R Square 0,994979633
Standard Error 1,659634318
Observations 15

            df   SS            MS           F            Significance F
Regression  2    7647,94046    3823,97023   1388,32035   6,34939E-15
Residual    12   33,0526328    2,75438607
Total       14   7680,99309

           Coefficients   Standard Error  t Stat       P-value     Lower 95%     Upper 95%    Lower 95,0%  Upper 95,0%
Intercept  103,0965383    8,8222532       11,6859646   6,4946E-08  83,87449987   122,318577   83,8744999   122,3185768
Rank       -0,255761891   0,02695213      -9,4894883   6,2896E-07  -0,31448553   -0,1970383   -0,3144855   -0,197038251
x^2        -0,001350983   0,00120456      -1,1215547   0,28399373  -0,003975501  0,00127353   -0,0039755   0,001273534

y = 103,1 - 0,256x - 0,0014x^2 + u (x = Rank)
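
The fitted equation can be used to calculate the predicted value of y for a given x. The Python sketch below uses the unrounded coefficients from the coefficient table and takes x to be Rank, as the table labels it; the rank values in the loop are hypothetical and serve only as an illustration. The error term u is left out because it is not known for a new observation.

```python
# Coefficients copied from the polynomial regression table
INTERCEPT = 103.0965383
B_RANK = -0.255761891
B_RANK_SQUARED = -0.001350983


def predict(rank: float) -> float:
    """Return the fitted value y = b0 + b1*x + b2*x^2 for x = rank."""
    return INTERCEPT + B_RANK * rank + B_RANK_SQUARED * rank ** 2


# Hypothetical rank values, chosen only to show how the function is used
for rank in (1, 50, 100):
    print(f"rank {rank}: predicted y = {predict(rank):.2f}")
```

Both slope coefficients are negative, so for positive ranks the predicted y decreases as the rank increases.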

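Intercept - expected value of the dependent variable if the independent variable
is equal to zero.
The Significance F and the P-values of the intercept and Rank are lower than the
significance level (0,05, matching the 95% intervals above), so for them we
reject H0 and accept H1. The P-value of the x^2 term (0,284) is higher than the
significance level, so this term is not statistically significant.
About 99,6% (R Square = 0,9957) of the variance of the dependent variable being
studied is explained by the independent variable.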

I analysed the indices of these cities and came to the conclusion that, among
them, the cleanest city to live in is Helsinki, Finland (Pollution Index 13,34).
It was very interesting to analyse the values with the chi-square test and
regression analysis.
In this project I gained good knowledge of using these two statistical methods.
I also learned that my city, Almaty, is among the more polluted cities
(Pollution Index 78,74).
