3 RD Literature Paper
Abstract: Prevention and control of air pollution has become an essential activity in many cities. Industries and heavy vehicular
traffic pollute city air to unacceptable levels, which greatly affects human health. Forecasting, predicting and controlling
air pollution is the need of the hour to protect human beings from health hazards. Air pollution poses threats not only to
humans but also to all flora and fauna. The prime objective of this paper is to propose a new method to predict air pollution
using data collected on a monthly basis and to provide recommendations to prevent and control air pollution.
This research work comprises two phases. The first phase preprocesses the chosen dataset using Python coding. The
second phase analyzes the preprocessed data to predict air pollution levels. The proposed method is applied to a Kaggle dataset
containing monthly air pollution data collected over the period 2000 to 2010. Predictions for a future month are made by
computing the Air Quality Index (AQI) metric and a threshold value for the previous two months. The proposed method
shows acceptable accuracy in performance.
Index Terms: Air Pollution, Air Quality Index, Analysis, Pollution Forecasting, Prediction, Prevention, Control.
IJSTR©2020
www.ijstr.org
INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME 9, ISSUE 01, JANUARY 2020 ISSN 2277-8616
Aggregation of Air Quality Index:

The proposed work is carried out in two phases. Phase I computes the AQI (Air Quality Index) value, which is used in Phase II
to make the prediction. Phase II makes the prediction based on the AQI values and the computed Threshold value.
Phase I
Step 1: Pre-processing
Step 2: AQI Value Computation
Phase II
Step 1: Analysis of New AQI
In this proposed work, data sets for the years 2000 to 2010 are collected from the Kaggle website and preprocessed using big
data analytics and Python coding. After preprocessing, AQI values of NO2, CO, SO2, O3 are computed on a monthly basis.
Then the New AQI is calculated using formula (1) for every month in the years from 2000 to 2010.
Threshold Calculation
The Threshold value is computed as the average of all the AQI values of the chosen month.
Prediction
Prediction for the chosen month is made by comparing the actual AQI value with the Threshold value. The Threshold
value is compared with each average of the previous two months' AQI values. An average value less than the Threshold
value indicates the absence of air pollution. An average value greater than or equal to the Threshold value indicates the
presence of air pollution. The process is repeated for all the months in a year.
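The threshold and prediction steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `compute_threshold` and `predict_pollution`, the day-by-day pairing of the two previous months, and the sample values are all assumptions made for the example.

```python
from statistics import mean


def compute_threshold(aqi_values):
    """Threshold T: the average of all AQI values of the chosen month."""
    return mean(aqi_values)


def predict_pollution(prev_month_1, prev_month_2, threshold):
    """Return one flag per day: True means air pollution is predicted.

    For each day, the average A of the two previous months' AQI values
    is compared with the threshold T. A >= T indicates presence of
    pollution; A < T indicates absence.
    """
    predictions = []
    for value_1, value_2 in zip(prev_month_1, prev_month_2):
        average = (value_1 + value_2) / 2
        predictions.append(average >= threshold)
    return predictions


# Hypothetical AQI values for four days (not taken from the paper).
april = [4.0, 5.0, 3.0, 6.0]
february = [3.0, 5.5, 2.0, 7.0]
march = [4.0, 4.5, 3.0, 5.0]

threshold = compute_threshold(april)
flags = predict_pollution(february, march, threshold)
print(threshold, flags)
```

Keeping the threshold computation separate from the comparison makes it easy to repeat the procedure for every month of a year, as the method requires.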
4 RESULT AND DISCUSSION
Table 1 shows the difference between the calculated and actual AQI values for a given month. The threshold value is 4.38,
which is computed from the actual AQI values. The threshold value is compared with each day's average value over the
previous two months. The presence or absence of air pollution is determined by comparing the difference value with the
threshold value.
Threshold = 4.38
It is evident from the data shown in Table 1 that, for the given month (for example, April), the Difference value is less
than the Threshold value for 10 days. Hence, it is concluded that air pollution is not present on those days and is present
on the remaining 20 days.
TABLE 1
PREDICTION OF AIR POLLUTION FOR A MONTH BASED ON AQI VALUE
Day | AQI Actual Value (April) | February | March | Average (A) | Difference (T - A) | Prediction: if (T - A) >= 0 or (T - A) < 0
Confusion Matrix
The confusion matrix holds information on the actual class and the predicted class. The performance of the proposed work is
evaluated using the data in the matrix.
Table 2. Actual Class and Predicted Class - Illustration
Actual: TRUE | FALSE
Accuracy Rate:
Accuracy Rate is the proportion of the total number of predictions that are correct, that is, the true positives (TP) and
true negatives (TN) out of all predictions. It is determined by the following equation.
Classification Accuracy Rate = (TP + TN) / (TP + TN + FP + FN)
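As a hedged illustration of the accuracy rate, the sketch below counts the four confusion-matrix cells from paired actual and predicted labels and applies the standard definition, in which the correct predictions are the true positives and true negatives. The helper names `confusion_counts` and `accuracy_rate` and the sample labels are assumptions for the example, not data from the paper.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN from paired boolean labels.

    True means "air pollution present"; False means "absent".
    """
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1  # pollution present and predicted
        elif not a and not p:
            tn += 1  # pollution absent and not predicted
        elif not a and p:
            fp += 1  # predicted, but actually absent
        else:
            fn += 1  # missed: present, but not predicted
    return tp, tn, fp, fn


def accuracy_rate(tp, tn, fp, fn):
    """Proportion of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return (tp + tn) / total


# Hypothetical labels for five days (not taken from the paper).
actual = [True, True, False, False, True]
predicted = [True, False, False, True, True]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn, accuracy_rate(tp, tn, fp, fn))
```

The error rate mentioned in the conclusion would follow as one minus this value, under the same assumptions.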
5 CONCLUSION
Air pollution is dangerous for nature as well as for human
beings. Prediction and remedial action are the need of the
hour. In this research work, the data set chosen from the Kaggle
website is first preprocessed to separate the pollutant parameters
NO2, CO, SO2, O3. The prediction of air pollution is performed
in two phases. The first phase computes AQI (Air Quality
Index) values for all the days in a month. The second phase
computes the threshold value of AQI as an average of the previous
months' average AQI values. Air pollution for the days in the
chosen month is predicted by comparing the threshold value
with the average of the previous two months' values. Big data
analytics is used to handle huge data volumes, and Python
coding is used to implement the computational procedures.
Prediction accuracy and error rate are computed. The results
are found to be encouraging. Further research work is in
progress to include other environmental parameters.
ACKNOWLEDGMENT
This article has been written with the financial support of the
RUSA-Phase 2.0 grant sanctioned vide Letter No. F.24-51/2014-U,
Policy (TN Multi-Gen), Dept. of Edn., Govt. of India,
Dt. 09.10.2018.