
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976-6480 (Print), ISSN 0976-6499 (Online), Volume 5, Issue 6, June (2014), pp. 23-26, IAEME











DATA MINING PREDICTION USING DATA MINING EXTENSIONS (DMX):
A CASE STUDY ON E-GOVERNANCE BIRTH REGISTRATION DATA
MINING MODEL


Pushpal Desai
(M.Sc. (I.T.) Programme, VNSGU, Surat, India)




ABSTRACT

This work discusses the implementation of Data Mining Extensions (DMX) queries on various Data
Mining Models. In the last few years, many private companies have used Data Mining extensively
for prediction analysis. In the same spirit, this paper discusses the implementation of DMX
prediction queries on Data Mining Models built from e-governance data. The results derived from
the DMX prediction queries indicate that prediction analysis could be used by administrators for
future planning and decision making.

KEYWORDS: Data Mining Extensions (DMX), Prediction Query, Microsoft SQL Server Analysis
Services.

I. INTRODUCTION

Data Mining has been successfully implemented in several domains such as Banking, Insurance,
Credit Card Fraud Detection, Loan Approval, Customer Relationship Management, Weather
Forecasting, Oil and Gas Exploration, Mining, Network Security, Telecommunication, Medical
Science, etc. Depending upon the problem, different Data Mining approaches such as Clustering,
Classification, Association Rules Mining, Time Series Analysis, Regression and Sequence Analysis
are used. Besides data mining algorithms, Data Mining Extensions (DMX) has been successfully
applied in areas such as a heart disease decision support system using data mining classification
modeling techniques [6], risk assessment of complications of arterial high blood pressure [7],
and prediction of control strategies for industrial processes [8]. Similarly, in this work, DMX
is used on a Birth Registration e-governance data mining model.


INTERNATIONAL JOURNAL OF ADVANCED RESEARCH
IN ENGINEERING AND TECHNOLOGY (IJARET)


ISSN 0976 - 6480 (Print)
ISSN 0976 - 6499 (Online)
Volume 5, Issue 6, June (2014), pp. 23-26
IAEME: http://www.iaeme.com/IJARET.asp
Journal Impact Factor (2014): 7.8273 (Calculated by GISI)
www.jifactor.com


II. METHODOLOGY

In this work, Data Mining is used to make predictions based on different Data Mining Models.
The Data Mining Extensions (DMX) language is specially designed to work with Microsoft SQL Server
Analysis Services. DMX can be used to create a new data mining model, train it, browse it and
predict from it [1]. There are two main types of DMX statements. Data definition statements
create new data mining structures and models and drop existing ones. Data manipulation statements
work with existing data mining models and structures, and allow browsing of and prediction from
them [2]. In this work, DMX data manipulation statements are used to make predictions from
existing data mining models.

A DMX prediction query can take the form of a "Prediction join", "Natural prediction join",
"Empty prediction join" or "Singleton query" [3]. In this work, only Empty prediction join
queries are implemented; "Prediction join", "Natural prediction join" and "Singleton query" are
not considered. Typically, a prediction query predicts unknown column values for cases supplied
from a data source [3]. An empty prediction join, by contrast, passes no information to the input
columns of the mining model, and the model returns the most likely prediction based on its
content alone [3], [5].

The Predict function is used to predict the most likely state of Delivery Method ID, and the
PredictProbability function is used to obtain the probability of individual states from the data
mining model. The Association Rules model for Birth Registration e-governance data contains input
fields such as Religion, Father Education, Mother Education and Year, with Delivery Method ID and
Delivery Attention ID as predict-only fields. In DMX Query 1.1, the Association Rules model for
Birth Registration e-governance data is used to predict the most likely Delivery Method ID state.
Besides the most likely outcome, data owners are often also interested in the probability of the
other states of a particular attribute. In this scenario, the PredictProbability function can be
used [4]. In the same query, PredictProbability returns the probability of each state: Delivery
Method = 1 for Caesarean, Delivery Method = 2 for Forceps/Vacuum and Delivery Method = 3 for
Natural.

DMX Query 1.1
SELECT
  Predict([AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict].[Delivery Method ID]) AS [Delivery Method ID],
  PredictProbability([AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict].[Delivery Method ID], 1) AS [Method 1: Caesarean],
  PredictProbability([AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict].[Delivery Method ID], 2) AS [Method 2: Forceps/Vaccum],
  PredictProbability([AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict].[Delivery Method ID], 3) AS [Method 3: Natural]
FROM [AM_ReligionID_FatherEducationID_Input_DevliveryMethodPredict]

Similarly, in DMX Query 1.2, the Association Rules mining model is used to predict the most
likely Delivery Attention ID together with the probability of each of its states. In the same
query, the PredictProbability function returns the probability of each state: Delivery
Attention = 1 for Doctor, Nurse or Trained Midwife, Delivery Attention = 2 for
Institutional-Government, Delivery Attention = 3 for Institutional-Private or Non-Government,
Delivery Attention = 4 for Relatives or Other, and Delivery Attention = 5 for Traditional Birth
Attendant.



DMX Query 1.2
SELECT
  Predict([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID]) AS [Delivery Attention ID],
  PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 1) AS [Method1:Doctor, Nurse or Trained Midwife],
  PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 2) AS [Method2:Institutional-Government],
  PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 3) AS [Method3:Institutional-Private or Non-Government],
  PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 4) AS [Method4:Relatives or Other],
  PredictProbability([Asso_FT_MT_EDU_DEL_METHOD].[Delivery Attention ID], 5) AS [Method5:Traditional Birth Attendant]
FROM [Asso_FT_MT_EDU_DEL_METHOD]
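
DMX Query 1.1 and DMX Query 1.2 share one template: a Predict call on the target column, one PredictProbability call per state, and a FROM clause that names only the model. The Python sketch below generates such a query as text. It is an illustration only and not part of the original study. The helper names `bracket` and `empty_prediction_query` are hypothetical, and rejecting identifiers that contain "]" is a simplifying assumption rather than a documented DMX rule.

```python
from typing import Mapping


def bracket(name: str) -> str:
    """Wrap an identifier in square brackets, as the queries in this paper do."""
    if "]" in name:
        # Assumption: we refuse such names instead of guessing an escape rule.
        raise ValueError(f"unsupported identifier: {name!r}")
    return f"[{name}]"


def empty_prediction_query(model: str, column: str, states: Mapping[int, str]) -> str:
    """Build the text of an empty prediction join (no input cases are supplied)."""
    target = f"{bracket(model)}.{bracket(column)}"
    # First select item: the most likely state of the target column.
    select_items = [f"Predict({target}) AS {bracket(column)}"]
    # One further item per state: the probability of that particular state.
    for state, label in states.items():
        select_items.append(f"PredictProbability({target}, {state}) AS {bracket(label)}")
    return "SELECT\n  " + ",\n  ".join(select_items) + f"\nFROM {bracket(model)}"


# Rebuild the shape of DMX Query 1.2 from the model name and state labels given above.
query = empty_prediction_query(
    "Asso_FT_MT_EDU_DEL_METHOD",
    "Delivery Attention ID",
    {
        1: "Method1:Doctor, Nurse or Trained Midwife",
        2: "Method2:Institutional-Government",
        3: "Method3:Institutional-Private or Non-Government",
        4: "Method4:Relatives or Other",
        5: "Method5:Traditional Birth Attendant",
    },
)
print(query)
```

The generated text would still have to be submitted to Analysis Services by a client of the reader's choice; the sketch only assembles the query string.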

III. RESULTS

The data mining prediction queries were executed on the data mining models using the Predict
and PredictProbability functions. The Predict function returns a predicted value, or a set of
values, for a specified column, and the PredictProbability function returns the probability of
a specified state. In both DMX queries, a scalar column is passed to the Predict function, so
its result is also a scalar value [4].
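
To make the relationship between the two functions concrete, the following Python sketch mimics them on a plain table of state counts. It is a simplified illustration under stated assumptions: the counts are hypothetical and are not the birth registration data, the function names `predict` and `predict_probability` are ours, and Analysis Services may estimate probabilities differently from raw frequencies.

```python
def predict(state_counts: dict) -> int:
    """Return the state with the highest count, i.e. the most likely state."""
    if not state_counts:
        raise ValueError("no states to predict from")
    return max(state_counts, key=state_counts.get)


def predict_probability(state_counts: dict, state: int) -> float:
    """Return the share of cases that fall in the given state."""
    total = sum(state_counts.values())
    if total == 0:
        raise ValueError("no cases to estimate a probability from")
    return state_counts.get(state, 0) / total


# Hypothetical counts per state ID, chosen only for illustration.
counts = {1: 20, 2: 5, 3: 75}

most_likely = predict(counts)
print(most_likely, predict_probability(counts, most_likely))
for state in sorted(counts):
    print(state, predict_probability(counts, state))
```

In this simplified view, the value returned by Predict is the state for which PredictProbability is largest, which is how the two kinds of column in each query result relate to one another.
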
DMX Query 1.1 predicted 3 as the most likely value of the Delivery Method ID attribute. The
result indicates that the most likely delivery method is Natural, with a probability of 0.77.


Fig 1: The result of DMX Query 1.1

DMX Query 1.2 predicted 3 as the most likely value of the Delivery Attention ID attribute. The
result indicates that the most likely delivery attention method is Institutional-Private or
Non-Government, with a probability of 0.54.


Fig 2: The result of DMX Query 1.2


IV. CONCLUSION

This work demonstrates the use of DMX queries for making predictions from existing data mining
models. The predictions derived from DMX queries can be used by top-level management for
planning and decision making. The work presented in this paper covers only a small part of what
DMX queries can do. In future, other DMX query types such as "Prediction join", "Natural
prediction join" and "Singleton query" can be considered to take fuller advantage of DMX and
to extend the research.

V. ACKNOWLEDGEMENT AND LIMITATIONS

All results are based on data provided by the municipal corporation for research purposes
only. Hence, the results may change if the data mining algorithms and DMX queries are applied to
actual data sets.

VI. REFERENCES

[1] Data Mining Extensions (DMX) Reference, SQL Server 2012 Books Online, Microsoft.
[2] http://technet.microsoft.com/en-us/library/ms132058.aspx, Last access date: 15th April, 2014.
[3] http://technet.microsoft.com/en-us/library/ms131992.aspx, Last access date: 15th April, 2014.
[4] Jamie MacLennan, ZhaoHui Tang and Bogdan Crivat, Data Mining with SQL Server 2008,
Wiley Publication.
[5] Brian Larson, Delivering Business Intelligence with Microsoft SQL Server 2008.
[6] Sellappan Palaniappan and Rafiah Awang, "Web-Based Heart Disease Decision Support
System using Data Mining Classification Modeling Techniques", in the proceedings of
iiWAS2007, pp. 157-167.
[7] MBUYI MUKENDI Eugène, KAFUNDA KATALAYI Pierre and MBELU MUTOBA Bevi,
DATA MINING AND NEURAL NETWORKS II DMX USE FOR RISK ASSESSMENT OF
COMPLICATIONS OF ARTERIAL HIGH BLOOD PRESSURE, IJCSI International Journal
of Computer Science Issues, Vol. 9, Issue 5, No 1, September 2012, ISSN (Online):
1694-0814, pp. 377-383.
[8] Waldemar and Konrad, The use of Data Mining Approach to Predict Control Strategies for
Industrial Process, Automatuka, 2007, pp. 287-293.
[9] P.N. Santosh Kumar, Dr. C. Venugopal and Dr. C. Sunil Kumar, Applications of Data Mining
in Medical Databases, International Journal of Computer Engineering & Technology
(IJCET), Volume 4, Issue 6, 2013, pp. 284-289, ISSN Print: 0976-6367, ISSN Online:
0976-6375.
[10] Vijay Arputharaj J and Dr. R. Manicka Chezian, Data Mining with Human Genetics to
Enhance Gene Based Algorithm and DNA Database Security, International Journal of
Computer Engineering & Technology (IJCET), Volume 4, Issue 3, 2013, pp. 176-181,
ISSN Print: 0976-6367, ISSN Online: 0976-6375.
[11] Chaitrali S. Dangare and Dr. Sulabha S. Apte, A Data Mining Approach For Prediction of
Heart Disease using Neural Networks, International Journal of Computer Engineering &
Technology (IJCET), Volume 3, Issue 3, 2012, pp. 30-40, ISSN Print: 0976-6367, ISSN
Online: 0976-6375.
