Application of Visual Data Mining in Higher-Education Evaluation System

2009 First International Workshop on Education Technology and Computer Science

Hanjun Jin, Tianzhen Wu, Zhiliang Liu, Jianlin Yan
(Department of Computer Science, Huazhong Normal University, Wuhan)
wutianzhen1984@163.com

Abstract

In this paper, we design a Visual Data Mining (VDM) model by combining visualization technology with data mining algorithms and apply it to a higher-education evaluation system. It provides a visual environment in which users can participate in each step of the evaluation. In addition, hidden information can be mined visually through the mining function of the system, which helps to guide further work.

Keywords: Visualization, Data mining, Higher-education evaluation

Ⅰ. Introduction

Higher-education evaluation is a way to assess the education quality, management level, and other aspects of a higher-education institution. It is also an important way to improve education quality and increase the benefit of education. The traditional evaluation method responds passively, by reviewing work and sorting out materials[1]. Database and data mining technologies provide strong support for education evaluation. Data mining is a process of extracting potential and valuable knowledge from data. Visualization presents data in graphic form and gives the observer a quantitative way to understand the information hidden in the data[2]. The aim of visualization is to enhance cognitive ability, to establish a feedback loop between users and graphs, and to use human knowledge to avoid faulty observations and unwise decisions or actions[3].

Visualization in data mining is a typical interactive form. In this way, details can be observed, and intelligibility and reliability are increased. Visual data mining technology used in a higher-education evaluation system can make the evaluation method more flexible, more diverse, and more visual, and thus improve its efficiency.

Ⅱ. Research progress of visual data mining at home and abroad

To address existing problems in the field of visual data mining, foreign researchers have done a great deal of work on visualization methods, standards for mining languages, and mining systems, and many visual data mining systems have been developed, such as SAS/EM of SAS, Clementine of SPSS, and Intelligent Miner of IBM. However, all of them are expensive and have not proved very effective in our country. It is therefore of great value to develop a visual data mining system suited to the higher-education system of our country and to study the related algorithms.

In this paper, we introduce the architecture and design of a visual data mining model that we developed ourselves, and then use some real data to discuss the feasibility and outlook of applying it in the education system.

Ⅲ. Architecture and work station of VDM

The main aim of this model is to carry out education evaluation in a visual environment. The system emphasizes user participation in, and guidance of, the mining process. Visualization offers strong analytical ability for revealing hidden patterns and trends.

Supported by the Natural Science Foundation of Hubei Province (No. 2007ABA328).
Hanjun Jin, male, professor, doctor, research field: Data mining, Genetic algorithm and Virtual reality.
Tianzhen Wu, female, Postgraduate, research field: Data mining,

978-0-7695-3557-9/09 $25.00 © 2009 IEEE
DOI 10.1109/ETCS.2009.285

Figure 1. Architecture of VDM

As shown in Figure 1, this is a multi-level architecture based on component and module technology, comprising an application level, a middle level (WEB level and middleware level), and a data level. In this model, the data level, WEB level, and middleware level reside in different environments, and newly developed algorithms can be added to the mining system directly without recompiling.

Ⅳ. Main functions and technical features

A. Data access in VDM

First, the system interactively selects data items through the data link and pretreatment modules. These items take part in interactive visual sorting; some of them need to be pretreated (treatment, compression, and transformation of defective or uncertain data) before they are displayed in a view. This part adopts a client/server model. It accesses relational and multi-dimensional data through interfaces such as ODBC, ADO, and OLE DB, and selects the data automatically according to the source and target relations defined by the original data. The accessible data sources are databases and formatted data files such as Access, Excel, and SQL Server.

B. Data mining algorithm in VDM

We have implemented several algorithms in the system, including association rules, decision trees, and clustering. The system also allows users to modify the existing algorithms and to add new ones, provided that the newly added algorithms are supplied in the form of a DLL. The mining algorithms are written in C++ because it is more stable.

C. Visualization technology in VDM

Visualization technology focuses on how to present large amounts of multi-dimensional data on a screen in a way that makes it convenient for users to discover the information within. At present, many feasible visualization technologies are applied in data mining, such as pixel technology, parallel coordinates technology, and magic eye technology. Different technologies are applied at different steps of data mining, such as visualization of pretreatment, visualization of the mining process, and visualization of the mining result. In this system, we adopt the following visualization technologies.

1) Column graphs. Column graphs can show a comparison among different elements or the state of one element at different times. In a column graph, the discrete field is arranged along the horizontal axis and the continuous values along the vertical axis, and the columns are drawn vertically. A very common application is analyzing the variation of a continuous value over discrete time units. For example, the variation in the number of students of a higher-education institution over time can be shown in a column graph, as in Figure 2.

Figure 2. Column graph

2) Scatter graph. A scatter graph maps every record of the data items to a graphic entity in a 2-dimensional or 3-dimensional coordinate system. It is a very common method in data visualization and is usually used to discover and evaluate relationships between cause and effect. For example, in our system, a 2-dimensional scatter graph can express the relationship between the number of students and the pass rate of CET4, as shown in Figure 3.

Figure 3. Scatter graph

3) Parallel coordinates. Parallel coordinates is one of the earliest data visualization technologies; it expresses an n-dimensional space in 2 dimensions. Its main idea is to map the n-dimensional space onto a 2-dimensional plane with n parallel, equally spaced coordinate axes, each of which corresponds to one attribute. The value range of an attribute is distributed uniformly along its axis (nominal attributes are arranged along the axis in turn), so every record in the database can be converted into graphic form and expressed as a broken line across the n parallel axes. Its advantage is that it expresses data relationships visually and is easy to understand without any vectors or other visual icons. In our system, each evaluation index can be represented as one dimension of the parallel coordinates. Parallel coordinates can be used to discover the internal associations among evaluation indexes and to guide the further work of the institution. An example is the association rules among the number of students, entry score, pass rate of CET4, and pass rate of the postgraduate education examination, as shown in Figure 4.

Figure 4. Parallel coordinates

Ⅴ. Application example

A. Explanation of data source

Generally speaking, the pass rates of CET4 and CET6, GPA, and the placement of graduates are important evaluation indexes in higher-education evaluation. We therefore take a database of graduate information from a higher-education institution as the data source; it contains nearly 5000 records. After data integration, we obtain the data table structure shown in Table 1.

Table 1. Data table of graduates' information

Field name       Field type   Value and range
Name             Char(32)     John.Smith,…
Number           Int(8)       200641796,…
Department       Char(40)     Chinese,
Score of CET4    Int(4)       0~100
Score of CET6    Int(4)       0~100
GPA              Int(4)       0~100
Ranking          Int(8)       0~200
Placement        Char(16)     Dispatch, further study, undetermined

B. Pretreatment of data

In VDM, the data pretreatment module of the relevant algorithm is used before a mining model is constructed, and all data in continuous fields must be converted into discrete values. In this example, the data source is pretreated by the pretreatment function of the association rule algorithm as follows:

Score of CET4: high (>79), middle (60~79), low (<60)
Score of CET6: high (>79), middle (60~79), low (<60)
GPA: high (>79), middle (60~79), low (<60)
Ranking: former (<20), middle (20~50), hind (>50)

C. Construction and training of mining model

We input the converted transaction database and the support threshold determined by experts in education evaluation, construct a mining model of association rules among the fields of the graduates' information with the association rule algorithm in VDM, and train the model with the data in the graduate information management database. The result is shown in Figure 5. In the figure, 5 black, equally spaced vertical axes indicate the attributes concerned, and each horizontal connecting line (architrave) linking the square points and round points on the vertical axes indicates an association rule. The round point indicates the antecedent of the rule and the square point indicates the consequent. The color information expresses the value of
support and confidence. For each horizontal connecting line (architrave), the color of the round point gives the value range of support and that of the square point gives the value range of confidence. The deeper the color of the point, the larger the value. As can be seen in Figure 5, the fourth line expresses that a great number of graduates who got high scores in CET4 and CET6 have gone on to further study; that is, the association rule

score of CET4.high ^ score of CET6.high => further study

has a higher confidence.

Figure 5. Association rules expressed with parallel coordinates

Ⅵ. Conclusion

Visualization technology is very useful for discovering interesting knowledge and its related characteristics. In this paper, we have designed an experimental model of visual data mining based on a multi-level architecture and applied it to a higher-education evaluation system. The model is based on Windows 2003; the mining algorithms are written in Visual C++ 6.0, and the visualization uses the OpenGL interface. The model has the following features:

1. Visual. It supports visualization of the data, the mining process, and the mining result.
2. Interactive. It makes full use of the human role in the process of data mining. Users can select the areas they are interested in and analyze them in depth.
3. Integrative. It integrates the system with the database, with information manipulation and analysis, and with every mining task and algorithm.

At present, there are still a few problems in VDM. For instance, it can only deal with discrete data, which requires users to have a good understanding of the system and the data. Experts in higher-education evaluation also need the help of data mining researchers to complete the task of model construction. These problems still need to be solved.

References

[1] Guan Lijuan, Research and Implementation of Higher Education Evaluation System Based on Computer Network, Computer Education, 563-564
[2] Jiawei Han, Micheline Kamber. Data Mining: Concepts and Techniques [M]. Morgan Kaufmann Publishers, 2000.
[3] Tom Soukup, Ian Davidson, Visual Data Mining: Techniques and Tools for Data Visualization and Mining [M]. Beijing, Publishing House of Electronics Industry, 2004
[4] Wang Jiacai, Chen Qi, VISMiner: An Interactive Visual Data Mining Prototyped System [J], Computer Engineering, 2003.1:17-18
[5] Wang Xiaotong, Du Fang, On Visualized Simulation and Its Application, Computer Engineering, 1998.8:20-21
[6] Gen Xuehua, Fu Desheng, Research on Visual Data Mining Technique, Computer Applications and Software, 2006.2:85-87
[7] Dong Liyan, Liu Guangyuan, Visual Data Mining Techniques, Journal of Jilin University (Information Science Edition), 2006.11:567-569
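
As an illustration of the pretreatment and rule-evaluation steps in Section Ⅴ, the following minimal Python sketch discretizes graduate records with the thresholds given in Section Ⅴ.B and computes the support and confidence of the rule CET4.high ^ CET6.high => further study. It is not the VDM implementation, which is written in C++. The sample records, the field keys (cet4, cet6, gpa, ranking, placement), and all function names are assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, List

# Hypothetical sample records (not data from the paper).
RECORDS: List[Dict[str, object]] = [
    {"cet4": 85, "cet6": 82, "gpa": 88, "ranking": 5, "placement": "further study"},
    {"cet4": 90, "cet6": 80, "gpa": 75, "ranking": 30, "placement": "further study"},
    {"cet4": 81, "cet6": 84, "gpa": 70, "ranking": 60, "placement": "dispatch"},
    {"cet4": 65, "cet6": 55, "gpa": 62, "ranking": 45, "placement": "dispatch"},
    {"cet4": 50, "cet6": 40, "gpa": 58, "ranking": 120, "placement": "undetermined"},
]


def score_level(value: int) -> str:
    """Discretize a 0~100 score: high (>79), middle (60~79), low (<60)."""
    if value > 79:
        return "high"
    if value >= 60:
        return "middle"
    return "low"


def ranking_level(value: int) -> str:
    """Discretize a ranking: former (<20), middle (20~50), hind (>50)."""
    if value < 20:
        return "former"
    if value <= 50:
        return "middle"
    return "hind"


def to_transaction(record: Dict[str, object]) -> FrozenSet[str]:
    """Convert one record into a set of discrete items (a transaction)."""
    return frozenset({
        f"CET4.{score_level(record['cet4'])}",
        f"CET6.{score_level(record['cet6'])}",
        f"GPA.{score_level(record['gpa'])}",
        f"Ranking.{ranking_level(record['ranking'])}",
        f"Placement.{record['placement']}",
    })


def support(transactions: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


transactions = [to_transaction(r) for r in RECORDS]
antecedent = frozenset({"CET4.high", "CET6.high"})
consequent = frozenset({"Placement.further study"})

rule_support = support(transactions, antecedent | consequent)
rule_confidence = confidence(transactions, antecedent, consequent)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In the figure described in Section Ⅴ.C, these two values correspond to the color of the round point (support) and of the square point (confidence) of one rule line. A full miner would also enumerate candidate itemsets against the expert-chosen support threshold, which this sketch omits.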
