
2011 Seventh International Conference on Computational Intelligence and Security

Pattern Recognition with Spatial Data Mining in web:


An infrastructure to engineering of the urban cadaster
André Fabiano de Moraes
Department Information Technology - IT
Institute Federal of Science Technology, IFC
Camboriú SC, Brazil
e-mail: ecv3afm@ecv.ufsc.br

Lia Caetano Bastos
Department Civil Engineering - GeoEngineering
Federal University of Santa Catarina, UFSC
Florianópolis SC, Brazil
e-mail: ecv1lcb@ecv.ufsc.br

Abstract - Problems of data interoperability have become constant in recent times, mainly due to the steady emergence of different CAD/GIS (Computer Aided Design/Geographic Information Systems) structures produced by engineers, managers and other professionals through the fast development of collaborative technologies on the web. However, structures are also detected for manipulating objects that determine the cognitive domain of a problem, especially in the construction of urban or rural properties and land management. To this end, the study presents an investigation into the relevant aspects that influence engineering projects. It also investigates the demands for detecting predefined structures for agile actions and the exchange of experiences, identifying factors critical to development and assisting in the recognition of integration structures. The discussion also presents the results extracted with the framework, which provides webmapping applications through the implementation of collaborative strategies and combines the main techniques used in geographical data mining. Finally, the article discusses priority issues in the implementation of spatial data mining and the advancement of civil engineering technology for knowledge management.

Keywords: Infrastructure; Collaboration; Interoperability; Spatial Data Mining; Free Software

I. INTRODUCTION
Given the different technologies used in the engineering process, and especially GIS as a tool for decision making, there is a strong belief within public institutions in the spatial analysis process. The lack of tools to facilitate integration and interoperability between different sources of geographic data has, however, hampered these processes, as has the discontinuation of projects requiring the technical knowledge of professionals, resulting in the reduction of resources that usually affects institutions [1].

This paper presents two contributions intended to reduce the demand of users arising from the lack of solutions. First, the fundamental aspects of an interoperable spatial data infrastructure are investigated, and the requirements for preparing the framework through the Multicriteria Decision Support-Constructivist (MCDA-C) are also assessed. The second contribution presents a case study using the framework to detect data structures of cognitive objects and to recognize geographical patterns [2].

A. Reasons for the problem domain with MCDA-C
The structuring of the decision support aims to build a structure that is accepted by the actors shaping a reality. For Bana e Costa [3], the process of decision support is initially an open system with specific components: the actors, values, objectives, actions and characteristics. The activity of decision support can then be seen as a process of interaction with an ill-structured problem, in which the elements and their relationships emerge in a more or less chaotic manner.

Thus, the concept of MCDA provides four dimensions that are necessary for the existence of a problem: the existence of a dissatisfaction, the existence of a proponent, sufficient importance to merit the effort of resolution and, finally, the existence of a possible solution. In the constructivist paradigm, the objective of the models is the generation of knowledge for decision makers.

To define the dimensions to be included in the building of a strategic model, an elementary point of view (EPV) of each is initially presented, which represents a set of minimum specifications for the structure and functioning of the method. From the results obtained with this method structure, in line with the proposals of the Thematic Group Geo-coding Electronic Government [4], the INDE, and also with the specifications of standard spatial data proposed by OGC [5], it was possible to support the development of the Fundamental Point of View (FPV) and Elementary Point of View (EPV) as detailed descriptors, which allowed us to understand the requirements and prioritize the development of a strategic environment for integration and collaboration.

B. Elaboration of Requirements for Descriptors
Descriptors to assist in identifying the individual profile of existing technologies (solutions) were constructed in conjunction with users and decision makers to meet each elementary point exactly, enabling the formulation of a tree with all of the key points of view. Thus, it was possible to select 18 descriptors, as follows: D1 - Hardware; D2 - Network; D3 - Portability Operating Systems; D4 - Programming Language; D5 - Databases; D6 - Browsing Interface; D7 - OGC Standard; D8 - ISO Standard; D9 - Libraries; D10 - Exchange of data; D11 - Metadata; D12 - Ontologies; D13 - Integration and relationship; D14 - Query processing; D15 - Statistical; D16 - Data Mining; D17 - Collaboration; and D18 - Organizational Routines.

These points, in turn, provide strong indicators for the planning and construction of essential facilities and innovative new frameworks. For this purpose, graphics are given that allow for the visualization of conflicts between the descriptors and their forecast for future events. To validate the model and obtain the desired confidence about building the profile of an individual impact, the descriptors were also

978-0-7695-4584-4/11 $26.00 © 2011 Crown Copyright
DOI 10.1109/CIS.2011.296
evaluated by an analysis of the robustness and overall sensitivity, individualizing each elementary point built.

The equation for the sensitivity analysis deals with the change in the replacement rates as a function of the change in the rate of substitution of one criterion. The sum of the replacement rates adopted by the multicriteria model discussed in this paper is equal to 1. All of the replacement rates should have a value between 0 and 1:

$$1 > w_i > 0 \quad \forall i$$

For the overall evaluation results that are obtained, we investigated the elements VG1(i3Geo), VG2(CartoWeb), VG3(AlovMap), VG4(MapStraction) and VG5(SpringWeb):

$$V_G(a) = w_1 v_1(a) + w_2 v_2(a) + \dots + w_n v_n(a), \quad \text{i.e.,} \quad V_G(a) = \sum_{i=1}^{n} w_i v_i(a)$$

$V_G(a)$: overall value of the action $a$
$v_i(a)$: partial value of the action $a$ on criterion $i$
$w_i$: replacement rate of criterion $i$
$n$: number of criteria in the model

To calculate the new replacement rates of the model after modifying one rate, we used the following equation:

$$w_n' = \frac{w_n \, (1 - w_i')}{1 - w_i}$$

where $w_i$ and $w_i'$ are the original and modified rates of the criterion being changed, and $w_n$ and $w_n'$ are the original and recalculated rates of another criterion.

Checks of the sensitivity of each elementary point (EPV) are performed as illustrated, and changes in individual rates, beyond the actions sensitive to this variation, undergo a single analysis. Thus, it was possible to establish a flexible framework for diagnosing the reflection of evolutionary changes and the technological point of view on each elementary feature for reuse.

II. ARCHITECTURE OF THE FRAMEWORK
After the structuring of the requirements, a technology infrastructure is critical for establishing technical criteria that are feasible for implementation.

This flexibility is in contrast to other types of information services, which require specific environments and software to enable approximations between the institutions through dynamic mechanisms of communication, as in Fig. 2.

Figure 2: Layered definition of the technological platform for structuring the framework.

Sorting operational links between the client, server, geographic objects and operations in GIS requires a drill through the metadata that enables communication between different distributed services. For this strategy, the implementation of ontologies becomes a fundamental alternative to the reading and identification of conceptual structures, in particular spatial patterns, designed to assist in organization, integration between disparate systems, and supporting systematic isolation.

A. Preparation of the Core Framework
This preparation is usually performed during the coding of the corresponding framework, called Integration Collaborative Geospatial Framework Web (OpenICGFw-2.7), taking into account the earlier discussions and the primary requirements obtained for the core of the framework.

For the main repository of the framework, the PostgreSQL database was initially adopted, in particular for its automated features for handling spatial data. The possibility of developing applications on three tiers (client, server and database) also allowed for the appearance of various webmapping systems. The architecture of the project was documented through numerous artifacts containing specifications in the Unified Modeling Language (UML), together with the OMT-G specification for the geographic extent [6].

B. Infrastructure of Providing Interoperability for Spatial Data
To promote the integration of data through the OpenICGFw framework, some situations are initially restricted in this paper, so that a simple way to integrate the information is specified [7], generating updated knowledge for collaboration, planning and coordination of joint actions between public and private institutions. Some challenges, however, must still be faced, such as data stored on workstations and not on a particular server, essential data and products that are not cataloged, and a lack of physical infrastructure and technology, limiting the storage of large amounts of data.

These challenges require the development of systems for integration and collaboration to address problems inherent in architecture, engineering, construction and facilities management. For this, there are implementations with a free and open source solution for managing such data. However, to identify the processing of input data, metadata are needed to clarify the characteristics of one or more data sets and their storage, either local or remote. From this diagnosis, it is possible to proceed to the question of how to perform other procedures in addition to the new modules, as in Fig. 3.

One of the important differences of this work is the module for the exchange of spatial data, which aims to provide access to these resources through a web interface in a simplified form for the user.

This paper allows for the specification of a simple way to integrate information and generate knowledge that is updated for collaboration, planning and coordination of joint actions between public and private institutions. Currently, however, some challenges remain to be addressed.
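
As an illustration of the sensitivity analysis described in Section I.B, the additive value model and the rescaling of the replacement rates can be sketched in Python as follows. This is a minimal sketch and not the authors' implementation: the function names, the three rates and the partial values are hypothetical and are not taken from the evaluation of the five webmapping tools.

```python
from typing import List, Sequence


def overall_value(weights: Sequence[float], partial_values: Sequence[float]) -> float:
    """Additive model: V_G(a) = sum over i of w_i * v_i(a)."""
    if len(weights) != len(partial_values):
        raise ValueError("one partial value is needed per criterion")
    return sum(w * v for w, v in zip(weights, partial_values))


def rescale_rates(weights: Sequence[float], i: int, new_wi: float) -> List[float]:
    """Set the rate of criterion i to new_wi and rescale every other rate with
    w_n' = w_n * (1 - w_i') / (1 - w_i), so that the rates still sum to 1."""
    if not 0.0 < new_wi < 1.0:
        raise ValueError("the new rate must lie strictly between 0 and 1")
    factor = (1.0 - new_wi) / (1.0 - weights[i])
    return [new_wi if n == i else w * factor for n, w in enumerate(weights)]


# Hypothetical replacement rates and partial values for one alternative.
weights = [0.5, 0.3, 0.2]
values = [80.0, 40.0, 60.0]

base = overall_value(weights, values)         # overall value with the original rates
new_weights = rescale_rates(weights, 0, 0.6)  # raise the first rate from 0.5 to 0.6
shifted = overall_value(new_weights, values)  # overall value after the change
```

Repeating the last two steps over a range of values for one rate, and for each alternative, gives the kind of curve from which the robustness of a ranking can be read.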

Figure 3: Approximation procedures for interoperability of spatial data through the framework.

An alternative that reduces the production of spatial data without defined standards is a centralized storage and cataloging structure that enables distributed management. For this particular challenge, it is crucial to identify in advance the validity of data in the process of generating metadata.

From these metadata, we obtain the identification of previously known structures, significantly reducing the loss of any unique feature. Thus, significant progress is achieved in the process of interoperability between spatial data.

Data integration is directly linked to the exchange of data between the different models. To convert the data, one solution explored copies the original data for processing and generates a new set of data, creating opportunities to run complex computational processes with a specific notation in accordance with the OGC and W3C. Thus, communication between data is treated as in Fig. 4.

Figure 4: Scheme to interoperate with cognitive objects.

Several aspects of this situation favor the already consolidated storage of the data being transmitted to a specific repository. Initially, however, the OpenICGFw integration framework keeps a repository on the server, which provides the link between the aforementioned steps using file server technology.

Even with the completion of the reading, processing and storage process, the need arose for the standardization of data conversions to the framework, resulting in the development of a specific notation for the conversion of full projects.

To carry out the implementation of this notation within the framework, we designed a program in C++ that contains specific features of a parser, which is responsible for reading and writing files with the XML/GML encoding and can thus transform the data.

C. Providing Geo-Semantic Web - OWL
The proposed geo-semantic standards for the interoperability of data from the framework consider two aspects of implementation.

One aspect is syntactic and the other semantic. The syntactic aspects are addressed in the implementations of XML (eXtensible Markup Language) for the notation of the framework, WSDL (Web Services Description Language) for the interface of web services, and SOAP (Simple Object Access Protocol) for the message format of web services. At the end of the scheme, OWL (Web Ontology Language) is used for the concentration of the topics [8]. For the semantic aspects, functions, combinations of translations and optimization tasks are evaluated.

As an ontology is an explicit specification of a context [9], the need for customization to a specific semantic interoperability of spatial data led to the creation of a vocabulary for the information management of rural and urban properties.

This allowed the exchange of content between different computer systems through a shared ontology, as in Fig. 5.

Figure 5: Ontology proposal for mapping the cognitive objects of the Urban Technical Cadastre.

After the data were identified through cataloging and storage (metadata), the specifications of the created ontology (utc-owl) were aggregated. Queries about the cognitive objects [10] are executed against it, mainly using the SQL language, in addition to the processes and techniques of spatial data mining discussed in [11].

III. CASE STUDY
The case study presents a summary of several experiments conducted to assess the implementation of the modules within the OpenICGFw framework, mainly

focusing on the manipulation of spatial data for the territorial area of the city of Itajaí, State of Santa Catarina, Brazil.

The data were obtained from different sources and were subject to periodic updates. The Brazilian Institute of Geography and Statistics (IBGE) [12] data were obtained from census data, freely available in the (.shp) format. In the City of Itajaí [13], the data were obtained in part through the Urban Technical Cadastre (UTC). Partial data of the mapping of rural areas, such as the topography, hydrography, and elevation, were obtained from the Company for Agricultural Research and Rural Extension of Santa Catarina (EPAGRI) [14]. Through the National Institute for Space Research (INPE) [15], satellite imagery was obtained to monitor natural phenomena and their temporality. Finally, environmental engineering data, such as vegetation and native forests, were obtained from the Ministry of Environment (MMA) [16]. After identifying the area observed in the case study, we present a survey on key issues with data and metadata.

In the first evaluation, the aspects of different sources for integration with the geographic information system (S1) of IBGE/Estatcart were analyzed, allowing for the crossing of statistics from this system with other sources close to the OpenICGFw framework (f).

In the second evaluation, the recovery, conversion and import of data sets of the scheme relating to the registration information sheet (S2) of Itajaí/BIC were performed, making possible the intersection of spatial data, where (fn) is obtained from (S1) ∩ (S2).

Even after the union of the EPAGRI data (d1) with the MMA data (d2) and the INPE data (d3), it was possible to generate the function (fn) over {(d1) ∪ (d2) ∪ (d3)}.

For the experiment on (S1) ∩ (S2), all records of (S1) were selected without loss or discard of data. For (S2), 49,413 records of members of the original database were selected. Of these, 87 records were discarded because they had null values and therefore did not meet the standards, resulting in a total of 49,325 records.

The main point of this experiment was to detect hitherto unknown knowledge, assessing the conservation status of urban and rural properties in the city (S2) from the data presented. After implementing the mining process on the spatial data (S1) ∩ (S2), we obtained an automatic generation of the rules.

Adopting the classification technique, modifications were made in the source code of the C4.5 algorithm, so that its implementation can be tailored on the server and its results can be viewed directly in the web browser. The experiment allowed for the creation of a derivation of C4.5 for the web, giving rise to a new algorithm that was initially called OWC4.5-2011 (open web classification). The aim was to build the decision tree through dynamic structures.

The capability of processing queries within the OpenICGFw framework on a set of metadata should also be emphasized, as it extends the process to other sources and returns the existing designs to represent the spatial dimension requested. However, there are also the dimensions contained in different data formats, such as the extensions (.shp), (.pdf), (.tiff) and (.dwg).

A. Advances with standardization of structures
At this point, a gap to be filled becomes apparent, one that goes beyond the structuring of spatial data. Especially for Brazil, it concerns an adequate spatial infrastructure for the urban technical cadastre (UTC) that enables the portability of previously established standards for data interchange, providing interaction among various geographic information systems.

Through a data infrastructure for the Brazilian system of urban technical cadastre, as in Fig. 6, the aim is to reduce missing information as well as conflicts related to property rights and land subdivision, besides other technical aspects that are observed by public managers.

Figure 6: UML OMT-G class diagram for the integration structure.

In the study carried out for research and final implementation, the main classes of CNEFE (the IBGE National Address Cadastre for Statistical Purposes) are illustrated and integrated into the proposed model of spatial data infrastructure for the multipurpose cadastre. Beyond the attributes that belong to the classes, each problem domain and its relationships are standardized, generating specific themes that are portable to other purposes and making the cadastral model more flexible and truly multipurpose.

B. Results
After the object recognition of the case study and the application of the association algorithm, new rules were found for the revaluation of 15% of the rural and urban properties in "good" use condition.

However, the possibility of identifying clusters of cognitive objects that can contain and provide information about a region mainly assists in the reuse of a project, as it reduces high costs, adds standards to the process of collecting and preparing the data, and makes them available for the development of new projects in a collaborative manner, as illustrated in Fig. 7.
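
The classification experiments in this case study build on C4.5. As an illustration only, the following Python sketch computes the gain-ratio criterion that standard C4.5 uses to rank candidate split attributes when growing a decision tree. It is not the OWC4.5-2011 code, whose modifications are not detailed in this paper, and the attribute names and the six property records are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows: List[Dict[str, str]], attribute: str, target: str) -> float:
    """Information gain of splitting on `attribute`, divided by the split information."""
    groups: Dict[str, List[str]] = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Weighted entropy of the target inside each branch of the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Entropy of the attribute itself penalizes splits with many small branches.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical cadastral records: two candidate attributes and a condition class.
rows = [
    {"zone": "urban", "occupied": "yes", "condition": "good"},
    {"zone": "urban", "occupied": "yes", "condition": "good"},
    {"zone": "urban", "occupied": "no", "condition": "poor"},
    {"zone": "rural", "occupied": "no", "condition": "poor"},
    {"zone": "rural", "occupied": "yes", "condition": "good"},
    {"zone": "rural", "occupied": "no", "condition": "poor"},
]

ratios = {name: gain_ratio(rows, name, "condition") for name in ("zone", "occupied")}
best = max(ratios, key=ratios.get)  # attribute chosen for the root split
```

In this toy data set the "occupied" attribute separates the two classes perfectly, so it is selected for the root node; each path from the root to a leaf of the finished tree can then be read as one rule.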

Figure 7: Individualized evaluation of the rules through maps (cartogram).

Another concern relates to the interpretation of the results obtained by the process of data mining, which aims to provide for the visualization of the processes of association, classification or clustering of data that is mainly stored in different formats and documentations.

Finally, for analyses of high complexity, the display module also provides a spatial representation through maps (cartogram), allowing for access to different queries without the presence of experts with technical knowledge, bringing decision-makers an innovative mechanism for monitoring and auditing projects.

IV. CONCLUSIONS
Through the case study, the primary processes responsible for the exchange of spatial data were established, and during the experiments, new needs and consequent improvements in the source code of the core of the framework were implemented.

Significant advances were achieved through an extensive series of experiments, which mainly used ranking, association and clustering algorithms. However, more significant results were obtained with the classification method, reaching the goals initially set for the paper.

Surveying the needs for identification and adaptation algorithms stands out as the greatest contribution of this work, enabling the creation of rules through a new algorithm for the analysis of cognitive objects toward the spatial representation. Through research, it became possible to test the applicability of spatial data mining resources on the web. This test provided engineering departments with innovative collaborations within the OpenICGFw framework, mainly assisting in the investigation of large volumes of data and the reuse of existing structures and knowledge. However, there are factors that prevent collaboration and the dissemination of spatial data within the institution. From this work came new initiatives and experiments. This work also highlighted the need for improvements, including new libraries and increased compatibility for integration with the extensive list of tools commonly adopted in the engineering laboratories of public institutions. During the development of algorithms for rule extraction, several problems were found that will be discussed in more detail in future work, including the processes of research on ontologies for rule extraction and identification of spatial patterns.

ACKNOWLEDGMENT
The authors acknowledge the Graduate Program in Civil Engineering at the Federal University of Santa Catarina (UFSC) for the opportunity and the structure to support the project [200739557].

REFERENCES
[1] D. Steudler, M. Törhönen, FLOSS in Cadastre and Land Registration, Food and Agriculture Organization of the United Nations (FAO), Rome, Italy, 2010, pp. 4-49.
[2] G. M. Neunzert, Subdividing the Land: Metes and Bounds and Rectangular Survey Systems, CRC Press, 2010.
[3] C. A. Bana e Costa, Processo de apoio à decisão: actores e acções; estruturação e avaliação, Publicação CESUR, v. 618, 31, 1993.
[4] eGOV - Governo Eletrônico, 2010, Available from: <http://www.governoeletronico.gov.br>
[5] OGC - Open Geospatial Consortium, Inc., 2009, Available from: <http://www.opengeospatial.org>
[6] K. A. V. Borges, C. A. Davis Jr., A. H. F. Laender, OMT-G: An Object-Oriented Data Model for Geographic Applications, Geoinformatica, vol. 5, no. 3, 2001, pp. 221-260. doi:http://dx.doi.org/10.1023/A:1011482030093.
[7] P. V. Oosterom, S. Zlatanova, Creating Spatial Information Infrastructures: Towards the Spatial Semantic Web, CRC Press, 2008.
[8] W3C Standards Web Semantic. Available from: <http://www.w3.org/standards/semanticweb/>
[9] W3C Geospatial Vocabulary. Available from: <http://www.w3.org/2005/Incubator/geo/XGR-geo-20071023/>
[10] C. Jeang-Kuo, C. Wen-Ting, Appending mining data of spatial object for query, Granular Computing, 2009, GRC '09. IEEE International Conference on, pp. 53-56, 17-19 Aug. 2009, doi: 10.1109/GRC.2009.5255165
[11] H. Jin, B. Miao, The research progress of spatial data mining technique, Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference on, vol. 3, pp. 81-84, 9-11 July 2010, doi: 10.1109/ICCSIT.2010.5564659
[12] IBGE - Instituto Brasileiro de Geografia e Estatística, Geociências, 2009, Available from: <http://www.ibge.gov.br/servidor_arquivos_geo/>
[13] PREFEITURA Municipal de Itajaí - Mapas, 2010, Available from: <http://www.itajai.sc.gov.br/mapas.php>
[14] EPAGRI - Empresa de Pesquisa Agropecuária e Extensão Rural de Santa Catarina - Mapas Digitais, 2010, Available from: <http://ciram.epagri.sc.gov.br/mapoteca/>
[15] INPE - Instituto Nacional de Pesquisas Espaciais - Imagens de Satélite, 2010, Available from: <http://www.dgi.inpe.br/CDSR/>
[16] MMA - Ministério do Meio Ambiente do Brasil. Mapas Interativos - Geoprocessamento, 2009, Available from: <http://www.mma.gov.br>
