
ARTICLE IN PRESS

Computers & Geosciences 33 (2007) 465–475


www.elsevier.com/locate/cageo

Spatial estimation model of porosity


B. Tutmez a,*, Z. Hatipoglu b
a Inonu University, Department of Mining Engineering, 44280 Malatya, Turkey
b Mersin University, Department of Geological Engineering, Çiftlikköy, 33342 Mersin, Turkey
Received 25 April 2006; received in revised form 10 July 2006; accepted 18 July 2006

Abstract

This paper presents a spatial estimation model that uses a fuzzy clustering algorithm and assesses aquifer porosity based on the point cumulative semimadogram (PCSM) measure. To obtain the estimated porosity values, the model employs the standard regional dependence function (SRDF), which provides weights for different regional locations depending on their distances from the reference site. The proposed methodology has three stages: (1) structure identification; (2) spatial dependence measure; and (3) interpolation. The model has been tested using a real data set taken from an aquifer in Turkey. The performance evaluations indicate that the new methodology can be applied in geology-based domains.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Aquifer porosity; Fuzzy clustering; Point semimadogram; Standard weighting

Corresponding author. Tel.: +90 422 3410010; fax: +90 422 3410046.
E-mail address: btutmez@inonu.edu.tr (B. Tutmez).

0098-3004/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cageo.2006.07.008

1. Introduction

The acquisition of groundwater requires expensive operations. For that reason, determining the most productive drilling location and depth is important for reducing cost. The productivity of wells depends mainly on the porosity and hydraulic conductivity of the aquifer materials. Porosity indicates the groundwater storage capacity, and hydraulic conductivity determines the groundwater flow capacity of the aquifer. Together, these two properties determine the water production of the aquifer. Porosity is defined as the ratio of the non-solid volume to the total volume of the material, and its value ranges between 0 and 1. It can also be expressed in percent by multiplying the fraction by 100. There are several ways to determine porosity. It can be measured in the laboratory with an argon or mercury porosimeter, which is a widely used technique, or by pressure tests or geophysical well logs (Taud et al., 2005). Porosity is also determined by means of X-ray computerized tomography (Akin et al., 2000; Clausnitzer and Hopmans, 1999).

In general, geological systems have anisotropic and heterogeneous characteristics due to their natural occurrence. These properties make spatial modelling of regionalized variables difficult. Heterogeneity is a qualitative characteristic denoting that the properties observed at different sites do not have the same value (Şen, 1998). Although the classical semivariogram technique provides a measure of spatial dependence among a multitude of
466 B. Tutmez, Z. Hatipoglu / Computers & Geosciences 33 (2007) 465–475

locations, it has practical difficulties in heterogeneous systems, as discussed by Şen (1989). To eliminate the drawbacks of the classical semivariogram, Şen (1989) proposed a new measure, the point cumulative semivariogram (PCSV), which has been applied in different areas (Şen, 1998; Şen and Habib, 1998; Şen and Habib, 2000; Tarawneh and Şahin, 2003).

According to this technique, the PCSV for a reference (pivot) location is obtained by ranking the squared differences and then taking summations.

In the present paper, we study the modelling of aquifer porosity using a fuzzy clustering based point cumulative semimadogram (PCSM) measure, which is a new spatial estimation approach. The use of fuzzy set theory in engineering systems offers many advantages. Fuzzy systems provide the possibility of integrating (logical) information processing with the attractive mathematical properties of general function approximators (Tutmez et al., 2006). The most attractive characteristic of fuzzy algorithms is their ability to describe complex multivariable problems in a transparent and effective way (Setnes et al., 1998). Because of these advantages, we used fuzzy clustering in the first step for grouping data and identifying the model structure. The information obtained from the clustering is then used directly in the spatial measure and interpolation.

The proposed model consists of three main stages: (1) structure identification; (2) measuring the spatial variability; and (3) standard weighted interpolation. The model is developed using real-world data on the water sources in the Mersin region of southern Turkey. The performance evaluations showed that the proposed model is very transparent and has a high prediction capacity.

The paper is organized as follows: Section 2 describes how porosity is modelled based on a weighting function. Section 3 gives comprehensive information on the proposed spatial model. Section 4 presents the case study for estimating the aquifer porosity values from the spatial coordinates. In addition, the performance evaluation of the model is given in the last part of this section. Section 5 concludes the paper.

2. Problem and estimation method

The true value of the aquifer parameters is mostly unknown until it has been measured. For example, the determination of the porosity is often difficult and requires extreme caution and laboratory work. Therefore, it is estimated using the available data. The problem of spatial estimation considered in this study can simply be formulated as follows: given a region sampled at n locations $x_i$ with values of porosity $p(x_i)$, $i = 1, \ldots, n$, what are the values $p(x_k)$ at unsampled locations $x_k$, $k = 1, \ldots, N$?

In any optimization method, the main concept is that the estimation at any location is considered as a weighted average of the measured values at the site. The weighting factor is related to the distance between the pivot (reference) and each location, in addition to the porosity values measured at the locations. There is an inverse relation between the weighting factor and distance. As a general hypothesis, if the distance between data values is small, the data values are close to each other. On the other hand, a great distance implies dissimilar data values.

The main interest of the paper is evaluating the regional dependence of porosity. For regional prediction, consider n irregularly scattered measurement locations with a point a, where the regional estimation of the porosity is desired. Supposing that the measurements at different locations are denoted by $P_i$ ($i = 1, 2, 3, \ldots, n$) and the reference site under investigation by $P_r$, the following weighted average expression is used:

\[ P_r = \frac{\sum_{i=1}^{n} W(d_{i,k})\, P_i}{\sum_{i=1}^{n} W(d_{i,k})}, \qquad (1) \]

where $W(d_{i,k})$ is the weighting factor between the location at i and the reference location at k that corresponds to the distance $d_{i,k}$, and $P_i$ is the porosity value at i. In order to reflect the spatial dependence behavior of the phenomenon, regional covariance and semivariogram functions are among the early alternatives for the weighting functions that take into account the spatial correlation of the phenomenon considered (Şen and Şahin, 2001). The regional covariance function requires a set of assumptions such as the Gaussian distribution of the regionalized variable. Similarly, the semivariogram function does not always yield a clear pattern of the regional correlation structure. In order to eliminate these drawbacks, the cumulative semivariogram function has been proposed by Şen (1989).

3. A spatial model for porosity estimation

The construction of the proposed estimation model proceeds in three main stages: (1) structure identification (fuzzy clustering); (2) measuring the
spatial variability (cumulative semimadogram analysis); and (3) standard weighting and interpolation (Fig. 1).

Fig. 1. Flow chart of the proposed model.

3.1. Structure identification

Data clustering is a method which can be employed for describing the adjacent sites. By this method, data are classified and the adjacent locations used for the estimations are determined. Clustering algorithms are used extensively not only to organize and categorize data, but also for data compression and model construction. For numerical data one assumes that the members of each cluster bear more mathematical similarity to each other than to members of other clusters. One of the simplest similarity measures is the distance between pairs of feature vectors in the feature space. If one can determine a suitable distance measure and compute the distance between all pairs of observations, then one may expect that the distance between points in the same cluster will be considerably less than the distance between points in different clusters (Ross, 2004). Clustering techniques are validated on the basis of the following assumptions (Jang et al., 1997):

• similar inputs to the target system to be modelled should produce similar outputs;
• these similar input–output pairs are bundled into clusters in the training data set.

Assessment of the quantity of an aquifer is an uncertain process. As a soft computing technique, fuzzy set methodology provides a systematic basis for quantifying the uncertainties resulting from vagueness. Therefore, the fuzzy c-means (FCM) clustering technique (Bezdek et al., 1984), which is the most popular fuzzy clustering algorithm, was used to identify the structure.

Fuzzy clustering is used to partition sample points into subgroups which are characterized by cluster centers. Each data point belongs to a cluster center with a degree which is determined by the membership grade. First, the input and output variables of the fuzzy model are chosen. Second, a regression matrix X and an output vector y are constructed from the data set:

\[ X = [x_1, \ldots, x_n]^T, \qquad y = [y_1, \ldots, y_n]^T. \qquad (2) \]

Following the parameter selection, the data set is partitioned in the Cartesian product space $X \times y$ by using FCM clustering. In the clustering algorithm, both the data coordinates and the measured value (porosity) should be represented, and clustering must be carried out in three-dimensional space. This point is very important because the variables used in porosity estimation are characterized by their spatial distribution. The FCM clustering algorithm partitions 3-dimensional vectors into c fuzzy clusters. The algorithm minimizes an objective function based on distance measures between cluster centers and data points. The fuzzy partition matrix satisfies the probabilistic property

\[ \sum_{i=1}^{c} \mu_{ik} = 1, \quad \forall k = 1, \ldots, N. \qquad (3) \]

For partitioning a collection of N data points into c classes, we define an objective function $J_m$ for a fuzzy c-partition,

\[ J_m(U, v) = \sum_{k=1}^{N} \sum_{i=1}^{c} (\mu_{ik})^m (d_{ik})^2, \qquad (4) \]

where $m \in [1, \infty)$ is a weighting exponent and $\mu_{ik}$ is the membership of the kth data point in the ith class. The term $d_{ik}$ is a Euclidean distance measure (in 3-dimensional feature space, $R^3$) between the kth sample data $x_k$ and the ith cluster center $v_i$, given by

\[ d_{ik} = d(x_k - v_i) = \lVert x_k - v_i \rVert = \left[ \sum_{j=1}^{3} (x_{kj} - v_{ij})^2 \right]^{1/2}. \qquad (5) \]
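The clustering stage can be illustrated with a minimal pure-Python sketch of the FCM iteration. It alternates the standard center and membership updates that minimize the objective in Eq. (4), using the Euclidean distance of Eq. (5). The function name `fcm`, the random initialization, the stopping tolerance, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Fuzzy c-means sketch. Returns (centers, u), where u[i][k] is the
    membership of point k in cluster i. Requires m > 1."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalized so each point sums to 1 (Eq. 3).
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= s
    centers = []
    for _ in range(iters):
        # Center update: membership-weighted mean of the data points.
        centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * points[k][j] for k in range(n)) / total
                            for j in range(dim)])
        # Membership update from the distance ratios.
        change = 0.0
        for k in range(n):
            # Clip distances to avoid division by zero when a point sits on a center.
            d = [max(math.dist(points[k], centers[i]), 1e-12) for i in range(c)]
            for i in range(c):
                new = 1.0 / sum((d[i] / d[p]) ** (2.0 / (m - 1.0)) for p in range(c))
                change = max(change, abs(new - u[i][k]))
                u[i][k] = new
        if change < tol:
            break
    return centers, u


# Toy (x, y, porosity) vectors forming two well-separated groups (illustrative only).
pts = [(0.05, 0.05, 0.03), (0.06, 0.04, 0.04), (0.04, 0.06, 0.03),
       (0.25, 0.20, 0.30), (0.26, 0.21, 0.29), (0.24, 0.19, 0.30)]
centers, u = fcm(pts, c=2)
labels = [max(range(2), key=lambda i: u[i][k]) for k in range(len(pts))]
print(labels)
```

Assigning each point to its highest-membership cluster, as in the last lines, is one simple way to obtain the crisp groups within which neighbouring locations are later searched.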
Cluster means (prototypes, written $v_i$ in Eq. (5)) and elements of the membership matrix are computed as follows:

\[ c_i = \frac{\sum_{k=1}^{N} \mu_{ik}^m x_k}{\sum_{k=1}^{N} \mu_{ik}^m}, \qquad (6) \]

and

\[ \mu_{ik} = \frac{1}{\sum_{p=1}^{c} (d_{ik}/d_{pk})^{2/(m-1)}}. \qquad (7) \]

3.2. Point cumulative semimadogram

Earth science data have spatial correlation that cannot be handled with classical statistical techniques. The spatial variability in any phenomenon within an area can be measured by comparing the relative change between two locations. Two numerical values z(x) and z(x+h) at two points x and x+h separated by the vector h are spatially correlated. As the distance between these values increases, one would expect that the spatial correlation decreases and vice versa. This correlation, modelled by the squared difference V(d), represents the relative change in the best possible way:

\[ V(d) = (Z(x) - Z(x+h))^2. \qquad (8) \]

In general, variance and correlation techniques which are used to quantify the degree of regional variability cannot account correctly for regional dependence caused by either irregularity of sampling positions or non-normal distribution functions. In order to eliminate these drawbacks, the semivariogram measure was proposed. The semivariogram function is effective especially for regular data points. However, in practice, measurement locations are mostly irregularly spaced.

On the other hand, the cumulative experimental semivariogram (CESV) is a simple but effective way of assessing regional heterogeneous behaviours, proposed by Şen (1989) as an extension of the classical semivariogram. The CESV technique is the successive summation of the semivariograms for irregularly spaced distances. The CESV function is expressed as follows:

\[ \gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [Z(x) - Z(x+h)]^2, \qquad (9) \]

where $\gamma(h)$ is the CESV value at distance h and N(h) is the total number of equally spaced observations.

For this study, the PCSM function is used. Madograms are particularly useful for establishing the range parameter (Deutsch and Journel, 1998). The point semimadogram (PSM) has been proposed by Tutmez (2005). This function is similar to the point semivariogram (PSV); instead of squaring the difference between $Z_m$ and $Z_{m+h}$, the absolute difference is taken. If the experimental variogram includes outlier values, the PSM is more convenient than the PSV due to the advantages of the absolute difference measure (Tutmez, 2005). The PCSM measure can be obtained from data by executing the following steps:

(a) calculate the distance between the concerned location and the remaining locations. If there are N locations, the number of different distances is N − 1, $h_i$ ($i = 1, \ldots, N-1$).
(b) for each pair (pivot and any other location), find half of the absolute difference between the data values, in this case the porosity values. In this way, each distance will have its half absolute difference.
(c) plot distances versus the corresponding successive cumulative sums of the half absolute differences. By using this procedure, a non-decreasing function which is the sample PCSM at the pivot location is obtained. Its mathematical expression is given as

\[ \gamma(h_i) = \frac{1}{2} \sum_{i=1}^{N-1} \lvert Z_c - Z_i \rvert, \qquad (10) \]

where $\gamma(h_i)$ is the PCSM value; $Z_c$ and $Z_i$ are the porosity values at the pivot location and the other adjacent locations, respectively.
(d) apply the previous steps by considering different pivot locations, to give N sample PCSMs.

3.3. Standard weighting and estimation

It is mentioned in Section 3.2 that calculation of a sample PCSM leads to a non-decreasing function with distance. In this section, the standard regional dependence function (SRDF) (Şen and Şahin, 2001; Tarawneh and Şahin, 2003) is applied. The SRDF provides weights for different regional locations depending on the distance from the pivot location. This function value is calculated using the following steps:

• find the maximum PCSM value ($\gamma_m$), which is taken at the greatest distance ($d_m$),
• divide all the PCSM values by $\gamma_m$; the result is a scaled form of the sample PCSM, with values between zero and one, and
• subtract the dimensionless PCSM values from one at each distance. The resulting function, which is non-increasing with distance because the PCSM is non-decreasing, is named the SRDF.

In the final stage, each porosity value is multiplied by the corresponding standard weight and the contributions for each location are calculated. For a pivot location, the estimated porosity value is obtained by dividing the total contributions by the total standard weights.

4. Case study

In this section, in order to evaluate the regional and functional dependence of aquifer porosity, fuzzy clustering based PCSM analysis is employed to examine the structural and spatial distribution of the process. In addition, the performance of the proposed methodology is compared with the performance of an established model (PCSV) in the literature.

4.1. Geological and hydrogeological setting

The basin fill took place from the Carboniferous to the Quaternary. Outcrops of the Karahamzausagi Formation occur with a small surface area to the northwest of Tarsus (Fig. 2). Ophiolitic rock which contains limestone blocks is situated northwest of Mersin and appears in valleys. The prevailing rock type is sedimentary rock of Tertiary age. The Karaisali, Guvenc, Kuzgun and Handere Formations deposited in this age include intercalations of sandstones, siltstones, conglomerates, limestones, claystones, marl and gypsum (Senol et al., 1998). Lying above all is caliche and secondary calcium carbonate deposition. These formations form the low productive Hillside Aquifer (Hatipoglu, 2004). The study area is located on the south coast of this basin and is comprised of Neogene sediments of a fan-delta type alluvial deposition system. It is characterized by a flat or gentle surface and forms the most productive Coastal Aquifer in the area.

Fig. 2. Locality map of study area (Tutmez et al., 2006).

4.2. Study area and sampling

In this study, the coastal area between the cities of Mersin and Tarsus is investigated. This area is located in the southern part of Turkey and contains agricultural, industrial and settlement areas. Groundwater is used as a main source of water in this region. Because groundwater is widely used for water supplies, efficient groundwater management is important for this area, and requires an understanding of groundwater storage capacity.

In order to estimate the porosity distribution of the area, 32 well logs were used. These wells were drilled by the General Directorate of State Hydraulic Works and the General Directorate of Rural Services between 1957 and 1997. The location of the wells was measured with a Garmin Etrex GPS (Fig. 3). In our study, well logs were split into 1 m depth intervals and, according to the description of the unit on the well logs, effective porosity (pe) values were assigned to these units from Spitz and Moreno (1996). Some pe values used in the data set are given in Table 1. To estimate the effective porosity distribution of the aquifer between 0 and 20 m elevation, the average porosity value was calculated for this depth.

Fig. 3. Map of study area showing location of wells.

Table 1
pe ranges in Spitz and Moreno (1996) and pe values used in the data set

Unit     pe range (Spitz and Moreno, 1996)   pe value used
Gravel   0.15–0.37                           0.30
Sand     0.08–0.4                            0.35
Silt     0.004–0.05                          0.04
Clay     0.007–0.05                          0.03

4.3. Data clustering

In the first stage of the application, the data set was partitioned by using the FCM clustering algorithm. In pattern recognition, it is often suggested that the data should be appropriately normalized before clustering (Jain and Dubes, 1988). The values were scaled by using a linear transformation between 0.03 and 0.3. The data set Z to be clustered is formed by combining X and p:

\[ Z = [X; p]^T, \qquad (11) \]

where T denotes the transpose of the matrix, X is the input matrix (x, y), and p is the porosity vector.

Determination of the optimal number of clusters is an important step in clustering. If the number of clusters is unknown, various methods (Pal and Bezdek, 1995; Kaymak and Babuska, 1995) might be employed to find a suitable number of clusters. In this study, a novel cluster validity approach proposed by Tutmez (2005), especially for evaluating geological data, is used. It is based on reproducing the variability of the sample data in the values of the cluster centers with the minimum number of clusters, as follows:

\[ \text{Minimize } n_c \text{ under } \mathrm{Std}[p(x)] \approx \mathrm{Std}[p(c)], \qquad (12) \]

where $n_c$ is the optimal number of clusters and Std is the standard deviation of porosity. The number of clusters was determined experimentally by using the FCM clustering under (12). The appropriate number of clusters turned out to be three (Fig. 4). Fig. 5 shows the outcome of this operation.

4.4. Measuring the spatial dependence

In this stage, the spatial variability is modelled by the PSM function. Functional analyses were


carried out based on the distance measures between pivot locations and other locations within the same clusters. The calculated distances for each cluster are summarized in Tables 2–4. The PCSM values were calculated using the porosity values at the corresponding distances indicated in Tables 2–4. Fig. 6 shows the experimental PCSM and PCSV structures for location no. 25. The PCSM graph has a smoother structure than the PCSV graph. In addition, contrary to the PCSV figure, there is no drastic change in the PCSM figure.

Fig. 4. Cluster validity application.

Fig. 5. Clustered data set.

4.5. Weights and estimations

In the last step, the weights for different regional locations depending on the distance from the pivot location have been calculated by the SRDF. With this algorithm, once the SRDF has been evaluated for, say, location no. 25, the weightings of the other locations with respect to the pivot location (no. 25) can be obtained easily. The sixth column in Table 5 includes the SRDF weighting for location no. 25, which can also be taken from the graph in Fig. 7.

Both of the graphs shown in Fig. 7 explain the spatial dependence in terms of distance. In Fig. 7, the closest location to the pivot contributes the highest weight, and the furthest ones contribute the lowest weights. The SRDF graph of the PCSM has a smoother structure than that of the PCSV due to the use of absolute values. In other words, the PCSM function is a more suitable approximation tool than the PCSV for this application.

In order to obtain the estimated values, spatial interpolations have been carried out for each location using Eq. (1). The seventh column in Table 5 is the porosity contribution, which was calculated by multiplying the porosity by the SRDF values as weights. For location no. 25, substitution of these values in Eq. (1) leads to the estimation of porosity as 0.418/7.004 = 0.060 (see Table 5).

The same methodology was applied to the PCSV and the outcomes of this application are presented in Table 6. As can be seen from Table 6, in the PCSV model the extreme values (0.113 and 0.146) result in a drastic increase in contribution. On the other hand, the effects of these extreme values in the proposed model were more limited due to the use of absolute differences. Note that the extreme porosity value (0.300) at location 14, the furthest location from the pivot, has a negative effect on the estimations carried out by the PCSV due to the squared differences. As an example of this, although location no. 31 is one of the

Table 2
Dimensionless distances for cluster 1

No. x y 1 2 8 9 10 11 12 15 21 24 30 32

1 0.114 0.132 0.000 0.082 0.066 0.031 0.108 0.103 0.139 0.071 0.083 0.055 0.132 0.056
2 0.104 0.061 0.082 0.000 0.047 0.065 0.067 0.116 0.146 0.032 0.091 0.050 0.109 0.103
8 0.075 0.079 0.066 0.047 0.000 0.056 0.043 0.090 0.117 0.065 0.066 0.069 0.079 0.100
9 0.124 0.103 0.031 0.065 0.056 0.000 0.092 0.083 0.121 0.052 0.061 0.054 0.115 0.081
10 0.055 0.042 0.108 0.067 0.043 0.092 0.000 0.092 0.107 0.092 0.074 0.106 0.048 0.142
11 0.104 0.065 0.103 0.116 0.090 0.083 0.092 0.000 0.039 0.116 0.025 0.133 0.074 0.158
12 0.085 0.045 0.139 0.146 0.117 0.121 0.107 0.039 0.000 0.151 0.061 0.169 0.071 0.194
15 0.133 0.073 0.071 0.032 0.065 0.052 0.092 0.116 0.151 0.000 0.092 0.037 0.128 0.093
21 0.102 0.069 0.083 0.091 0.066 0.061 0.074 0.025 0.061 0.092 0.000 0.109 0.068 0.137
24 0.123 0.106 0.055 0.050 0.069 0.054 0.106 0.133 0.169 0.037 0.109 0.000 0.145 0.058
30 0.046 0.030 0.132 0.109 0.079 0.115 0.048 0.074 0.071 0.128 0.068 0.145 0.000 0.176
32 0.111 0.163 0.056 0.103 0.100 0.081 0.142 0.158 0.194 0.093 0.137 0.058 0.176 0.000

Table 3
Dimensionless distances for cluster 2

No. x y 3 4 5 14 17 18 20 25 27 31

3 0.192 0.193 0.000 0.085 0.136 0.245 0.151 0.160 0.108 0.145 0.160 0.150
4 0.169 0.116 0.085 0.000 0.065 0.281 0.153 0.099 0.128 0.081 0.099 0.220
5 0.189 0.057 0.136 0.065 0.000 0.274 0.131 0.056 0.126 0.022 0.055 0.265
14 0.258 0.162 0.245 0.281 0.274 0.000 0.202 0.294 0.158 0.272 0.295 0.305
17 0.300 0.098 0.151 0.153 0.131 0.202 0.000 0.117 0.082 0.117 0.118 0.219
18 0.237 0.044 0.160 0.099 0.056 0.294 0.117 0.000 0.146 0.039 0.002 0.261
20 0.234 0.134 0.108 0.128 0.126 0.158 0.082 0.146 0.000 0.125 0.147 0.210
25 0.209 0.049 0.145 0.081 0.022 0.272 0.117 0.039 0.125 0.000 0.038 0.265
27 0.235 0.043 0.160 0.099 0.055 0.295 0.118 0.002 0.147 0.038 0.000 0.263
31 0.290 0.300 0.150 0.220 0.265 0.305 0.219 0.261 0.210 0.265 0.263 0.000
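
As a numerical check on Table 3, the sketch below recomputes the row for location 25. The tabulated values appear to be Euclidean distances in the scaled three-dimensional (x, y, porosity) space of Eq. (5); this is an inference from the numbers rather than something the text states. The coordinates are those of Table 3 and the porosity values are those listed for the same wells in Table 5. The names `wells` and `distance_row` are hypothetical.

```python
import math

# location: (x, y, porosity); x and y from Table 3, porosity from Table 5.
wells = {
    3: (0.192, 0.193, 0.066),
    4: (0.169, 0.116, 0.037),
    5: (0.189, 0.057, 0.057),
    14: (0.258, 0.162, 0.300),
    17: (0.300, 0.098, 0.113),
    18: (0.237, 0.044, 0.031),
    20: (0.234, 0.134, 0.146),
    25: (0.209, 0.049, 0.058),
    27: (0.235, 0.043, 0.031),
    31: (0.290, 0.300, 0.030),
}


def distance_row(pivot, wells):
    """Return {location: 3-D Euclidean distance to the pivot location}."""
    p = wells[pivot]
    return {loc: math.dist(p, q) for loc, q in wells.items()}


row = distance_row(25, wells)
# List the neighbours of the pivot from nearest to furthest.
for loc in sorted(row, key=row.get):
    print(loc, round(row[loc], 3))
```

Under this assumption, sorting by distance gives the neighbour ordering used for location 25 in Tables 5 and 6, with location 14 furthest from the pivot largely because of its high porosity value.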

Table 4
Dimensionless distances for cluster 3

No. x y 6 7 13 16 19 22 23 26 28 29

6 0.059 0.048 0.000 0.004 0.012 0.072 0.011 0.096 0.065 0.067 0.028 0.096
7 0.059 0.051 0.004 0.000 0.010 0.068 0.012 0.093 0.062 0.064 0.026 0.092
13 0.057 0.047 0.012 0.010 0.000 0.072 0.013 0.083 0.053 0.065 0.018 0.094
16 0.078 0.116 0.072 0.068 0.072 0.000 0.077 0.114 0.096 0.030 0.081 0.031
19 0.048 0.046 0.011 0.012 0.013 0.077 0.000 0.092 0.059 0.074 0.021 0.099
22 0.042 0.046 0.096 0.093 0.083 0.114 0.092 0.000 0.038 0.099 0.072 0.114
23 0.030 0.047 0.065 0.062 0.053 0.096 0.059 0.038 0.000 0.087 0.038 0.102
26 0.095 0.097 0.067 0.064 0.065 0.030 0.074 0.099 0.087 0.000 0.075 0.049
28 0.042 0.044 0.028 0.026 0.018 0.081 0.021 0.072 0.038 0.075 0.000 0.098
29 0.068 0.138 0.096 0.092 0.094 0.031 0.099 0.114 0.102 0.049 0.098 0.000

furthest locations from the pivot, the weight of location no. 31 was found to be very high (0.812). However, the weighting factor calculated at location no. 31 would be expected to be lower.

4.6. Performance evaluation

In order to evaluate the performance of the identified spatial model, we have plotted the measured porosity values versus the estimated porosity values

(Fig. 8). The large determination coefficient indicates that the model has good prediction capability. We have also compared the performance of the presented spatial model against the performance of the PCSV model (Fig. 9).

In addition to the determination coefficient, the performances of the models have been compared with each other using the following performance indices, namely the root mean square error (RMSE) and the variance accounted for (VAF):

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (p_i - \bar{p}_i)^2}, \qquad (14) \]
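
The two indices can be sketched in a few lines of Python. RMSE follows Eq. (14) and VAF follows the definition in Eq. (15). The function names and the use of the population variance are assumptions; the paper does not say which variance estimator is used, but the ratio in Eq. (15) is the same for population and sample variance as long as both terms use the same one.

```python
import math
from statistics import pvariance


def rmse(measured, estimated):
    """Root mean square error between measured and estimated values."""
    n = len(measured)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(measured, estimated)) / n)


def vaf(measured, estimated):
    """Variance accounted for, in percent: (1 - var(p - p_est) / var(p)) * 100."""
    residuals = [p - q for p, q in zip(measured, estimated)]
    return (1.0 - pvariance(residuals) / pvariance(measured)) * 100.0


# Made-up values for illustration only (not data from the paper).
measured = [1.0, 2.0, 3.0]
estimated = [1.0, 2.0, 4.0]
print(rmse(measured, estimated), vaf(measured, estimated))
```

A perfect model gives an RMSE of 0 and a VAF of 100%, so a lower RMSE and a higher VAF both indicate better agreement.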

Fig. 6. Experimental PCSV and PCSM for location no: 25.

Fig. 7. SRDF graphs for location no: 25.

Table 5
Porosity estimation for location no: 25 by proposed model

Location Porosity PCSM Scaled distance Distance ratio SRDF weighting Contribution Estimation

25 0.058 0.000 0.000 0.000 1.000 0.058 0.058


5 0.057 0.001 0.022 0.079 0.998 0.057
27 0.031 0.014 0.038 0.140 0.944 0.029
18 0.031 0.028 0.039 0.144 0.889 0.028
4 0.037 0.038 0.081 0.298 0.847 0.031
17 0.113 0.066 0.117 0.431 0.736 0.083
20 0.146 0.110 0.125 0.460 0.559 0.082
3 0.066 0.114 0.145 0.535 0.543 0.036
31 0.030 0.128 0.265 0.977 0.487 0.015
14 0.300 0.249 0.272 1.000 0.000 0.000
Sum – – – – 7.004 0.418 0.060

Table 6
Porosity estimation for location no: 25 by PCSV

Location Porosity PCSV Scaled distance Distance ratio SRDF weighting Contribution Estimation

25 0.058 0.000 0.000 0.000 1.000 0.058 0.058


5 0.057 0.000 0.022 0.079 1.000 0.057
27 0.031 0.000 0.038 0.140 0.990 0.031
18 0.031 0.001 0.039 0.144 0.980 0.030
4 0.037 0.001 0.081 0.298 0.974 0.036
17 0.113 0.002 0.117 0.431 0.932 0.105
20 0.146 0.006 0.125 0.460 0.824 0.120
3 0.066 0.006 0.145 0.535 0.823 0.054
31 0.030 0.007 0.265 0.977 0.812 0.024
14 0.300 0.036 0.272 1.000 0.000 0.000
Sum – – – – 8.334 0.516 0.062
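
The calculations behind Tables 5 and 6 can be retraced with a short sketch, under the assumption that the pivot itself enters the weighted average with weight 1, as the first row of each table suggests. The function `srdf_estimate` and its `power` argument are hypothetical names: `power=1` accumulates half absolute differences (PCSM) and `power=2` accumulates half squared differences (PCSV).

```python
def srdf_estimate(porosity, power=1):
    """Estimate the pivot porosity from SRDF weights.

    porosity: pivot value first, then neighbours in order of increasing distance.
    Assumes the values are not all identical, so the largest cumulative sum is nonzero.
    """
    pivot = porosity[0]
    cumulative, total = [], 0.0
    for z in porosity:
        # Successive cumulative sum of half differences (steps (b) and (c)).
        total += 0.5 * abs(pivot - z) ** power
        cumulative.append(total)
    g_max = cumulative[-1]  # value at the greatest distance
    # SRDF: one minus the scaled cumulative function.
    weights = [1.0 - g / g_max for g in cumulative]
    contributions = [w * z for w, z in zip(weights, porosity)]
    # Weighted average of Eq. (1).
    return sum(contributions) / sum(weights), weights


# Porosity column of Tables 5 and 6: pivot 25, then 5, 27, 18, 4, 17, 20, 3, 31, 14.
porosity = [0.058, 0.057, 0.031, 0.031, 0.037, 0.113, 0.146, 0.066, 0.030, 0.300]
est_pcsm, w_pcsm = srdf_estimate(porosity, power=1)
est_pcsv, w_pcsv = srdf_estimate(porosity, power=2)
print(round(est_pcsm, 3), round(est_pcsv, 3))
```

With these inputs the two estimates agree with the tabulated 0.060 and 0.062 to within rounding, and the PCSV weight for location 31 comes out near the 0.812 discussed in Section 4.5.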

Fig. 8. Scatter plot of measured and estimated values for spatial model.

Fig. 9. Scatter plot of measured and estimated values for PCSV model.

\[ \mathrm{VAF} = \left[ 1 - \frac{\mathrm{var}(p - \bar{p})}{\mathrm{var}(p)} \right] \times 100\%, \qquad (15) \]

where $p_i$ is the measured and $\bar{p}_i$ is the estimated porosity value, respectively. Var denotes the variance and N is the number of experiments.

The results are presented in Table 7. The performance evaluation shows that the spatial model outperforms the PCSV model. This also indicates a good generalization capability for the proposed spatial estimation model. On the other hand, it should be borne in mind that there are similar drawbacks for both PCSM and PCSV. For example, two separate scatter groups occur for both models.

Table 7
Performance indices

Estimation model         VAF (%)   RMSE
Proposed spatial model   81.22     0.034
PCSV                     68.37     0.043

5. Conclusions

We have presented a spatial model which employs fuzzy clustering and estimates the aquifer porosity based on PCSM and SRDF. This method provides detailed information about porosity at and
near the measurement locations as well as among the locations. The fuzzy clustering based point madogram method is easy to formulate and can be applied to heterogeneous systems without requiring substantial computing. Use of this approach with the SRDF in the same algorithm has resulted in successful estimations.

It has been observed that the proposed spatial model outperforms the PCSV model. In addition to the numerical prediction power, another attractive property of the spatial model is its transparency. In the future, the proposed method may be applied for assessing the electrical conductivity of groundwater.

Acknowledgements

The authors would like to thank the anonymous referees for their valuable comments and contributions.

References

Akin, S., Schembre, J.M., Bhat, S.K., Kovscek, A.R., 2000. Spontaneous imbibition characteristics of diatomite. Journal of Petroleum Science and Engineering 25, 149–165.
Bezdek, J.C., Ehrlich, R., Full, W., 1984. FCM: the fuzzy c-means clustering algorithm. Computers & Geosciences 10 (2–3), 191–203.
Clausnitzer, V., Hopmans, J.W., 1999. Determination of phase-volume fractions from tomographic measurements in two-phase systems. Advances in Water Resources 22 (6), 577–584.
Deutsch, C.V., Journel, A.G., 1998. GSLIB: Geostatistical Software Library and User’s Guide, second ed. Oxford University Press, New York, 340pp.
Hatipoglu, Z., 2004. Hydrogeochemistry of Mersin-Tarsus Coastal Aquifer. Ph.D. Dissertation, Hacettepe University, Ankara, 142pp. (in Turkish).
Jain, A., Dubes, R., 1988. Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs, NJ, 320pp.
Jang, J.-S.R., Sun, C.T., Mizutani, E., 1997. Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Prentice-Hall International (UK) Limited, London, 614pp.
Kaymak, U., Babuska, R., 1995. Compatible cluster merging for fuzzy modeling. In: Proceedings FUZZ-IEEE/IFES’95, Yokohama, Japan, 897–904.
Pal, N.R., Bezdek, J.C., 1995. On cluster validity for the fuzzy c-means model. IEEE Transactions on Fuzzy Systems 3 (3), 370–379.
Ross, T.J., 2004. Fuzzy Logic with Engineering Applications, second ed. Wiley, Canada Ltd., 650pp.
Şen, Z., 1989. Cumulative semivariogram model of regionalized variables. Mathematical Geology 21, 891–903.
Şen, Z., 1998. Point cumulative semivariogram for identification of heterogeneities in regional seismicity of Turkey. Mathematical Geology 30 (7), 767–787.
Şen, Z., Habib, Z., 1998. Point cumulative semivariogram of areal precipitation in mountainous regions. Journal of Hydrology 205, 81–91.
Şen, Z., Habib, Z., 2000. Spatial precipitation assessment with elevation by using point cumulative semivariogram technique. Water Resources Management 14, 311–325.
Şen, Z., Şahin, A.D., 2001. Spatial interpolation and estimation of solar irradiation by cumulative semivariograms. Solar Energy 71, 11–21.
Senol, M., Sahin, S., Duman, T.Y., 1998. The geological investigation of Mersin Region. General Directorate of Mineral Research and Exploration of Turkey (Unpublished Report), Ankara, 46pp. (in Turkish).
Setnes, M., Babuska, R., Verbruggen, H.B., 1998. Transparent fuzzy modelling. International Journal of Human-Computer Studies 49, 159–179.
Spitz, K., Moreno, J., 1996. A Practical Guide To Groundwater and Solute Transport Modeling. Wiley, New York, 461pp.
Tarawneh, Q.Y., Şahin, A.D., 2003. Regional wind energy assessment technique with applications. Energy Conversion and Management 44, 1563–1574.
Taud, H., Martinez-Angeles, Parrot, J.F., Hernandez-Escobedo, 2005. Porosity estimation method by X-ray computed tomography. Journal of Petroleum Science and Engineering 47, 209–217.
Tutmez, B., 2005. Reserve estimation using fuzzy set theory. Ph.D. Dissertation, Hacettepe University, Turkey (in Turkish).
Tutmez, B., Hatipoglu, Z., Kaymak, U., 2006. Modeling electrical conductivity of groundwater using an adaptive neuro-fuzzy inference system. Computers & Geosciences 32 (4), 421–433.
