Spatial Estimation Model of Porosity
Abstract
This paper addresses a spatial estimation model which uses a fuzzy clustering algorithm and assesses the aquifer porosity
based on the point cumulative semimadogram (PCSM) measure. In order to obtain the estimated porosity values, the model
employs the standard regional dependence function (SRDF), which provides weights for different regional locations depending
on their distances from the reference site. The proposed methodology has three stages: (1) structure identification; (2) spatial
dependence measure; and (3) interpolation. The model has been tested using a real data set taken from an
aquifer in Turkey. The performance evaluations indicate that the new methodology can be applied in geology-based
domains.
© 2006 Elsevier Ltd. All rights reserved.
0098-3004/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cageo.2006.07.008
466 B. Tutmez, Z. Hatipoglu / Computers & Geosciences 33 (2007) 465–475
locations, it has practical difficulties in heterogeneous systems as discussed by Şen (1989). In order to eliminate the drawbacks of the classical semivariogram, Şen (1989) proposed a new measure, the point cumulative semivariogram (PCSV), which has been applied in different areas (Şen, 1998; Şen and Habib, 1998; Şen and Habib, 2000; Tarawneh and Şahin, 2003). According to this technique, the PCSV for a reference (pivot) location is obtained by ranking the squared differences and then taking summations.

In the present paper, we study the modelling of aquifer porosity using a fuzzy clustering based point cumulative semimadogram (PCSM) measure, which is a new spatial estimation approach. The use of fuzzy set theory in engineering systems offers many advantages. Fuzzy systems provide the possibility of integrating (logical) information processing with the attractive mathematical properties of general function approximators (Tutmez et al., 2006). The most attractive characteristic of fuzzy algorithms is their ability to describe complex multivariable problems in a transparent and effective way (Setnes et al., 1998). Owing to these advantages, we used fuzzy clustering in the first step for grouping the data and identifying the model structure. Thus, information taken from the clustering application has been directly used in the spatial measure and interpolation.

The proposed model consists of three main stages: (1) structure identification; (2) measuring the spatial variability; and (3) standard weighted interpolation. The model is developed by using real-world data regarding the water sources in the Mersin region of southern Turkey. The performance evaluations showed that the proposed model is very transparent and has a high prediction capacity.

The paper is organized as follows: Section 2 describes how porosity is modelled based on a weighting function. Section 3 gives comprehensive information on the proposed spatial model. Section 4 presents the case study for estimating the aquifer porosity values from the spatial coordinates. In addition, the performance evaluation of the model is given in the last part of this section. Section 5 concludes the paper.

The true value of the aquifer parameters is mostly unknown until it has been measured. For example, the determination of the porosity is often difficult and requires extreme caution and laboratory work. Therefore, it is estimated using the available data. The problem of spatial estimation considered in this study can simply be formulated as follows: given a region sampled at $n$ locations $x_i$ with values of porosity $p(x_i)$, $i = 1, \ldots, n$, what are the values $p(x_k)$ at unsampled locations $x_k$, $k = 1, \ldots, N$?

In any optimization method, the main concept is that the estimation at any location is considered as a weighted average of the measured values at the site. The weighting factor is related to the distance between the pivot (reference) and each location in addition to the porosity values measured at the locations. There is an inverse relation between the weighting factor and distance. As a general hypothesis, if the distance between data values is small, the data values are close to each other. On the other hand, great distance implies dissimilar data values.

The main interest of the paper is evaluating the regional dependence of porosity. For regional prediction, consider $n$ irregularly scattered measurement locations with a point a, where the regional estimation of the porosity is desired. Supposing that the measurements at different locations are denoted by $P_i$ ($i = 1, 2, 3, \ldots, n$) and the reference site under investigation by $P_r$, the estimate is given by the following weighted average expression:

$$P_r = \frac{\sum_{i=1}^{n} W(d_{i,k})\,P_i}{\sum_{i=1}^{n} W(d_{i,k})}, \tag{1}$$

where $W(d_{i,k})$ is the weighting factor between the location at $i$ and the reference location at $k$ that corresponds to the distance $d_{i,k}$, and $P_i$ is the porosity value at $i$. In order to reflect the spatial dependence behavior of the phenomenon, regional covariance and semivariogram functions are among the early alternatives for the weighting functions that take into account the spatial correlation of the phenomenon considered (Şen and Şahin, 2001). The regional covariance function requires a set of assumptions such as the Gaussian distribution of the regionalized variable. Similarly, the semivariogram function does not always yield a clear pattern of the regional correlation structure. In order to eliminate these drawbacks, the cumulative semivariogram function has been proposed by Şen (1989).

The construction of the proposed estimation model proceeds in three main stages: (1) structure identification (fuzzy clustering); (2) measuring the
Cluster means (prototypes) and elements of the membership matrix are computed as follows:

$$c_i = \frac{\sum_{k=1}^{N} \mu_{ik}^{m}\, x_k}{\sum_{k=1}^{N} \mu_{ik}^{m}}, \tag{6}$$

and

$$\mu_{ik} = \frac{1}{\sum_{p=1}^{c} \left(d_{ik}/d_{pk}\right)^{2/(m-1)}}. \tag{7}$$

3.2. Point cumulative semimadogram

Earth science data have spatial correlation that cannot be handled with classical statistical techniques. The spatial variability in any phenomenon within an area can be measured by comparing the relative change between two locations. Two numerical values $z(x)$ and $z(x+h)$ at two points $x$ and $x+h$ separated by the vector $h$ are spatially correlated. As the distance between these values increases, one would expect the spatial correlation to decrease, and vice versa. This correlation, modelled by the squared difference $V(d)$, represents the relative change in the best possible way:

$$V(d) = \left(Z(x) - Z(x+h)\right)^2. \tag{8}$$

In general, variance and correlation techniques which are used to quantify the degree of regional

range parameter (Deutsch and Journel, 1998). The point semimadogram (PSM) has been proposed by Tutmez (2005). This function is similar to the point semivariogram (PSV); instead of squaring the difference between $Z_m$ and $Z_{m+h}$, the absolute difference is taken. If the experimental variogram includes outlier values, the PSM is more convenient than the PSV owing to the advantages of the absolute difference measure (Tutmez, 2005). The PCSM measure can be obtained from data by executing the following steps:

(a) calculate the distance between the concerned location and the remaining locations. If there are $N$ locations, the number of different distances is $N-1$, $h_i$ ($i = 1, \ldots, N-1$).
(b) for each pair (pivot and any other location), find half of the absolute difference between the data values, in this case the porosity values. In this way, each distance will have its half absolute difference value.
(c) plot distances versus the corresponding successive cumulative sums of the half absolute differences. By using this procedure, a non-decreasing function, which is the sample PCSM at the pivot location, is obtained. Its mathematical expression is given as

$$\frac{1}{2}\sum^{N-1} \ldots$$
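Steps (a) to (c) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `point_cumulative_semimadogram` and the toy distances and porosity values are hypothetical, and the distances are simply supplied as inputs.

```python
import numpy as np

def point_cumulative_semimadogram(distances, pivot_value, other_values):
    """Sample PCSM at a pivot: cumulative half absolute differences by distance."""
    d = np.asarray(distances, dtype=float)
    v = np.asarray(other_values, dtype=float)
    # Step (a): the N-1 distances from the pivot, ranked in ascending order.
    order = np.argsort(d, kind="stable")
    # Step (b): half of the absolute difference between pivot and each location.
    half_abs_diff = 0.5 * np.abs(pivot_value - v[order])
    # Step (c): successive cumulative sums give a non-decreasing function.
    return d[order], np.cumsum(half_abs_diff)

# Hypothetical toy data (not taken from the paper).
distances = [0.30, 0.10, 0.20]
porosity = [0.16, 0.14, 0.08]
pivot_porosity = 0.10
sorted_d, pcsm = point_cumulative_semimadogram(distances, pivot_porosity, porosity)
```

Plotting `pcsm` against `sorted_d` gives the sample curve at the pivot; replacing the half absolute difference with half the squared difference would give the PCSV counterpart.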
divide all the PCSM values by ($\gamma_m$). The result appears as a scaled form of the sample PCSM values within limits of zero and one, and subtract the dimensionless PCSM values from one at each distance. The resulting non-increasing function is named the SRDF.

In the final stage, each porosity value is multiplied by the corresponding standard weight and the contributions for each location are calculated. For a pivot location, the estimated porosity value is obtained by dividing the total contributions by the total standard weights.

4. Case study

In this section, in order to evaluate the regional and functional dependence of aquifer porosity, fuzzy clustering based PCSM analysis is employed to examine the structural and spatial distribution of the process. In addition, the performance of the proposed methodology is compared with that of an established model (PCSV) from the literature.

4.1. Geological and hydrogeological setting

The basin fill took place from the Carboniferous to the Quaternary. Outcrops of the Karahamzausagi Formation occur with a small surface area to the northwest of Tarsus (Fig. 2). Ophiolitic rock which contains limestone blocks is situated northwest of Mersin and appears in valleys. The prevailing rock type is sedimentary rock belonging to the Tertiary age. The Karaisali, Guvenc, Kuzgun and Handere Formations deposited in this age include intercalations of sandstones, siltstones, conglomerates, limestones, claystones, marl and gypsum (Senol et al., 1998). Lying above all is caliche and secondary calcium carbonate deposition. These formations form the low-productive Hillside Aquifer (Hatipoglu, 2004). The study area is located on the south coast of this basin and is comprised of Neogene sediments of a fan-delta type alluvial deposition system. It is characterized by a flat or gentle surface and forms the most productive Coastal Aquifer in the area.

4.2. Study area and sampling

In this study, the coastal area between the cities of Mersin and Tarsus is investigated. This area is located in the southern part of Turkey and contains agricultural, industrial and settlement areas. Groundwater is used as a main source of water in this region. Because groundwater is widely used for water supplies, efficient groundwater management is important for this area, and requires understanding of groundwater storage capacity.

In order to estimate the porosity distribution of the area, 32 well logs were used. These wells were drilled by the General Directorate of State Hydraulic Works and the General Directorate of Rural Services between 1957 and 1997. The locations of the wells were measured with a Garmin Etrex GPS (Fig. 3). In our study, well logs were split into 1 m depth intervals and, according to the description of the unit on the well logs, effective porosity ($p_e$) values were assigned to these units from Spitz and Moreno (1996). Some $p_e$ values used in the data set are given in Table 1. To estimate the effective porosity distribution of the aquifer between 0 and 20 m elevation, the average porosity value was calculated for this depth.

4.3. Data clustering

In the first stage of the application, the data set was partitioned by using the FCM clustering algorithm. In pattern recognition, it is often suggested that the data should be appropriately normalized before clustering (Jain and Dubes, 1988). The values were scaled by using a linear transformation between 0.03 and 0.3. The data set Z to be clustered is formed by combining X and p

Determination of the optimal number of clusters is an important step in clustering. If the number of clusters is unknown, various methods (Pal and Bezdek, 1995; Kaymak and Babuska, 1995) might be employed to find a suitable number of clusters. In this study, a novel cluster validity approach, proposed by Tutmez (2005) especially for evaluating geological data, is used. It is based on reproducing the variability of the sample data in the value of the cluster centers with the minimum number of clusters as follows:

$$\text{Minimize } n_c \text{ under } \mathrm{Std}[p(x)] \approx \mathrm{Std}[p(c)], \tag{12}$$

where $n_c$ is the optimal number of clusters and Std is the standard deviation of porosity. The number of clusters was determined experimentally by using FCM clustering under (12). The appropriate number of clusters was found to be three (Fig. 4). Fig. 5 shows the outcome of this operation.

4.4. Measuring the spatial dependence

In this stage, the spatial variability is modelled by the PSM function. Functional analyses were

Table 1
pe ranges in Spitz and Moreno (1996) and pe values in data set
Unit    pe value (Spitz and Moreno, 1996)    pe value used
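The FCM partitioning used in the clustering stage alternates the membership update of Eq. (7) and the centre update of Eq. (6) until the centres stop moving. The sketch below is a minimal illustration only: the Euclidean distance, the fuzzifier m = 2, the explicit initial centres and the toy data are assumptions made here, and the names `fcm_update` and `fuzzy_c_means` are hypothetical.

```python
import numpy as np

def fcm_update(X, centers, m=2.0, eps=1e-12):
    """One FCM iteration: memberships by Eq. (7), then centres by Eq. (6)."""
    # d[i, k]: distance from cluster centre i to data point k (Euclidean, assumed).
    d = np.linalg.norm(centers[:, None, :] - X[None, :, :], axis=2)
    d = np.maximum(d, eps)  # guard against division by zero at a centre
    # ratio[i, p, k] = (d_ik / d_pk) ** (2 / (m - 1)); summing over p gives Eq. (7).
    ratio = (d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))
    mu = 1.0 / ratio.sum(axis=1)
    # Eq. (6): centres are membership-weighted means of the data.
    w = mu ** m
    new_centers = (w @ X) / w.sum(axis=1, keepdims=True)
    return mu, new_centers

def fuzzy_c_means(X, init_centers, m=2.0, max_iter=100, tol=1e-8):
    """Iterate the two updates until the centres move less than tol."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(max_iter):
        mu, new_centers = fcm_update(X, centers, m)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return mu, centers

# Hypothetical toy data: two well-separated groups in the plane.
X = np.array([[0.0, 0.0], [0.0, 0.1], [0.1, 0.0],
              [1.0, 1.0], [1.0, 0.9], [0.9, 1.0]])
mu, centers = fuzzy_c_means(X, init_centers=X[[0, 3]])
labels = mu.argmax(axis=0)  # hard assignment from the fuzzy memberships
```

In the case study the rows of the clustered data would hold the scaled coordinates together with the porosity; each column of `mu` sums to one, and `argmax` is used here only to read off a crisp grouping.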
Table 2
Dimensionless distances for cluster 1
No. x y 1 2 8 9 10 11 12 15 21 24 30 32
1 0.114 0.132 0.000 0.082 0.066 0.031 0.108 0.103 0.139 0.071 0.083 0.055 0.132 0.056
2 0.104 0.061 0.082 0.000 0.047 0.065 0.067 0.116 0.146 0.032 0.091 0.050 0.109 0.103
8 0.075 0.079 0.066 0.047 0.000 0.056 0.043 0.090 0.117 0.065 0.066 0.069 0.079 0.100
9 0.124 0.103 0.031 0.065 0.056 0.000 0.092 0.083 0.121 0.052 0.061 0.054 0.115 0.081
10 0.055 0.042 0.108 0.067 0.043 0.092 0.000 0.092 0.107 0.092 0.074 0.106 0.048 0.142
11 0.104 0.065 0.103 0.116 0.090 0.083 0.092 0.000 0.039 0.116 0.025 0.133 0.074 0.158
12 0.085 0.045 0.139 0.146 0.117 0.121 0.107 0.039 0.000 0.151 0.061 0.169 0.071 0.194
15 0.133 0.073 0.071 0.032 0.065 0.052 0.092 0.116 0.151 0.000 0.092 0.037 0.128 0.093
21 0.102 0.069 0.083 0.091 0.066 0.061 0.074 0.025 0.061 0.092 0.000 0.109 0.068 0.137
24 0.123 0.106 0.055 0.050 0.069 0.054 0.106 0.133 0.169 0.037 0.109 0.000 0.145 0.058
30 0.046 0.030 0.132 0.109 0.079 0.115 0.048 0.074 0.071 0.128 0.068 0.145 0.000 0.176
32 0.111 0.163 0.056 0.103 0.100 0.081 0.142 0.158 0.194 0.093 0.137 0.058 0.176 0.000
Table 3
Dimensionless distances for cluster 2
No. x y 3 4 5 14 17 18 20 25 27 31
3 0.192 0.193 0.000 0.085 0.136 0.245 0.151 0.160 0.108 0.145 0.160 0.150
4 0.169 0.116 0.085 0.000 0.065 0.281 0.153 0.099 0.128 0.081 0.099 0.220
5 0.189 0.057 0.136 0.065 0.000 0.274 0.131 0.056 0.126 0.022 0.055 0.265
14 0.258 0.162 0.245 0.281 0.274 0.000 0.202 0.294 0.158 0.272 0.295 0.305
17 0.300 0.098 0.151 0.153 0.131 0.202 0.000 0.117 0.082 0.117 0.118 0.219
18 0.237 0.044 0.160 0.099 0.056 0.294 0.117 0.000 0.146 0.039 0.002 0.261
20 0.234 0.134 0.108 0.128 0.126 0.158 0.082 0.146 0.000 0.125 0.147 0.210
25 0.209 0.049 0.145 0.081 0.022 0.272 0.117 0.039 0.125 0.000 0.038 0.265
27 0.235 0.043 0.160 0.099 0.055 0.295 0.118 0.002 0.147 0.038 0.000 0.263
31 0.290 0.300 0.150 0.220 0.265 0.305 0.219 0.261 0.210 0.265 0.263 0.000
Table 4
Dimensionless distances for cluster 3
No. x y 6 7 13 16 19 22 23 26 28 29
6 0.059 0.048 0.000 0.004 0.012 0.072 0.011 0.096 0.065 0.067 0.028 0.096
7 0.059 0.051 0.004 0.000 0.010 0.068 0.012 0.093 0.062 0.064 0.026 0.092
13 0.057 0.047 0.012 0.010 0.000 0.072 0.013 0.083 0.053 0.065 0.018 0.094
16 0.078 0.116 0.072 0.068 0.072 0.000 0.077 0.114 0.096 0.030 0.081 0.031
19 0.048 0.046 0.011 0.012 0.013 0.077 0.000 0.092 0.059 0.074 0.021 0.099
22 0.042 0.046 0.096 0.093 0.083 0.114 0.092 0.000 0.038 0.099 0.072 0.114
23 0.030 0.047 0.065 0.062 0.053 0.096 0.059 0.038 0.000 0.087 0.038 0.102
26 0.095 0.097 0.067 0.064 0.065 0.030 0.074 0.099 0.087 0.000 0.075 0.049
28 0.042 0.044 0.028 0.026 0.018 0.081 0.021 0.072 0.038 0.075 0.000 0.098
29 0.068 0.138 0.096 0.092 0.094 0.031 0.099 0.114 0.102 0.049 0.098 0.000
Fig. 6. Experimental PCSV and PCSM for location no: 25.
Fig. 7. SRDF graphs for location no: 25.
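An SRDF curve such as the one in Fig. 7 is obtained from a sample PCSM by scaling it to the interval from zero to one and subtracting the result from one. The sketch below assumes that the scaling divisor is the largest PCSM value, which is what a scaling to [0, 1] requires; the function name `standard_regional_dependence` and the input values are hypothetical.

```python
import numpy as np

def standard_regional_dependence(pcsm):
    """Scale a sample PCSM to [0, 1] and subtract it from one (SRDF weights)."""
    g = np.asarray(pcsm, dtype=float)
    g_max = g.max()  # assumed divisor: the maximum PCSM value
    if g_max == 0.0:
        # All values equal the pivot value: every location gets full weight.
        return np.ones_like(g)
    return 1.0 - g / g_max

# Hypothetical PCSM values ordered by increasing distance from the pivot.
pcsm_values = [0.02, 0.03, 0.06]
srdf_weights = standard_regional_dependence(pcsm_values)
```

With this scaling the farthest location always receives zero weight, and the weights decrease as the cumulative dissimilarity grows.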
Table 5
Porosity estimation for location no: 25 by proposed model
Location Porosity PCSM Scaled distance Distance ratio SRDF weighting Contribution Estimation
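The Contribution and Estimation columns correspond to the weighted average of Eq. (1): each porosity value is multiplied by its SRDF weight, and the estimate is the total contribution divided by the total weight. A minimal sketch follows; the function name `estimate_porosity` and the numbers are hypothetical and are not the entries of Table 5.

```python
import numpy as np

def estimate_porosity(weights, porosities):
    """Weighted-average estimate at the pivot, following Eq. (1)."""
    w = np.asarray(weights, dtype=float)
    p = np.asarray(porosities, dtype=float)
    contributions = w * p  # one contribution per neighbouring location
    total_weight = w.sum()
    if total_weight == 0.0:
        raise ValueError("all weights are zero; the estimate is undefined")
    return contributions, contributions.sum() / total_weight

# Hypothetical SRDF weights and measured porosities of three neighbours.
weights = [0.8, 0.5, 0.2]
porosities = [0.10, 0.20, 0.30]
contributions, estimate = estimate_porosity(weights, porosities)
```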
Table 6
Porosity estimation for location no: 25 by PCSV
Location Porosity PCSV Scaled distance Distance ratio SRDF weighting Contribution Estimation
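The two performance indices compared in Table 7 can be computed as in the sketch below. The VAF line follows Eq. (15); the RMSE is taken to be the usual root mean squared error, which is an assumption here because its defining equation does not appear in this text. The function names and the sample values are hypothetical.

```python
import numpy as np

def vaf(measured, estimated):
    """Variance accounted for, in percent, following Eq. (15)."""
    p = np.asarray(measured, dtype=float)
    p_hat = np.asarray(estimated, dtype=float)
    return (1.0 - np.var(p - p_hat) / np.var(p)) * 100.0

def rmse(measured, estimated):
    """Root mean squared error (standard definition, assumed here)."""
    p = np.asarray(measured, dtype=float)
    p_hat = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((p - p_hat) ** 2)))

# Hypothetical measured and estimated porosities (not the paper's data).
measured = [0.10, 0.20, 0.30, 0.40]
estimated = [0.12, 0.18, 0.33, 0.37]
vaf_percent = vaf(measured, estimated)
rmse_value = rmse(measured, estimated)
```

A VAF close to 100% together with a small RMSE indicates that the estimates track the measurements closely.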
$$\mathrm{VAF} = \left[1 - \frac{\operatorname{var}(p - \bar{p})}{\operatorname{var}(p)}\right] \times 100\%, \tag{15}$$

where $p_i$ is the measured and $\bar{p}_i$ is the estimated porosity value. Var denotes the variance and $N$ is the number of experiments. The results are presented in Table 7. Performance evaluation shows that the spatial model outperforms the PCSV model. This also indicates a good generalization capability for the proposed spatial estimation model. On the other hand, it should be borne in mind that there are similar drawbacks for both PCSM and PCSV. For example, two separate scatter groups occur for both models.

Table 7
Performance indices
Estimation model          VAF (%)   RMSE
Proposed spatial model    81.22     0.034
PCSV                      68.37     0.043

5. Conclusions

We have presented a spatial model which employs fuzzy clustering and estimates the aquifer porosity based on PCSM and SRDF. This method provides detailed information about porosity at and
near the measurement locations as well as among the locations. The fuzzy clustering based point madogram method is easy to formulate and can be applied to heterogeneous systems without requiring substantial computing. Use of this approach with the SRDF in the same algorithm has resulted in successful estimations.

It has been observed that the proposed spatial model outperforms the PCSV model. In addition to the numerical prediction power, another attractive property of the spatial model is its transparency. In the future, the proposed method may be applied for assessing the electrical conductivity of groundwater.

Acknowledgements

The authors would like to thank the anonymous referees for their valuable comments and contributions.

References

Akin, S., Schembre, J.M., Bhat, S.K., Kovscek, A.R., 2000. Spontaneous imbibition characteristics of diatomite. Journal of Petroleum Science and Engineering 25, 149–165.
Bezdek, J.C., Ehrlich, R., Full, W., 1984. FCM: the fuzzy c-means clustering algorithm. Computers & Geosciences 10 (2–3), 191–203.
Clausnitzer, V., Hopmans, J.W., 1999. Determination of phase-volume fractions from tomographic measurements in two-phase systems. Advances in Water Resources 22 (6), 577–584.
Deutsch, C.V., Journel, A.G., 1998. GSLIB: Geostatistical Software Library and User’s Guide, second ed. Oxford University Press, New York, 340 pp.
Hatipoglu, Z., 2004. Hydrogeochemistry of Mersin-Tarsus Coastal Aquifer. Ph.D. Dissertation, Hacettepe University, Ankara, 142 pp. (in Turkish).
Jain, A., Dubes, R., 1988. Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs, NJ, 320 pp.
Jang, J.-S.R., Sun, C.T., Mizutani, E., 1997. Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Prentice-Hall International (UK) Limited, London, 614 pp.
Kaymak, U., Babuska, R., 1995. Compatible cluster merging for fuzzy modeling. In: Proceedings FUZZ-IEEE/IFES’95, Yokohama, Japan, 897–904.
Pal, N.R., Bezdek, J.C., 1995. On cluster validity for the fuzzy c-means model. IEEE Transactions on Fuzzy Systems 3 (3), 370–379.
Ross, T.J., 2004. Fuzzy Logic with Engineering Applications, second ed. Wiley, Canada Ltd., 650 pp.
Şen, Z., 1989. Cumulative semivariogram model of regionalized variables. Mathematical Geology 21, 891–903.
Şen, Z., 1998. Point cumulative semivariogram for identification of heterogeneities in regional seismicity of Turkey. Mathematical Geology 30 (7), 767–787.
Şen, Z., Habib, Z., 1998. Point cumulative semivariogram of areal precipitation in mountainous regions. Journal of Hydrology 205, 81–91.
Şen, Z., Habib, Z., 2000. Spatial precipitation assessment with elevation by using point cumulative semivariogram technique. Water Resources Management 14, 311–325.
Şen, Z., Şahin, A.D., 2001. Spatial interpolation and estimation of solar irradiation by cumulative semivariograms. Solar Energy 71, 11–21.
Senol, M., Sahin, S., Duman, T.Y., 1998. The geological investigation of Mersin Region. General Directorate of Mineral Research and Exploration of Turkey (Unpublished Report), Ankara, 46 pp. (in Turkish).
Setnes, M., Babuska, R., Verbruggen, H.B., 1998. Transparent fuzzy modelling. International Journal of Human-Computer Studies 49, 159–179.
Spitz, K., Moreno, J., 1996. A Practical Guide To Groundwater and Solute Transport Modeling. Wiley, New York, 461 pp.
Tarawneh, Q.Y., Şahin, A.D., 2003. Regional wind energy assessment technique with applications. Energy Conversion and Management 44, 1563–1574.
Taud, H., Martinez-Angeles, Parrot, J.F., Hernandez-Escobedo, 2005. Porosity estimation method by X-ray computed tomography. Journal of Petroleum Science and Engineering 47, 209–217.
Tutmez, B., 2005. Reserve estimation using fuzzy set theory. Ph.D. Dissertation, Hacettepe University, Turkey (in Turkish).
Tutmez, B., Hatipoglu, Z., Kaymak, U., 2006. Modeling electrical conductivity of groundwater using an adaptive neuro-fuzzy inference system. Computers & Geosciences 32 (4), 421–433.