
Petroleum 9 (2023) 108–123

Contents lists available at ScienceDirect

Petroleum
journal homepage: www.keaipublishing.com/en/journals/petlm

Prediction of permeability from well logs using a new hybrid machine learning algorithm

Morteza Matinkia (a), Romina Hashami (b), Mohammad Mehrad (c,*), Mohammad Reza Hajsaeedi (c), Arian Velayati (d)

(a) Department of Petroleum Engineering, Omidiyeh Branch, Islamic Azad University, Omidiyeh, Iran
(b) Department of Applied Mathematics, Faculty of Mathematics and Computer Sciences, Amirkabir University of Technology, Tehran, Iran
(c) Faculty of Mining, Petroleum and Geophysics Engineering, Shahrood University of Technology, Shahrood, Iran
(d) Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Canada

Article info

Article history:
Received 1 October 2021
Received in revised form 15 December 2021
Accepted 14 March 2022

Keywords:
Permeability
Artificial neural network
Multilayer perceptron
Social ski-driver algorithm

Abstract

Permeability is a measure of fluid transmissibility in the rock and is a crucial concept in the evaluation of formations and the production of hydrocarbon from reservoirs. Various techniques, such as intelligent methods, have been introduced to estimate permeability from other petrophysical features. The efficiency and convergence issues associated with artificial neural networks have motivated researchers to use hybrid techniques for the optimization of the networks, in which the artificial neural network is combined with heuristic algorithms.

This research combines the social ski-driver (SSD) algorithm with the multilayer perceptron (MLP) neural network and presents a new hybrid algorithm to predict rock permeability. The performance of this novel technique is compared with two previously used hybrid methods (genetic algorithm-MLP and particle swarm optimization-MLP) to examine the effectiveness of these hybrid methods in predicting the permeability of the rock.

The results indicate that the hybrid models can predict rock permeability with excellent accuracy. The MLP-SSD method yields the highest coefficient of determination (0.9928) among all methods in predicting the permeability values of the test data set, followed by MLP-PSO and MLP-GA, respectively. However, MLP-GA converged faster than the other two methods and is computationally less expensive.

© 2022 Southwest Petroleum University. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Accurate estimation of petrophysical properties such as permeability, porosity, water saturation, density, and lithology is of utmost importance in the evaluation of hydrocarbon reservoirs. Such characterization is primarily based on measurements obtained from well logs, core sample testing, well testing, field production data, and seismic information [1]. Various empirical correlations have been presented for the estimation of petrophysical data from well log measurements. However, these correlations are often limited to a single well and may not apply to other wells even if the geological features of the wells are relatively similar [2]. Hence, the problem with correlations developed using regression techniques appears to be the non-uniqueness of the equations, which arises from introducing insufficient parameters into the models.

Permeability is a petrophysical property of the rock that represents the transmissibility of fluids in the porous medium. It is an essential parameter for reservoir evaluation, management, and selection of the optimum enhanced oil recovery method. Permeability is generally measured either by coring from the formations or by downhole pressure testing methods. Coring test results are "local" and belong to a minute section of the reservoir. Downhole

* Corresponding author. E-mail address: mmehrad1986@gmail.com (M. Mehrad). Peer review under responsibility of Southwest Petroleum University.

https://doi.org/10.1016/j.petlm.2022.03.003

Abbreviations

AARD: Average absolute relative deviation
AC: Acoustic log
AI: Artificial intelligence
ANFIS: Artificial neuro-fuzzy inference system
ANN: Artificial neural network
Cal: Caliper log
CNL: Compensated neutron porosity log
CT: Conductivity log
DEN: Formation density log
DRHO: Density correction log
FFBP: Feed-forward back-propagation
FL: Fuzzy logic
FZI or FZI*: Flow zone indicator
GA: Genetic algorithm
GMDH: Group method of data handling
GR: Gamma ray
HGAPSO: Hybrid GA and PSO
IQR: Interquartile range
K: Permeability
LLD: Deep laterolog
LLS: Shallow laterolog
LSSVM: Least square support vector machine
MIMO: Multiple-inputs multiple-outputs
MISO: Multiple-inputs single-output
MRA: Multivariate regression analysis
MSE: Mean square error
MSFL: Micro-spherical focused log
NMR: Nuclear magnetic resonance
NPHI: Neutron-porosity log
PEF: Photoelectric factor
PHID: Density-porosity log
PHIE: Effective porosity
PHIT: Total porosity
PSO: Particle swarm optimization
RHOB: Bulk density log
RL: Resistivity log
RLA1: Apparent resistivity focusing mode 1
RLA5: Apparent resistivity focusing mode 5
RMSE: Root mean square error
RT: True resistivity
RVR: Relevance vector regression
SFL: Spherically focused log
SP: Spontaneous potential
SVM: Support vector machine
TNPH: Thermal neutron porosity
XGBoost: Extreme gradient boosting regression model

pressure tests account for the anisotropic nature of the formation but yield only a single value. These techniques are costly and time-consuming [3].

Extrapolating values from insufficient permeability measurements can be erroneous due to the high uncertainty associated with the "local" nature of these techniques. Moreover, such measurements are not always available. Therefore, researchers have attempted to correlate permeability to the other petrophysical properties obtained from well logs [4]. Measurements from coring are always used for calibrating the predicted results.

Kozeny [5] and Carman [6] developed the first correlation of this type (the KC correlation), in which permeability is correlated to the formation porosity. Later on, modifications to the KC correlation were presented by other researchers [7,8]. The application of such equations is essentially limited to unconsolidated homogeneous sands due to the assumptions used in developing the analytical correlations.

Most of the presented equations have limited application due to the existing heterogeneity of the reservoirs. The findings clarify that permeability cannot simply be correlated to porosity alone, and many other factors involved must be incorporated into the correlation. These include additional petrophysical parameters, such as radioactive, electrical, and sonic properties, which can resolve the non-uniqueness problem of such correlations.

Advanced methods and intelligent techniques have been proposed to address the non-uniqueness of the presented correlations. For instance, the multivariate regression analysis (MRA) method was applied to address the models' generalization problem [9]. Moreover, the artificial neural network (ANN) method has been used and has recently gained popularity as an alternative to MRA, and the results obtained from this intelligent method have been reported to be positive [10,11]. Other approaches, such as fractal and multifractal theory, statistical approaches, and methods based on percolation theory, have been used extensively and successfully in the estimation of petrophysical properties [3]. Furthermore, intelligent methods can be employed to deal with the problem of the spatial distribution of the properties in the formation [12].

Hybrid artificial intelligence-optimization algorithms have been utilized for the prediction of reservoir petrophysical properties. Evolutionary optimization methods are often used to achieve the optimum network structure and improve the convergence and efficiency of the network. Many studies of this type are available in the literature, where the petrophysical properties of formations are estimated from well logs [13–16].

Mulashani et al. [17] applied an enhanced group method of data handling (GMDH), which uses a modified Levenberg-Marquardt (LM) algorithm, to predict permeability from well logs. Otchere et al. [18] used an extreme gradient boosting (XGBoost) regression model to estimate reservoir permeability and water saturation. Zhang et al. [19] combined a petrophysical rock typing method (FZI or FZI*) with a PSO-SVM algorithm to predict permeability and porosity from GR, DEN, and DT. Nkurlu et al. [20] estimated permeability from well logs by using a GMDH neural network. Okon et al. [21] developed a feed-forward back-propagation (FFBP) artificial neural network (ANN) model with a multiple-inputs multiple-outputs (MIMO) structure. Adeniran et al. [30] introduced a novel competitive ensemble machine learning model for permeability prediction in heterogeneous reservoirs. Ahmadi and Ebadi [22] used an optimized least square support vector machine (LSSVM) and fuzzy logic (FL) to estimate the porosity and permeability of petroleum reservoirs. Ali and Chawathe [23] studied the relationship between petrographic data collected during thin section analysis and permeability using an artificial neural network (ANN). Chen and Lin [24] used an ensemble-based committee machine with empirical formulas (CMEF) to estimate permeability. Gholami et al. [25] used relevance vector regression (RVR) along with a genetic algorithm (GA) as an optimizer to predict permeability from well logs. Jamialahmadi and Javadpour [26] utilized RBF to identify the relationship between porosity and permeability. Verma et al. [27] studied five well logs from the Alberta Deep Basin and used an ANN to predict porosity and permeability. Handhel [28] predicted horizontal and vertical permeability with a backpropagation algorithm from five different well logs. Elkatatny et al. [29] developed an ANN with three well logs to predict permeability in heterogeneous reservoirs. In Table 1, the


previously used machine learning algorithms and input parameters for the prediction of permeability are presented.

In this paper, a hybrid MLP-SSD algorithm was designed and employed to estimate the permeability of formations from well logs. The social ski-driver (SSD) algorithm is categorized as an evolutionary optimization algorithm (EA) that improves classification performance, resolving the challenge of imbalanced data associated with robust classification modeling. This algorithm was recently introduced by Tharwat and Gabel [31]; it is inspired by evolutionary algorithms such as PSO, the sine cosine algorithm (SCA), and gray wolf optimization (GWO), and its stochastic nature resembles alpine skiing paths. The hybridization of MLP and SSD improves the efficiency of the computations. As shown in Table 1, the proposed hybrid algorithm has not previously been used to predict rock permeability. The performance of this hybrid model was compared to the MLP-PSO, MLP-GA, and MLP models, and it was found that the MLP-SSD algorithm is superior in terms of model accuracy.

In section 2, the methodology and workflow of the research are described. In section 3, the type and statistics of the collected data are presented, and section 4 contains the results and discussion on the quality of the input data, algorithm training, and the performance of the algorithms on unseen data sets. The conclusions of the study and a description of the advantages and limitations of the introduced model are given in section 5.

2. Methodology

Sampling from wells to perform tests for determining rock permeability is very time-consuming and costly. Besides, sampling along the entire length of the well for permeability testing is not possible; therefore, the output of such studies will be discontinuous rock permeability data. Using this method for carbonate reservoirs, which have high heterogeneity, will not yield acceptable results. Thus, using other tools such as the formation micro-imager (FMI) or formation micro scanner (FMS) to estimate the permeability of carbonate reservoirs, in addition to reducing time and cost, will provide continuous data. In this method, the use of permeability test results on core samples is vital to calibrate the values of the FMI/FMS permeability index, in order to achieve highly accurate and reliable results.

Fig. 1 shows the workflow used in this study to extract the permeability index from the FMI log and calibrate it with the results of the permeability tests, as well as to model the permeability using conventional petrophysical logs. No core sample was taken from the target depth (Fahlian Formation), so the permeability (K_FMI) of the FMI output should be corrected for changes in water saturation (S_w). Equation (1) was used for this purpose, and the constant (n) in the correlation was determined for the Fahlian Formation using FMI logs and laboratory permeability tests on samples from two other neighboring wells.

K_corr = K_FMI × 10 × (1/S_w)^n   (1)

To model permeability in this study, data preprocessing is first performed to detect and remove outlier data and to select the features that most influence permeability. The Tukey method, a statistical method based on box plot diagrams of the data, was used to detect and remove outlier data [32]. Moreover, feature selection was performed by applying the neighborhood component analysis (NCA) algorithm to the database without outliers. In the next step, the selected features were considered as inputs to the MLP-GA, MLP-PSO, and MLP-SSD hybrid algorithms. After training the hybrid models with 80% of the selected data, the trained models were tested with the rest of the data.

2.1. Pre-processing

Data processing is essential for the correct extraction of the governing relationships between inputs and outputs. Applying data preprocessing results in a model with high accuracy and generalizability. Identifying and deleting outlier data and feature selection are the two processes that were performed on the raw data in the preprocessing stage of this study.

2.1.1. Outlier detection and elimination

Actual data is affected by numerous factors, among which noise is a key factor, leading to inaccurate extraction of the rules governing the data and thus to low generalizability of estimator or classifier models. Dealing with noisy data can be classified into two categories: identifying and deleting remote data and

Table 1
Application of machine learning algorithms in the literature on permeability prediction.

No. | Authors | Algorithms | No. of input variables | Inputs
1 | Mulashani et al. [17] | GMDH, LM | 4 | TNPH, SGR, VSH, PHIE, RHOZ
2 | Otchere et al. [18] | XGBoost | 12 | CALI, DRHO, DT, GR, NPHI, PEF, RACHEM, RACELM, RD, RHOB, ROP, RT
3 | Zhang et al. [19] | PSO-SVM | 3 | GR, DEN, DT
4 | Nkurlu et al. [20] | GMDH | 6 | RHOZ, RLA1, RLA5, SGR, TNPH, VSH
5 | Okon et al. [21] | ANN, FFBP | 4 | GR, true RT, RHOB, depth interval logs
6 | Adeniran et al. [30] | SVMR, ANN, ANFIS, MSVMR (KNN), MSVMR (NCC) | 11 | MSFL, NPHI, PHIT, RHOB, SWT, CALI, CT, DRHO, GR, RT, K
7 | Elkatatny et al. [29] | ANN, ANFIS, SVM, BPNN | 4 | Mobility index, NPHI, RHOB, core permeability
8 | Nasseri, A. and Mohammadzadeh, M.J. [13] | SOM network, PCAN network, PNN network | 9 | NPHI, RHOB, DT, CGR, ILD, ILM, URAN, THOR, POTA
9 | Ahmadi and Ebadi [22] | GA, FIS, LSSVM | 4 | DT, RHOB, PHIT, NPHI
10 | Gholami et al. [25] | RVR, GA | 7 | GR, DT, RHOB, NPHI, LLD, PEF, MSFL
11 | Verma et al. [27] | ANN | 5 | GR, RHOB, deep resistivity, NPHI
12 | Ahmadi et al. [14] | ANN, HGAPSO-ANN | 5 | K, CT, DT, NPHI, GR, RHOB
13 | Sfidari et al. [16] | SOM | 5 | DT, GR, NPHI, PEF, RHOB
14 | Handhel [28] | MLP, BP | 5 | GR, RHOB, DT, NPHI, deep induction log
15 | Chen and Lin [24] | CMEF | 5 | Core permeability, NPHI, RT, irreducible water saturation
16 | Ali and Chawathe [23] | ANN | 10 | Sec. intergranular porosity, total sec. por., total por., quartz, feldspar, dolomite, anhydrite, micro por., clay, rock frag.
17 | Jamialahmadi and Javadpour [26] | RBF | 2 | Porosity, permeability


Fig. 1. The workflow for permeability modeling of the target well. The blue box on the right shows the process performed to determine the constant in Equation (1) using the core permeability data and the FMI log in two wells from the study field in the Fahlian Formation. The red box on the left represents the process used to determine permeability and model it in the well under study. The data used in this study (i.e., conventional well logs and permeability values normalized using Equation (1)) are marked in blue.
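Under one plausible reading of the extracted Equation (1) — K_corr = K_FMI × 10 × (1/S_w)^n, with the exponent n calibrated against core data from neighboring wells — the saturation correction can be sketched as below. The exponent layout in the extracted text is ambiguous, so treat this form as an assumption rather than the authors' exact formula:

```python
def correct_fmi_permeability(k_fmi, s_w, n):
    """Water-saturation correction of the FMI permeability index.

    Assumes the reading K_corr = K_FMI * 10 * (1/S_w)**n of the
    extracted Equation (1); n is the field-calibrated constant.
    """
    return k_fmi * 10.0 * (1.0 / s_w) ** n
```

Lower water saturation inflates the correction, consistent with the stated need to normalize K_FMI for S_w changes.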

reducing the effect of noise. Since outlier data have unusual values compared to other observations (i.e., they deviate from the other observations), they can be identified by both univariate and multivariate methods. In the univariate method, the data distribution of one property is examined alone, while in the multivariate method, several properties in multidimensional space must be examined to identify remote data. In the noise reduction method, a filter is used to reduce the effect of noise in the data, the intensity of which is proportional to the signal-to-noise ratio in the data. In this method, unlike the first method, the number of data points is not reduced; rather, their values are changed according to the noise intensity.

Due to the uncertainty of the signal-to-noise ratio in the data received for this study, the Tukey method was used to identify and delete outlier data. In this method, by calculating the values of the first quartile (Q1) and third quartile (Q3) for each feature, the interquartile range (IQR) is determined using equation (2). The values of the lower inner fence (LIF) and upper inner fence (UIF) are then calculated separately for each property (Equations (3) and (4)). Data with values less than LIF or more than UIF are identified as outliers and must be removed from each attribute.

IQR = 1.5 × (Q3 − Q1)   (2)

LIF = Q1 − IQR   (3)

UIF = Q3 + IQR   (4)
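The Tukey fences of Equations (2)–(4) translate directly into a few lines of NumPy. The sketch below is an illustration, not the authors' code; note that the paper folds the conventional 1.5 factor into its definition of IQR, and the sketch follows that convention:

```python
import numpy as np

def tukey_filter(data):
    """Remove rows containing outliers, per Equations (2)-(4).

    data: 2-D array with one column per feature. A row is dropped
    if any of its features falls outside the inner fences.
    """
    q1 = np.percentile(data, 25, axis=0)   # first quartile, per feature
    q3 = np.percentile(data, 75, axis=0)   # third quartile, per feature
    iqr = 1.5 * (q3 - q1)                  # Equation (2)
    lif = q1 - iqr                         # Equation (3), lower inner fence
    uif = q3 + iqr                         # Equation (4), upper inner fence
    mask = np.all((data >= lif) & (data <= uif), axis=1)
    return data[mask]
```

Applying the filter to a log matrix with one wildly deviating depth sample removes that row while leaving the inliers untouched.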
2.1.2. Feature selection

Feature selection is one of the fundamental concepts in data mining that affects model performance significantly. Therefore, it is important to consider feature selection as the initial step in designing the model of interest. Feature selection is a process in which the features with the highest influence on the prediction variable or the desired output are recognized and selected, either manually or automatically. Hence, this process reduces a large-dimension feature vector into a smaller one by eliminating the irrelevant features.

The inclusion of irrelevant features in the model input data reduces the accuracy and results in the generation of a model based on these features, which may affect the application of such poorly developed models. The criterion for feature selection is the numerical value of the feature weight, which is evaluated by the feature selection algorithm. First, the feature selection algorithm calculates the weight of each feature; then, the weight of every feature is compared with a threshold value. Finally, features with weights higher than the threshold value are selected for the modeling, and the rest are discarded.

Various techniques have been proposed for feature selection, which can be categorized into two groups: "Filter" and "Wrapper". In Filter methods, features are designated without training, based on their direct correlation with the output parameter. The Wrapper approach, on the other hand, employs a series of training processes. Studies favor the Wrapper technique, reporting superior responses compared to the Filter methods [33]. This study employs neighborhood component analysis, which falls under the Wrapper methods, for feature selection.

Goldberger et al. [34] introduced neighborhood component analysis as a non-parametric method aiming to maximize the accuracy of clustering algorithms and regression. If S = {(x_i, y_i), i = 1, 2, …, N} is considered as the training input data, x_i ∈ R^p is a feature vector and y_i ∈ R is the corresponding output. The objective is to find values for the weight vector w of dimension 1 × p. These values indicate the significance of the features in estimating the response variable. In this method, x_j is a candidate reference point among all the samples for sample x_i. P_ij, the probability of selecting x_j as the reference point of x_i from all the samples, is very sensitive to the distance between the two samples. This distance is represented by the weighted distance d_w, defined via equation (5) [34]:

d_w(x_i, x_j) = Σ_{m=1}^{p} w_m^2 |x_im − x_jm|   (5)

where w_m is the weight allocated to the m-th feature. The relation between the probability P_ij and the weighted distance d_w is established by the kernel function k in equation (6):

P_ij = k(d_w(x_i, x_j)) / Σ_{j=1, j≠i}^{N} k(d_w(x_i, x_j))   (6)

Moreover, if i = j, then p_ii = 0. The kernel function in equation (7) is

defined as follows:

k(z) = exp(−z/σ)   (7)

where σ is the kernel width, which influences the probability of selecting x_j as the reference point. Assuming ŷ_i is the response value predicted by the random regression model for the point x_i and l: R^2 → R is the loss function (determining the difference between y_i and ŷ_i), the average value l(y_i, ŷ_i) is computed using equation (8):

l_i = Σ_{j=1, j≠i}^{N} P_ij l(y_i, y_j)   (8)

Leave-one-out is a strategy implemented to increase the accuracy of the regression model. The inclusion of this strategy on the training data S ensures the success of the neighborhood component regression. The summation of l_i over the training data, divided by the number of data points in the training collection, is the average accuracy of the leave-one-out regression. However, this objective function is prone to overfitting. A regularization term λ, which has equal value for all the weights in one problem, was introduced to prevent overfitting of the neighborhood component analysis model. Therefore, the objective function can be presented as shown in equation (9):

f(w) = (1/N) Σ_{i=1}^{N} l_i + λ Σ_{m=1}^{p} w_m^2   (9)

The objective defined in this correlation is known as the regularized NCA. Weight values for the features are chosen to minimize the objective function. Different loss functions, such as root mean square error and average absolute deviation, may be utilized in equation (9).
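Equations (5)–(9) can be combined into a single objective evaluation. The sketch below is a minimal NumPy illustration of the regularized NCA objective, assuming an absolute-deviation loss and a plus sign on the regularization term (the extracted equation leaves the sign ambiguous, but a penalty added to a minimized loss is the standard form); it is not the authors' implementation:

```python
import numpy as np

def nca_objective(w, X, y, sigma=1.0, lam=0.5):
    """Regularized NCA objective f(w), Equations (5)-(9).

    w: feature weights (p,); X: data (N, p); y: responses (N,).
    Returns average leave-one-out loss plus the L2 weight penalty.
    """
    # Equation (5): weighted distances d_w(x_i, x_j)
    diffs = np.abs(X[:, None, :] - X[None, :, :])   # (N, N, p)
    d = (diffs * w**2).sum(axis=2)                  # (N, N)
    # Equation (7): kernel k(z) = exp(-z / sigma)
    K = np.exp(-d / sigma)
    np.fill_diagonal(K, 0.0)                        # enforce p_ii = 0
    # Equation (6): reference-point probabilities P_ij
    P = K / K.sum(axis=1, keepdims=True)
    # Equation (8): leave-one-out loss with absolute deviation
    L = np.abs(y[:, None] - y[None, :])
    l_i = (P * L).sum(axis=1)
    # Equation (9): regularized objective
    return l_i.mean() + lam * (w**2).sum()
```

Feature selection then amounts to minimizing f(w) over w (e.g., by gradient descent) and keeping the features whose optimized weights exceed the chosen threshold.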

2.2. Multi-layer perceptron neural network (MLP-NN)

The artificial neural network (ANN) is a data-based modeling technique used extensively to solve complicated problems such as clustering, classification, pattern recognition, estimation, and prediction. An ANN is made of several calculation blocks, known as artificial neurons, and has a flexible structure that enables the network to model non-linear problems [35].

Among the existing types of ANN, the multi-layer perceptron (MLP) has been employed repeatedly to solve function estimation problems. This type of network consists of three layers: input, hidden, and output layers [36]. The neurons at each layer are connected to the neurons at the next layer using weights with values in the range [−1, 1]. Each neuron in an MLP-NN performs summation and multiplication operations on the inputs, weights, and biases [12,36–38]. Initially, inputs, weights, and biases are combined using equation (10):

S_j = Σ_{i=1}^{n} w_ij I_i + b_j   (10)

In this equation, n is the number of inputs, I_i is the i-th input variable, b_j is the bias for the j-th neuron, and w_ij is the connecting weight between the i-th input and the j-th neuron.

In the next step, an activation function must be applied to the output of equation (10). Different activation functions may be utilized in an MLP-NN, of which the most common types are listed in Table 2.

Table 2
Popular activation functions in MLP-NN.

Function name | Equation
Linear | f_j(S_j) = S_j
Sigmoid | f_j(S_j) = 1 / (1 + e^(−S_j))
Tangent hyperbolic | f_j(S_j) = 2 / (1 + e^(−2 S_j)) − 1

After setting the number of hidden layers and the number of neurons in each layer, the network has to be trained to adjust and update the weights and biases using training algorithms. For this purpose, classic training algorithms are generally applied, such as Levenberg-Marquardt (LM), conjugate gradient (CG), the Newton method (NM), gradient descent (GD), and quasi-Newton (QN). The selection of the training algorithm must be based on the type of problem to be solved. LM has been reported to have superior computational speed among these methods but requires large memory [39]. The GD algorithm, unlike LM, has lower computational speed but requires less memory [39]. Therefore, in problems with a large amount of data, it is better to use the GD algorithm to save memory.

2.3. Optimization algorithms

A review of studies on multilayer perceptrons with different learning algorithms in the oil and gas industry shows that models using meta-heuristic optimization algorithms are more accurate and generalizable than those using classic algorithms [3]. Thus, in this research, the genetic, PSO, and social ski-driver optimization algorithms were used as neural network learning algorithms.

2.3.1. Social ski-driver (SSD)

The novel SSD optimization algorithm was introduced in 2019 by Tharwat and Gabel [31]. This algorithm is based on a random search, which is somewhat reminiscent of a skier moving down a mountain slope. The parameters of this algorithm are as follows:

(1) Ski-driver position (X_i ∈ R^n): This parameter is used to evaluate the objective function at the desired location. The value of n indicates the size of the search space.

(2) Best previous position (P_i): The fitness value of each ski driver is calculated with the fitness function. The calculated value for each agent is compared to its current best, and the better one is stored. This parameter is similar to its counterpart in the particle swarm optimization (PSO) algorithm.

(3) The mean of the global solution (M_i): In this algorithm, like the gray wolf optimization algorithm, the ski drivers pass through the global points and follow the best values, as depicted in Fig. 2.

M_i^t = (X_α + X_β + X_γ) / 3   (11)

where X_α, X_β, X_γ are the three best solutions in the t-th iteration.

(4) The velocity of the ski-driver (V_i): The position of the ski driver is updated by adding its velocity, calculated using the following equations:

X_i^{t+1} = X_i^t + V_i^t   (12)
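The SSD update loop can be made concrete with a compact sketch. It follows Equations (11) and (12), with the velocity rule and the decay of c taken from Equations (13) and (14) in the next subsection. The population size, iteration count, and bounds are illustrative assumptions, and the personal-best bookkeeping is a simplified reading of the published algorithm:

```python
import numpy as np

def ssd_minimize(f, dim, n_agents=20, iters=100, c=2.0, alpha=0.95,
                 lo=-1.0, hi=1.0, seed=0):
    """Minimal social ski-driver sketch, Equations (11)-(14)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, (n_agents, dim))
    P = X.copy()                                # personal bests P_i
    Pf = np.array([f(x) for x in X])            # personal-best values
    for _ in range(iters):
        # Equation (11): mean of the three best solutions found so far
        M = P[np.argsort(Pf)[:3]].mean(axis=0)
        r1, r2 = rng.random(), rng.random()
        # Equation (13): sine/cosine velocity rule
        if r2 <= 0.5:
            V = c * np.sin(r1) * (P - X) + np.sin(r1) * (M - X)
        else:
            V = c * np.cos(r1) * (P - X) + np.cos(r1) * (M - X)
        X = X + V                               # Equation (12)
        c = alpha * c                           # Equation (14)
        fx = np.array([f(x) for x in X])
        better = fx < Pf                        # update personal bests
        P[better], Pf[better] = X[better], fx[better]
    i = int(np.argmin(Pf))
    return P[i], float(Pf[i])
```

In the hybrid MLP-SSD model, each agent's position vector would hold a candidate set of network weights and biases, and f would be the network's estimation error.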


Fig. 2. Demonstration of how particles A and B move toward the average of the three best answers using the SSD algorithm. Particle A moves nonlinearly to A' and then to A'', as does particle B [31].

V_i^{t+1} = c·sin(r_1)·(P_i^t − X_i^t) + sin(r_1)·(M_i^t − X_i^t),  if r_2 ≤ 0.5
V_i^{t+1} = c·cos(r_1)·(P_i^t − X_i^t) + cos(r_1)·(M_i^t − X_i^t),  if r_2 > 0.5   (13)

where V_i is the velocity of X_i and r_1, r_2 are random numbers generated uniformly in the range [0, 1]. P_i is the best solution for the i-th ski-driver, M_i is the mean of the total solution for the whole population, and c is a parameter used to create a balance between exploration and exploitation; its value is calculated by equation (14), in which α has a value between zero and one. Increasing the number of iterations reduces the value of c. Thus, as the iterations proceed, the exploratory behavior of the algorithm decreases and its focus on the found solutions increases.

c^{t+1} = α c^t   (14)

The main purpose of the SSD algorithm is to search the space to find the optimal or near-optimal answer. The number of parameters to be optimized determines the dimension of the search space. In this algorithm, the positions of the particles are first set randomly, and then each position is updated by adding the velocity to the previous position according to equation (12). The velocity of the particles is generated according to equation (13) as a function of the current position of the particle, the best personal position, and the best positions of all particles so far; based on these, the particles move nonlinearly towards the best answer, which is the average of the three best answers found so far. This nonlinear motion increases the ability to explore new solutions in the SSD algorithm [31].

2.3.2. Particle swarm optimization

The PSO algorithm was developed by Kennedy and Eberhart [40], inspired by the social behavior of organisms in groups such as birds and fish. Instead of focusing on a single individual, this algorithm focuses on a population of individuals. Also, in this algorithm, the information of the best position among all particles is shared. Although the convergence rate of this algorithm is high, it is very difficult to escape a local optimum if the PSO gets stuck in it [41].

The flowchart of the PSO algorithm is shown in Fig. 3. The PSO algorithm finds the best solution in the search space using a particle population called a swarm. The algorithm starts by initializing the population positions randomly in the search space (between the lower and upper limits of the decision variables) and initializing the velocities between the lower limit (V_min) and the upper limit (V_max). The specified locations of the particles are stored as their best personal positions (P_b). The positions of all particles are evaluated by the objective function, and the particle with the minimum value of the objective function is selected as the best global position (G_b). In each iteration, the new velocity (V_i(t+1)) of each particle i is defined based on the previous velocity (V_i(t)) and the distance of the current location (x_i(t)) from the best personal and global positions, as shown in equation (15) [42]. Subsequently, the new position of the particle (x_i(t+1)) is calculated from the previous position and the new velocity (Equation (16)). The new locations of the particles are then assessed with the objective function. Iterations continue until the termination conditions are reached. This algorithm is a very suitable method for continuous problems due to its continuous nature.

V_i(t+1) = w V_i(t) + c_1 r_1 (P_bi(t) − x_i(t)) + c_2 r_2 (G_b(t) − x_i(t))   (15)

x_i(t+1) = x_i(t) + V_i(t+1)   (16)

where i = 1, 2, …, n, and n is the number of particles in the swarm; w is the inertia weight (which controls the contribution of the previous velocity) [43,44], for which the range [0.5, 0.9] improves the performance of the PSO algorithm (Martinez and Cao, 2018); c_1 and c_2, which are positive coefficients, are called the personal and collective learning factors, respectively; and r_1, r_2 are random numbers in the range [0, 1] [44,45].

Fig. 3. PSO algorithm flowchart [11,46–48].
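The update loop described above (Equations (15) and (16)) fits in a few lines of NumPy. The sketch below is illustrative only; the parameter values are typical choices within the cited ranges, not the settings used in the paper:

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=100, w=0.7,
                 c1=1.5, c2=1.5, lo=-1.0, hi=1.0, seed=0):
    """Minimal PSO sketch following Equations (15) and (16)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = 0.1 * rng.uniform(-(hi - lo), hi - lo, (n_particles, dim))
    pb = x.copy()                                   # personal bests P_b
    pb_val = np.array([f(p) for p in x])
    gb = pb[np.argmin(pb_val)].copy()               # global best G_b
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Equation (15): inertia + personal pull + collective pull
        v = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        x = x + v                                   # Equation (16)
        val = np.array([f(p) for p in x])
        improved = val < pb_val                     # update P_b and G_b
        pb[improved], pb_val[improved] = x[improved], val[improved]
        gb = pb[np.argmin(pb_val)].copy()
    return gb, float(pb_val.min())
```

As with SSD, the hybrid MLP-PSO model would encode the network's weights and biases in each particle's position and use the estimation error as f.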


2.3.3. Genetic algorithm
The genetic algorithm is a type of evolutionary algorithm introduced by Holland and inspired by biological concepts such as inheritance, mutation, selection, and recombination [49,50]. Fig. 4 shows how the genetic algorithm searches for the optimal solution. As shown in this figure, a genetic algorithm, like other evolutionary algorithms, begins with an initial population of randomly generated chromosomes. Each chromosome is then evaluated with an objective function that assigns it a value called fitness. In the next step, chromosomes participate in the reproduction process (crossover operation) as parent chromosomes according to their fitness: the lower the cost function, the greater a chromosome's chance of reproducing. Next, the mutation operation is applied to produce the next generation of chromosomes. Owing to its random nature, this operation allows the algorithm to escape local optima. If elitism is used, it is guaranteed that the next population will be at least as good as the current one. The crossover and mutation operations regulate the genetic algorithm's exploitation and exploration properties, respectively. The whole process is repeated for subsequent generations until the algorithm's termination criterion is met [51].

2.4. Developing hybrid algorithms

After the structure of the multilayer perceptron neural network is determined, adjusting the values of its weights and biases is of particular importance and strongly affects the accuracy of the developed model. Because meta-heuristic algorithms outperform classical optimization algorithms at training neural networks and finding the optimal values of weights and biases, GA, PSO, and SSD are used here as the neural network learning algorithms.

Fig. 5 shows the steps for building the MLP-GA, MLP-PSO, and MLP-SSD hybrid algorithms. After separating the data into training and testing sets, the designed neural network is trained with the LM optimization algorithm. The weight and bias values of the trained model are then extracted and taken as the best current values of all agents (particles or chromosomes) in the meta-heuristic optimization algorithms. The total number of weights and biases defines the number of decision variables (the dimension of the search space) in these algorithms. Fig. 6 shows how the values of the weights and biases, and their count, are extracted for an MLP-NN with one hidden layer. The optimization algorithms, run for a given number of iterations with a given population size, determine the optimal values of the decision variables (i.e., the weights and biases) that minimize the model's estimation error. Once the optimization stops according to the specified criteria, the resulting optimal weights and biases are placed in the MLP model. Finally, the MLP-GA, MLP-PSO, and MLP-SSD hybrid models are evaluated on the test data.

3. Gathered data

The data collected in this study include petrophysical logs (depth, caliper, corrected gamma ray, photoelectric factor, neutron porosity, density, true resistivity, and compressional wave travel time) and corrected permeability for the Fahlian Formation in one of the southwestern fields of Iran. Given the high heterogeneity of this carbonate formation, which consists of limestone, FMI logs can provide a good permeability profile. Table 3 shows the range of variation of each feature, along with some statistical indicators, for the collected data. Fig. 7 shows the log of each feature over depth in the Fahlian Formation of the studied well. According to this figure, the increase in well diameter, especially in the approximate range of 4400–4420 m, has affected the recorded values. Preprocessing therefore seems necessary to remove outlier data.
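The decision-variable encoding described in Section 2.4 (Fig. 6) — packing every weight matrix and bias vector of the MLP into one flat vector that a meta-heuristic can manipulate, then unpacking it back into the network — can be sketched as follows. The layer sizes are illustrative; the (3, 5) hidden layers match the architecture selected in Section 4.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Create random weight matrices and bias vectors for an MLP."""
    rng = np.random.default_rng(seed)
    weights = [rng.standard_normal((a, b)) for a, b in zip(layer_sizes, layer_sizes[1:])]
    biases = [rng.standard_normal(b) for b in layer_sizes[1:]]
    return weights, biases

def flatten(weights, biases):
    """Pack all weights and biases into one decision-variable vector."""
    return np.concatenate([w.ravel() for w in weights] + [b.ravel() for b in biases])

def unflatten(vec, layer_sizes):
    """Restore weight matrices and bias vectors from the flat vector."""
    weights, biases, i = [], [], 0
    for a, b in zip(layer_sizes, layer_sizes[1:]):
        weights.append(vec[i:i + a * b].reshape(a, b))
        i += a * b
    for b in layer_sizes[1:]:
        biases.append(vec[i:i + b])
        i += b
    return weights, biases

# 4 selected inputs -> hidden layers of 3 and 5 neurons -> 1 output (permeability)
sizes = [4, 3, 5, 1]
w, b = init_mlp(sizes)
vec = flatten(w, b)
# number of decision variables = all weights + all biases
n_vars = sum(a * b for a, b in zip(sizes, sizes[1:])) + sum(sizes[1:])
```

Each optimizer then treats `vec` as one candidate solution; `unflatten` restores the network whenever the objective (the model's estimation error) has to be evaluated.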

4. Results and discussion

Based on the initial evaluations, the values of the harvested logs


are affected by noise. Therefore, the Tukey method was used to remove the outliers: the IQR was calculated from the first and third quartiles, and the upper and lower fences were determined. Accordingly, 339 data points (i.e., 21% of the data) were identified as outliers and removed.
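The Tukey fences described above can be sketched as follows; the 1.5 × IQR multiplier is the conventional choice and is assumed here, since the paper does not state it.

```python
import numpy as np

def tukey_mask(x, k=1.5):
    """True for inliers: points within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x >= q1 - k * iqr) & (x <= q3 + k * iqr)

data = np.array([2.1, 2.3, 2.2, 2.4, 2.5, 9.9])   # 9.9 is an obvious outlier
mask = tukey_mask(data)
clean = data[mask]                                 # outlier removed
```

In practice the mask would be combined across all log columns so that a record flagged in any feature is dropped from the dataset.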
Fig. 8 shows the matrix of Pearson correlation coefficients be-
tween the features. There is no significant relationship between the
corrected gamma and permeability in the studied well. In contrast,
the porosity, density, and compressional travel time of sonic waves
have the highest correlation with permeability. Porosity and
compression wave travel time are directly related, and density is
inversely related to permeability.
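A Pearson correlation matrix like the one in Fig. 8 can be produced with pandas. The column names follow Table 3, but the values below are synthetic, constructed only to reproduce the direct porosity–permeability and inverse density–permeability relationships described above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
nphi = rng.uniform(0.0, 0.25, 200)                        # neutron porosity
df = pd.DataFrame({
    "NPHI": nphi,
    "RHOB": 2.7 - 1.7 * nphi + rng.normal(0, 0.01, 200),  # density falls as porosity rises
    "Perm": 100 * nphi + rng.normal(0, 1.0, 200),         # permeability rises with porosity
})
corr = df.corr(method="pearson")                          # symmetric matrix, diagonal = 1
```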
To perform feature selection among the given variables, the NCA algorithm must be applied to the data. First, however, the data need to be normalized and the optimal lambda value in Equation (9) must be calculated. To map the data into the range [-1, +1], the maximum (Xmax) and minimum (Xmin) of each attribute are identified and the mapped values (Xnorm) are computed using Equation (17). To find the optimal lambda, values between zero and 1.3 were examined with a step of one hundred-thousandth (10^-5), and the RMSE of permeability estimation based on all inputs was determined for each lambda using the k-fold cross-validation algorithm. Based on a sensitivity analysis, the number of folds was set to 10 for this problem. Fig. 9 shows the error value for different lambda values for estimating

Fig. 4. GA optimization flowchart [52,53].
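The mapping of Equation (17) is a standard min–max normalization to [-1, +1]; a direct translation, with illustrative depth and neutron-porosity columns:

```python
import numpy as np

def normalize(x):
    """Map each feature column into [-1, +1] using Eq. (17)."""
    xmin, xmax = x.min(axis=0), x.max(axis=0)
    return 2 * (x - xmin) / (xmax - xmin) - 1

X = np.array([[4318.9, 0.004],
              [4441.3, 0.103],
              [4563.8, 0.227]])   # e.g. depth (m) and neutron porosity (v/v)
Xn = normalize(X)                 # each column now spans exactly [-1, 1]
```

The per-column minimum maps to -1 and the maximum to +1, so features with very different scales (depth in thousands of meters, porosity in fractions) become directly comparable inside the NCA objective.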

Fig. 5. MLP-GA, MLP-PSO and MLP-SSD hybrid algorithm development flowchart.

Fig. 6. Extracting weights and biases in the form of decision variables vectors and the number of these variables for a hidden single-layer MLP-NN in order to optimize their values
with meta-heuristic algorithms.

Table 3
Range of changes of each feature along with some statistical indicators for the data recorded in Fahlian Formation of the studied well.

Statistical Indicators Parameters

Depth (m) HCAL (in) CGR (GAPI) PEF (B/E) NPHI (V/V) RHOB (G/C3) RT (OHMM) DTCO (us/ft) Perm (mD)

Feature Index 1 2 3 4 5 6 7 8 e
Minimum 4318.864 3.886 10.549 0.900 0.004 2.252 5.720 48.775 0.000
1st Quartile 4380.090 5.771 21.232 4.407 0.049 2.378 28.574 54.888 0.493
Mean 4441.317 6.060 27.512 5.004 0.103 2.466 233.971 62.844 24.095
3rd Quartile 4502.544 5.993 31.359 5.842 0.152 2.551 105.995 69.485 35.168
Maximum 4563.770 8.981 82.508 7.096 0.227 2.762 13933.301 78.728 325.332

115
M. Matinkia, R. Hashami, M. Mehrad et al. Petroleum 9 (2023) 108e123

Fig. 7. Profile of depth changes of the values for each parameter in the Fahlian Formation of the studied well.

permeability using all inputs. As can be seen in this figure, the best lambda value for achieving the lowest estimation error is 0.13669. This value was therefore used for lambda in Equation (9) to identify the properties with the greatest effect on permeability.

Xnorm = 2 × (X − Xmin) / (Xmax − Xmin) − 1    (17)

Fig. 10 shows the value obtained for each feature in the permeability estimation using the NCA algorithm. As can be seen in this figure, depth, porosity, true resistivity, and compressional wave travel time have the greatest effect on permeability estimation. These properties were therefore taken as the inputs of the hybrid models for permeability estimation in the modeling stage.

For permeability modeling using the selected features, 80% of the data were designated as training data and 20% as test data. MLP-NNs with different structures were then used to model the training data, which showed that the best structure for modeling permeability with the selected properties is an MLP-NN with two hidden layers of 3 and 5 neurons in the first and second layers, respectively.

Fig. 11 compares the results of three network training algorithms: trainlm, trainscg, and trainrp. The trainlm algorithm is a network training function that updates weight and bias values based on Levenberg-Marquardt optimization. The Levenberg-Marquardt optimizer uses the gradient vector and the Jacobian matrix instead of the Hessian matrix; it is designed to work specifically with loss functions defined as a sum of squared errors. The trainscg algorithm updates weights and biases based on the scaled conjugate gradient (SCG) method, developed by Møller. Unlike conjugate gradient algorithms that require a line search, this algorithm does not perform the search steps linearly, which reduces computational time. The trainrp algorithm
Fig. 8. Matrix diagram of pairwise comparisons of independent and dependent parameters in the studied well after removal of outliers.


Fig. 9. RMSE changes for different lambda options in the NCA feature selection algorithm.

Fig. 11. Comparing different training algorithms in the accuracy of MLP-NN.
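The selected architecture (two hidden layers with 3 and 5 neurons) and the 80/20 split can be sketched with scikit-learn on synthetic data. Note that scikit-learn does not provide a Levenberg-Marquardt trainer, so the L-BFGS solver is substituted here purely for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (500, 4))           # 4 selected inputs, scaled to [-1, 1]
y = X @ np.array([1.0, 2.0, -1.0, 0.5])        # synthetic permeability-like target

# 80% training / 20% test split, as in the paper
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# two hidden layers with 3 and 5 neurons; L-BFGS stands in for Levenberg-Marquardt
mlp = MLPRegressor(hidden_layer_sizes=(3, 5), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0)
mlp.fit(X_tr, y_tr)
score = mlp.score(X_te, y_te)                  # R^2 on the held-out 20%
```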

also updates weights and biases, based on the resilient backpropagation algorithm. This algorithm eliminates the effect of the magnitude of the partial derivative: only the sign of the derivative is used to determine the direction of the weight updates, and its magnitude does not affect them. It should be noted that each of the above algorithms was run 10 consecutive times, and the best results are drawn in this diagram. Taking RMSE as the evaluation criterion, it is clear that the trainlm method has the lowest error rate among the described methods. This method is therefore reliable and was used to implement the MLP neural network algorithm.

The mean square error (MSE) over the epochs of the MLP method with the Levenberg-Marquardt optimizer is shown in Fig. 12. As mentioned earlier, the neural network architecture consists of 2 hidden layers with 3 and 5 neurons. The stopping condition is no improvement of the validation error over 6 consecutive epochs, which terminated the algorithm after 169 epochs. According to the figure, the MSE initially decreased steeply and then leveled off. Overfitting did not occur, and the best validation error with this method is 11.8889.

Fig. 13 shows cross plots of the measured and predicted rock permeability for the training, validation, and test subsets and for all data. The charts indicate a good forecasting trend, except for one outlier point visible in the training step. In the lower half of the permeability range, the y = t line and the estimated values are superimposed; in the upper half, the estimated values fall below the measured values, so underestimation occurs there. The correlation coefficient (R) values for the training, test, and validation steps are 0.99354, 0.99578, and 0.99383, respectively, indicating that the MLP method is acceptable for predicting rock permeability. Because outliers affect the value of R, we also calculated the RMSE, which was 4.0244, 4.3603, and 4.9080 for the training, validation, and test cases, respectively, which also

Fig. 12. Changes of MSE for training, validation and test subsets at each iteration.

Fig. 10. The value obtained using the NCA feature selection algorithm for each of the studied features.


Fig. 13. Crossplots of measured and predicted permeability for training, validation, test and all data.

indicates the reliability of this method.
Fig. 14 illustrates the error histogram of the trained neural network, with 20 bins, for the training, validation, and test steps. As can be seen, the orange line marking the mean error in the middle of the chart is close to zero, and the distribution is approximately normal with a slight right skew. This indicates that the fitting error (the error distribution of this method) is in an appropriate range.

For permeability modeling with the hybrid algorithms, it is necessary to determine the values of the controllable parameters of each optimization algorithm. For this purpose, a sensitivity analysis was performed, the results of which are displayed in Table 4.

The measured and predicted values of rock permeability for the three hybrid models introduced in this paper are shown in Fig. 15 for both the training and test data. The coefficient of determination (R2) provides an alternative to the RMSE (the objective function of the model) for evaluating the accuracy of the

Table 4
Results of sensitivity analysis on the controllable parameters of the optimization algorithms.

Optimization algorithm   Parameter                          Value
SSD                      Population                         120
                         Inertia coefficient                0.98
                         Correction coefficient             2.0
PSO                      Swarm size                         125
                         Cognitive constant                 2.05
                         Social constant                    2.05
                         Inertia weight (damping ratio)     0.95
GA                       Population                         130
                         Selection method                   Roulette wheel
                         Crossover                          Uniform
                         Mutation                           Uniform (p = 0.06)
                         Mutation rate                      0.07
                         Selection pressure (roulette wheel) 2

Fig. 14. Error histogram for obtained results of training, validation and test subsets using simple MLP-NN model.

model prediction. R2 is a statistical criterion that shows the proportion of the variance of the dependent variable that is explained by one or more independent variables in the regression model. If R2 = 1, the permeability of the rock is predicted without error by the selected independent variables; the higher this value, the lower the prediction error. The R2 values on the training and test data for the three hybrid models (MLP-GA, MLP-PSO, and MLP-SSD) are given in Fig. 15. These are the best results of several sets of experiments under 10-fold cross-validation, and they show that all of these methods achieve a high degree of correlation between the measured and predicted values.

It is noteworthy that the introduced hybrid models have high accuracy in predicting the permeability of the rock (its ability to pass fluid). The maximum R2 (0.9982) is obtained by the MLP-SSD algorithm, representing the best predictive performance for rock permeability. The second most accurate model is MLP-PSO with R2 = 0.9915, followed by MLP-GA with the lowest prediction accuracy (R2 = 0.9892) among these hybrid models. In addition, as shown in the figure, there is little difference in accuracy between the training and test data; therefore the models are not over-fitted and are reliable.

The difference between the predicted and measured values of rock permeability for the developed models is shown in Fig. 16. The horizontal axis represents the measured permeability, and the vertical axis represents the difference between the predicted and measured permeability, known as the residual error.
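The coefficient of determination described above can be computed directly from its variance-ratio form:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot: share of target variance explained."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
r2 = r_squared(y_true, y_pred)
```

A perfect prediction gives R² = 1, while predicting the mean for every sample gives R² = 0, matching the interpretation in the text.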

Fig. 15. Measured versus predicted values for rock permeability for three developed hybrid machine-learning models applied to the training and testing data subsets: (a) MLP-GA
train data; (b) MLP-GA test data; (c) MLP-PSO train data; (d) MLP-PSO test data; (e) MLP-SSD train data; and, (f) MLP-SSD test data.


Although this error alone is not a sufficient indicator of the quality of the predictive model, it provides an understanding of the predictive performance of the models for rock permeability. The smallest mean residual error, 0.119, belongs to the MLP-PSO hybrid model, followed by the MLP-SSD and MLP-GA models with mean absolute residual errors of 0.1209 and 0.2108, respectively. In general, the developed hybrid models have acceptable predictive performance for rock permeability, and the amount of noise or irregularity is not significant.

The error histograms of the three hybrid models are shown in Fig. 17. According to the figures, the means of the studied models are close to zero, and their standard deviations are insignificant compared to the maximum recorded value. The normal distributions fitted to the error values of all models appear symmetrical with a slight right skew.

In evaluating forecast accuracy, a more complete picture is gained by comparing several separate statistical measures. In addition to the coefficient of determination R2, the variance accounted for (VAF), the root mean square error (the objective function of the hybrid models), and the performance index (PI) are used to evaluate the models' accuracy. The equations used to calculate these statistical parameters are given in Appendix A.

The accuracy of the proposed models is reflected in small values of RMSE and high values of VAF. Table 5 lists the values of RMSE, VAF, and PI for the models developed in this research. As can be seen, there is no significant difference between the accuracy obtained on the training and test data, so the introduced models are reliable and no over-fitting occurred during the training process.

According to Table 5, the RMSE value of MLP-SSD on the test data is smaller than that of the other hybrid models. In second place is the MLP-PSO method with RMSE = 2.57 and VAF = 99.15, followed by the MLP-GA model with an RMSE of 2.89, which has the maximum absolute value of PI among the proposed methods.

Fig. 18 illustrates the error changes over 10 runs, each with 300 iterations, of the SSD, PSO, and GA optimization algorithms. In terms of average behavior, the GA reaches the desired RMSE value in approximately 200 iterations, while the SSD and PSO algorithms converge in 270 and 300 iterations, respectively. All the studied algorithms are convergent and reach almost the same RMSE after a sufficient number of iterations; the only difference is that the GA requires less computational time than the other methods. The training time of the plain MLP network is significantly shorter than those of the hybrid algorithms. MLP-GA has the shortest training time among the hybrid algorithms, and MLP-PSO performs slightly faster than MLP-SSD. Running the code on a system with 64.0 GB RAM and a 2.4 GHz CPU (2 processors) resulted in calculation times of 28 s, 726 s, and 771 s for the MLP, MLP-PSO, and MLP-SSD algorithms, respectively.
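The statistics reported in Table 5 follow the definitions in Appendix A; a direct implementation (the values below are synthetic):

```python
import numpy as np

def accuracy_stats(y, z):
    """RMSE, VAF and PI as defined in Appendix A (y measured, z predicted)."""
    y, z = np.asarray(y, float), np.asarray(z, float)
    rmse = np.sqrt(np.mean((y - z) ** 2))            # Eq. (A1)
    vaf = (1 - np.var(y - z) / np.var(y)) * 100      # Eq. (A2)
    r = np.corrcoef(y, z)[0, 1]                      # correlation coefficient
    pi = r + vaf / 100 - rmse                        # Eq. (A3)
    return rmse, vaf, pi

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
rmse, vaf, pi = accuracy_stats(y, y + 0.1)           # constant-offset prediction
```

The constant-offset case illustrates why several measures are compared: the error variance is zero, so VAF is a perfect 100 even though the RMSE is nonzero.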

Fig. 16. Relative deviations of measured versus predicted values of rock permeability (mD) for developed machine-learning models: (a) MLP-GA; (b) MLP-PSO; and (c) MLP-SSD.


Fig. 17. Error (mD) histogram with fitted normal distribution (red line) for three hybrid machine-learning models: (a) MLP-PSO; (b) MLP-GA; and (c) MLP-SSD.

5. Conclusions

This study employed intelligent hybrid methods to estimate the permeability of rock from other petrophysical features. A multilayer perceptron network was combined with three heuristic algorithms (SSD, GA, and PSO) to optimize the efficiency and improve the convergence of the networks. Petrophysical logs and data were collected from a well drilled into the Fahlian Formation in one of the Iranian southwestern fields. The outliers in the data were identified and removed, and feature selection was carried out to enhance the accuracy of the network in predicting the permeability of the rock. The results show that all the tested hybrid methods estimate the permeability with high accuracy. The MLP-SSD method, used for the first time in this study for permeability estimation, demonstrated slightly better results than the other hybrid methods, while MLP-GA was the least computationally expensive of the examined networks. Overall, the excellent performance of the hybrid methods in estimating permeability for the case studied here indicates that such algorithms are applicable in highly heterogeneous formations.

6. Limitations and suggestions

The results generated in this study are limited to the Fahlian carbonate formation. Although we expect the model to work efficiently in other zones because of its excellent performance in a highly heterogeneous formation, it is still advised to use the model with care in other formations and rock types.

Table 5
Prediction accuracy statistical measures of permeability (mD) for three developed machine learning models.

Model     Subset   R-square   RMSE (mD)   VAF     PI       AARD
MLP-GA    Test     0.9892     2.89        98.93   0.9151   1.99
          Train    0.9959     1.68        99.59   0.341    1.43
MLP-PSO   Test     0.9915     2.57        99.15   0.5895   0.72
          Train    0.9976     1.27        99.76   0.7187   0.45
MLP-SSD   Test     0.9928     2.37        99.28   0.3879   0.79
          Train    0.9982     1.12        99.81   0.8705   0.37

Code and data availability

The MLP-SSD and MLP-PSO codes developed in MATLAB R2020b to produce this paper are available at https://github.com/mmehrad1986/Hybrid-MLP.
The data used in this study are not available, out of respect for the commitment to maintain the confidentiality of the information.

Fig. 18. The convergence error for (a) MLP-SSD, (b) MLP-PSO, and (c) MLP-GA.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Equations of the statistical parameters

RMSE = [ (1/p) Σ_{r=1}^{p} (y_r − z_r)^2 ]^(1/2)    (A1)

VAF = [1 − var(y_r − z_r) / var(y_r)] × 100    (A2)

PI = r + VAF/100 − RMSE    (A3)

where y and z are the measured and predicted values of the dependent variable, and p is the number of records in the dataset.

References

[1] U. Ahmed, S.F. Crary, G.R. Coates, Permeability estimation: the various sources and their interrelationships, J. Petrol. Technol. 43 (1991) 578–587.
[2] H. Akhundi, M. Ghafoori, G.-R. Lashkaripour, Prediction of shear wave velocity using artificial neural network technique, multiple regression and petrophysical data: a case study in Asmari reservoir (SW Iran), Open J. Geol. (2014).
[3] M.A. Ahmadi, Z. Chen, Comparison of machine learning methods for estimating permeability and porosity of oil reservoirs via petro-physical logs, Petroleum 5 (2019) 271–284.
[4] B. Salimifard, Predicting Permeability from Other Petrophysical Properties, 2015.
[5] J. Kozeny, Über kapillare Leitung des Wassers im Boden, Royal Academy of Science, 1927.
[6] P.C. Carman, Fluid flow through granular beds, Chem. Eng. Res. Des. 75 (1997) S32–S48.
[7] A.M.S. Lala, Modifications to the Kozeny–Carman model to enhance petrophysical relationships, Explor. Geophys. 49 (2018) 553–558.
[8] E.D. Krauss, D.C. Mays, Modification of the Kozeny-Carman equation to quantify formation damage by fines in clean, unconsolidated porous media, SPE Reserv. Eval. Eng. 17 (2014) 466–472.
[9] A. Ismail, Q. Yasin, Q. Du, A.A. Bhatti, A comparative study of empirical, statistical and virtual analysis for the estimation of pore network permeability, J. Nat. Gas Sci. Eng. 45 (2017) 825–839.
[10] P. Tahmasebi, A. Hezarkhani, A hybrid neural networks-fuzzy logic-genetic algorithm for grade estimation, Comput. Geosci. 42 (2012) 18–27.
[11] M. Anemangely, A. Ramezanzadeh, B. Tokhmechi, A. Molaghab, A. Mohammadian, Drilling rate prediction from petrophysical logs and mud logging data using an optimized multilayer perceptron neural network,


J. Geophys. Eng. 15 (2018) 1146–1159.
[12] P. Saikia, R.D. Baruah, S.K. Singh, P.K. Chaudhuri, Artificial neural networks in the domain of reservoir characterization: a review from shallow to deep models, Comput. Geosci. 135 (2020) 104357.
[13] A. Nasseri, M.J. Mohammadzadeh, Evaluating distribution pattern of petrophysical properties and their monitoring under a hybrid intelligent based method in southwest oil field of Iran, Arabian J. Geosci. 10 (2017) 1–15.
[14] M.A. Ahmadi, S. Zendehboudi, A. Lohi, A. Elkamel, I. Chatzis, Reservoir permeability prediction by neural networks combined with hybrid genetic algorithm and particle swarm optimization, Geophys. Prospect. 61 (2013) 582–598.
[15] O. Sudakov, E. Burnaev, D. Koroteev, Driving digital rock towards machine learning: predicting permeability with gradient boosting and deep neural networks, Comput. Geosci. 127 (2019) 91–98.
[16] E. Sfidari, A. Amini, A. Kadkhodaie, B. Ahmadi, Electrofacies clustering and a hybrid intelligent based method for porosity and permeability prediction in the South Pars Gas Field, Persian Gulf, Geopersia 2 (2012) 11–23.
[17] A.K. Mulashani, C. Shen, B.M. Nkurlu, C.N. Mkono, M. Kawamala, Enhanced group method of data handling (GMDH) for permeability prediction based on the modified Levenberg Marquardt technique from well log data, Energy 239 (2022) 121915.
[18] D.A. Otchere, T.O.A. Ganat, R. Gholami, M. Lawal, A novel custom ensemble learning model for an improved reservoir permeability and water saturation prediction, J. Nat. Gas Sci. Eng. 91 (2021) 103962.
[19] Z. Zhang, H. Zhang, J. Li, Z. Cai, Permeability and porosity prediction using logging data in a heterogeneous dolomite reservoir: an integrated approach, J. Nat. Gas Sci. Eng. 86 (2021) 103743.
[20] B. Mathew Nkurlu, C. Shen, S. Asante-Okyere, A.K. Mulashani, J. Chungu, L. Wang, Prediction of permeability using group method of data handling (GMDH) neural network from well log data, Energies 13 (2020) 551.
[21] A.N. Okon, S.E. Adewole, E.M. Uguma, Artificial neural network model for reservoir petrophysical properties: porosity, permeability and water saturation prediction, Model. Earth Syst. Environ. (2020) 1–18.
[22] M.-A. Ahmadi, M.R. Ahmadi, S.M. Hosseini, M. Ebadi, Connectionist model predicts the porosity and permeability of petroleum reservoirs by means of petro-physical logs: application of artificial intelligence, J. Petrol. Sci. Eng. 123 (2014) 183–200.
[23] M. Ali, A. Chawathé, Using artificial intelligence to predict permeability from petrographic data, Comput. Geosci. 26 (2000) 915–925.
[24] C.-H. Chen, Z.-S. Lin, A committee machine with empirical formulas for permeability prediction, Comput. Geosci. 32 (2006) 485–496.
[25] R. Gholami, A. Moradzadeh, S. Maleki, S. Amiri, J. Hanachi, Applications of artificial intelligence methods in prediction of permeability in hydrocarbon reservoirs, J. Petrol. Sci. Eng. 122 (2014) 643–656.
[26] M. Jamialahmadi, F.G. Javadpour, Relationship of permeability, porosity and depth using an artificial neural network, J. Petrol. Sci. Eng. 26 (2000) 235–239.
[27] A.K. Verma, B.A. Cheadle, A. Routray, W.K. Mohanty, L. Mansinha, Porosity and permeability estimation using neural network approach from well log data, SPE Annu. Tech. Conf. Exhib. (2012) 1–6.
[28] A.M. Handhel, Prediction of reservoir permeability from wire logs data using artificial neural networks, Iraqi J. Sci. 50 (2009) 67–74.
[29] S. Elkatatny, M. Mahmoud, Z. Tariq, A. Abdulraheem, New insights into the prediction of heterogeneous carbonate reservoir permeability from well logs using artificial intelligence network, Neural Comput. Appl. 30 (2018) 2673–2683.
[30] A.A. Adeniran, A.R. Adebayo, H.O. Salami, M.O. Yahaya, A. Abdulraheem, A competitive ensemble model for permeability prediction in heterogeneous oil and gas reservoirs, Appl. Comput. Geosci. 1 (2019) 100004.
[31] A. Tharwat, T. Gabel, Parameters optimization of support vector machines for imbalanced data using social ski driver algorithm, Neural Comput. Appl. 32 (2020) 6925–6938.
[32] J.W. Tukey, Exploratory Data Analysis, Addison-Wesley, Reading, Mass., 1977.
[33] H. Osman, M. Ghafari, O. Nierstrasz, The Impact of Feature Selection on Predicting the Number of Bugs, 2018. arXiv preprint arXiv:1807.04486.
[34] J. Goldberger, G.E. Hinton, S. Roweis, R.R. Salakhutdinov, Neighbourhood components analysis, Adv. Neural Inf. Process. Syst. 17 (2004).
[35] M.N. Amar, A.J. Ghahfarokhi, C.S.W. Ng, N. Zeraibi, Optimization of WAG in real geological field using rigorous soft computing techniques and nature-inspired algorithms, J. Petrol. Sci. Eng. (2021) 109038.
[36] M.N. Amar, A.J. Ghahfarokhi, N. Zeraibi, Predicting thermal conductivity of carbon dioxide using group of data-driven models, J. Taiwan Inst. Chem. Eng. 113 (2020) 165–177.
[37] A.A. Heidari, H. Faris, I. Aljarah, S. Mirjalili, An efficient hybrid multilayer perceptron neural network with grasshopper optimization, Soft Comput. 23 (2019) 7941–7958.
[38] A. Hemmati-Sarapardeh, A. Varamesh, M.N. Amar, M.M. Husein, M. Dong, On the evaluation of thermal conductivity of nanofluids using advanced intelligent models, Int. Commun. Heat Mass Tran. 118 (2020) 104825.
[39] Y. Suzuki, S.J. Ovaska, T. Furuhashi, R. Roy, Y. Dote, Soft Computing in Industrial Applications, Springer Science & Business Media, 2012.
[40] J. Kennedy, R. Eberhart, Particle swarm optimization, in: Proc. ICNN'95 - International Conference on Neural Networks, IEEE, 1995, pp. 1942–1948.
[41] Q. Zhang, C. Li, Y. Liu, L. Kang, Fast multi-swarm optimization with Cauchy mutation and crossover operation, in: Int. Symp. Intell. Comput. Appl., Springer, 2007, pp. 344–352.
[42] R. Poli, J. Kennedy, T. Blackwell, Particle swarm optimization, Swarm Intell. 1 (2007) 33–57.
[43] M.E.H. Pedersen, A.J. Chipperfield, Simplifying particle swarm optimization, Appl. Soft Comput. 10 (2010) 618–628.
[44] M.N. Amar, N. Zeraibi, K. Redouane, Bottom hole pressure estimation using hybridization neural networks and grey wolves optimization, Petroleum 4 (2018) 419–429.
[45] C.A.C. Coello, G.B. Lamont, D.A. Van Veldhuizen, Evolutionary Algorithms for Solving Multi-Objective Problems, Springer, 2007. https://doi.org/10.1007/978-0-387-36797-2.
[46] M. Mehrad, M. Bajolvand, A. Ramezanzadeh, J.G. Neycharan, Developing a new rigorous drilling rate prediction model using a machine learning technique, J. Petrol. Sci. Eng. 192 (2020) 107338. https://doi.org/10.1016/j.petrol.2020.107338.
[47] A.R.B. Abad, H. Ghorbani, N. Mohamadian, S. Davoodi, M. Mehrad, S.K. Aghdam, H.R. Nasriani, Robust hybrid machine learning algorithms for gas flow rates prediction through wellhead chokes in gas condensate fields, Fuel 308 (2022) 121872.
[48] N. Mohamadian, H. Ghorbani, D.A. Wood, M. Mehrad, S. Davoodi, S. Rashidi, A. Soleimanian, A.K. Shahvand, A geomechanical approach to casing collapse prediction in oil and gas wells aided by machine learning, J. Petrol. Sci. Eng. 196 (2021) 107811. https://doi.org/10.1016/j.petrol.2020.107811.
[49] J.H. Holland, Adaptation in Natural and Artificial Systems: an Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT Press, 1992.
[50] M.N. Amar, N. Zeraibi, A combined support vector regression with firefly algorithm for prediction of bottom hole pressure, SN Appl. Sci. 2 (2020) 1–12.
[51] M.N. Amar, N. Zeraibi, A. Jahanbani Ghahfarokhi, Applying hybrid support vector regression and genetic algorithm to water alternating CO2 gas EOR, Greenh. Gases Sci. Technol. 10 (2020) 613–630.
[52] S.B. Ashrafi, M. Anemangely, M. Sabah, M.J. Ameri, Application of hybrid artificial neural networks for predicting rate of penetration (ROP): a case study from Marun oil field, J. Petrol. Sci. Eng. 175 (2019). https://doi.org/10.1016/j.petrol.2018.12.013.
[53] M. Sabah, M. Mehrad, S.B. Ashrafi, D.A. Wood, S. Fathi, Hybrid machine learning algorithms to enhance lost-circulation prediction and management in the Marun oil field, J. Petrol. Sci. Eng. 198 (2021) 108125. https://doi.org/10.1016/j.petrol.2020.108125.

