
DC175238 DOI: 10.2118/175238-PA Date: 6-March-17 Stage: Page: 1 Total Pages: 10

Intelligent Tool To Design Drilling, Spacer, Cement Slurry, and Fracturing Fluids by Use of Machine-Learning Algorithms
Arash Shadravan, ReservoirFocus; Mohammadali Tarrahi, Shell; and
Mahmood Amani, Texas A&M University at Qatar

Copyright © 2017 Society of Petroleum Engineers

This paper (SPE 175238) was accepted for presentation at the SPE Kuwait Oil and Gas Show and Conference, Mishref, Kuwait, 11–14 October 2015, and revised for publication. Original manuscript received for review 24 August 2015. Revised manuscript received for review 19 December 2016. Paper peer approved 10 January 2017.

Summary
Design of drilling fluids, spacers, cement slurries, and fracturing fluids is often done by trial and error in the laboratory. In the first step, the required properties of these fluids are categorized, and work then starts from a rough idea of the optimal composition. This first guess usually depends on the experience of the laboratory analyst or fluid engineer. Afterward, the trial-and-error testing starts, and it continues until the fluid design moves closer to the desired fluid criteria. Much of the test data goes unused in this method, and it is difficult for the user to digest a large amount of information. Trial and error could be time-consuming, very costly, and misleading. Today, there is a need for an intelligent system that uses all the available data (big data), even if the data sets are not close to the desired goal, and offers insights for fluid designs.
This paper studies the application of machine-learning-based methodologies, including Gaussian-process regression (GPR) and artificial neural networks (ANNs), to reduce the costs of testing, integrate available experimental data, and eliminate the need for personnel supervision. These practical nonlinear-regression methods empower efficient and fast prediction tools that do not require including complex physics of the underlying system while integrating all available data from different sources. GPR, which is also known as Kriging in geostatistics literature, has exceptional advantages over traditional regression methods because it does not require a known form for the regression function and also has the capability of determining the estimation error and the confidence interval. This machine-learning-based tool offers insights for intelligent fluid design and could reduce costs.

Introduction
Machine-learning methodologies and expert systems have proved to be conducive to digesting big data sets and making intelligent decisions for agile productivity and more-accurate measures. These algorithms have been successfully applied to petroleum-engineering fields such as production forecasting, history matching, reservoir characterization, and enhanced-oil-recovery screening. The proposed work flow in this paper includes the following tools.

Data-Preparation Tool. The tool prepares the training data through guided experiments.
• Quality assurance/quality control for the rheological-properties experiment
  * Calibrate the equipment according to the user manual
  * Operate and assemble the equipment according to the user manual
    - Maintain 150 rev/min for fluid conditioning
    - Maintain and record the data during the experiment in one mode (manual or automatic)
    - Repeat experiments with more than 10% dial-reading fluctuation; inspect the jewel, pivot, rotor, and belt, and then recalibrate
    - Repeat experiments with a modified fluid formulation if fluid-particulate settling is observed
• Determining the most sensitive and effective design parameters (such as temperature and density)
• Design of experiment (DOE) or Latin hypercube sampling to produce the most-representative and wide-enough sample cases (Shadravan et al. 2015a, 2015b).

Prediction Tool. This is an intelligent data-driven prediction tool.
• Build a reliable prediction tool (regressor) through machine-learning algorithms
• Predict (and assess the uncertainty of) any new samples' properties (e.g., rheological properties)

Design Tool. This is an intelligent fluid-design tool.
• Designing the desired fluid system depending on the requested sample properties
• Performing the opposite of the prediction tool and building a regressor to predict corresponding parameters of a suitable fluid system (through machine-learning algorithms)
• Applying inverse problem methods to the prediction tool and obtaining the desired fluid system

In the machine-learning literature, some of the best-known regression methods include linear regression (ordinary least squares), nonlinear regression or curve fitting (e.g., polynomial regression), logistic regression, ridge regression, ANN, Bayesian network, radial-basis-function network, support-vector machine, GPR, and kernel regression (Theodoridis et al. 2010; Duda et al. 2012). In this paper, we choose to investigate the application of the ANN and GPR algorithms to fluid-characteristics estimation and fluid design and build an intelligent multipurpose prediction and design tool. Fig. 1 lists various types of fluid-design experiments that usually are pursued by a trial-and-error method.
There have been some studies regarding the applications of ANN in drilling and fluid design (Osman and Aggour 2003; Gidh et al. 2012). Shadravan and Amani (2012) and Amani et al. (2012) studied the rheological properties of various drilling fluids under extreme high-pressure/high-temperature (HP/HT) conditions. Lee et al. (2012) investigated how various HP/HT rheometers could produce different results under HP/HT conditions, assuming each data set is repeatable and each rheometer is well-calibrated. In this study, one HP/HT rheometer was used to prevent exposing the data sets to such errors. The complexity and variations of downhole conditions, geology, additives (polymers, cement types, weighting agents, crosslinker, or breaker), testing procedures, and experiment schedule could all be taken into account properly to construct a useful laboratory database for an intelligent fluid-design tool. The "mixing" water quality should also not be ignored. Proper equipment calibration and competent laboratory personnel are valuable elements of successful fluid design. Fig. 2 shows traditional trial-and-error fluid design (Bland et al. 1993; Diarra et al. 2016), and Fig. 3 demonstrates the intelligent machine-learning-fluid-design work flow.
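The Latin hypercube sampling option listed for the data-preparation tool can be illustrated with a short sketch. This is not the authors' implementation: the function name `latin_hypercube`, the seed, and the parameter ranges are assumptions chosen for illustration (the ranges loosely follow the axis spans of the plots in this paper), and only the Python standard library is used.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples design points. Each variable's range is split into
    n_samples equal strata, and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random point inside each stratum of the unit interval.
        unit = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(unit)  # decouple the stratum order between variables
        columns.append([low + u * (high - low) for u in unit])
    # Transpose per-variable columns into per-experiment rows.
    return [list(row) for row in zip(*columns)]

# Assumed illustrative ranges (not the authors' design):
# fluid density (lbm/gal), Ingredient A (gal/bbl), temperature (deg F).
bounds = [(5.0, 20.0), (0.0, 5.0), (100.0, 400.0)]
design = latin_hypercube(9, bounds)
for row in design:
    print([round(v, 2) for v in row])
```

Each of the nine rows is one candidate experiment. Because every variable visits each of its nine strata once, a small number of tests still spans the full range of each input.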

2017 SPE Drilling & Completion 1




[Fig. 1 diagram: "Fluid design" branches into drilling fluids, spacers, cement, and fracturing fluids. Experiments shown under these branches: rheology; fluid loss and free fluid; gel strength; stability; thickening time; turbidity; fluid loss; compatibility; dynamic setting; apparent density; resistivity; contamination; mechanical properties; acid solubility; lubricity; spacer/surfactant screening; barite sag; annular-seal durability; interaction analysis; thermal expansion; heat of hydration; permeability.]

Fig. 1—Applications of the intelligent design tool for various fluid-design experiments.

[Fig. 2 diagram; boxes: fluid-design criteria, trial-and-error fluid design, laboratory tests, desired fluid design.]

Fig. 2—Traditional arbitrary and trial-and-error-based fluid-design procedure.

Machine-Learning Methods for Nonlinear Regression
Investigation of the rheological properties of fracturing, drilling, spacer, and cement-slurry fluids is conducted on a daily basis. This paper delineates the benefits of machine-learning algorithms to predict the rheological properties of a fluid as a simple example of the breadth of the application of machine-learning methods. Predicting such properties can be viewed as a regression analysis (a data-driven approach), where the goal is to design the best linear- or nonlinear-regression function (or rule).
In this study, we propose to solve this problem with machine-learning methods that are well-established in computer-science literature. The proposed algorithms eliminate an arbitrary approach in making decisions, and provide accuracy and fast computation. This approach replaces days and weeks of laboratory testing with an intelligent and agile fluid design that could require only a few tests to confirm the proposed compositions (additives) for the desired properties.
In the parametric-regression methods, a known regression function (such as polynomial or exponential) is defined in terms

[Fig. 3 diagram. Data preparation: detecting most-effective parameters; DOE/Latin hypercube sampling; most-representative experimental samples. Intelligent data-driven prediction tool: fluid properties and experiment condition; machine-learning algorithm; fluid characteristics (e.g., rheology) and associated uncertainty. Intelligent fluid-design tool: fluid-design criteria; machine-learning algorithm; confirm laboratory test; desired fluid design. Inverse modeling: candidate-fluid properties; intelligent data-driven prediction tool; fluid characteristics (e.g., rheology).]

Fig. 3—Schematic of the proposed intelligent integrated tools for guided experiment design, characteristic prediction, and fluid-system design.


Table 1—Pros and cons in use of machine learning.

of a finite number of unknown parameters that are estimated from the training data. In general, because the nature of the underlying physical process (e.g., relation of rheological properties and fluid characteristics) is very complex, assuming a known and closed form for the regression function is a demanding task. Therefore, nonparametric-regression methods are proposed that have no (or very little) a priori knowledge of the form of the function that is being estimated. These methods allow for the class of functions that the model can represent to be very broad. Here we implement and apply ANNs and GPR, which are among the most-practical nonparametric-regression methods (Bishop 2006; Theodoridis et al. 2010; Duda et al. 2012).
Recently, data-driven machine learning, initially a field of computer science, has become a widespread and powerful tool for prediction, decision making, and data analysis (Schakel and Mesdag 2014; Mijnarends et al. 2015; Cao et al. 2016). Machine-learning algorithms function by constructing a mathematical model from example or training inputs (e.g., laboratory measurements) to perform data-driven predictions or decisions. Machine-learning methods are successfully used in various applications such as bioinformatics, medical diagnosis, computational finance, marketing, online advertising, and computer vision (Bishop 2006; Rasmussen and Williams 2006; Theodoridis et al. 2010).

Cement and Spacer Design
To accomplish long-term zonal isolation, several factors, including the cement-spacer/drilling-fluid rheological properties and friction-pressure hierarchy, must be carefully engineered. Designing the cement and spacer fluids with the desired rheological properties traditionally is achieved by a trial-and-error approach given the complications and high costs of performing pressurized (especially cement) rheology tests at high temperatures (greater than 220°F). Shadravan et al. (2016) showed three case studies where well integrity was maintained, benefiting from the application of DOE in designing spacer fluids. The proposed model worked well for specific formulations, but changes in some of the ingredients or the range of their concentrations would necessitate developing another DOE model, which could be time-consuming. Limited inventories of the ingredients for the fluid design or their cost could encourage operating companies or service providers to use different ingredients, therefore making DOE models futile. In fact, DOE models lack dynamic learning and progression, and the pursuit of a more-comprehensive approach is vital.
Shadravan and Tarrahi (2016) criticized the trial-and-error fluid-design approach and discussed what is missing in stepping toward intelligent fluid design. They showed how machine learning could save hours of laboratory testing, integrate sizable experimental databases, and optimize the operational cost and delivery. Tarrahi and Shadravan (2016a, 2016b) and Shadravan et al. (2016) demonstrated the application of machine learning and inverse-modeling theory to design a spacer fluid with desired rheological properties for optimal cement/spacer/drilling fluids. The use of machine learning as opposed to DOE can yield improved zonal isolation, but it requires establishing appropriate infrastructures to record and quality control the data. It is evident that there are pros and cons in use of machine learning as well. Table 1 illustrates some of the positive and negative attributes of machine-learning algorithms.

ANNs
ANN models are nonparametric-regression methods that have been used progressively in different areas of science (Rafieisakhaei et al. 2016a, 2016b). Although analyzing data by use of ANNs is usually more complex than traditional regression approaches, ANN models are more flexible and efficient when the main goal is estimation of dependent output values by use of different explanatory independent variables (Theodoridis et al. 2010; Duda et al. 2012; Rafieisakhaei et al. 2016a, 2016b).
In our current experiment, there are three input parameters (fluid density, Ingredient A content, and temperature) and seven output values (300-, 200-, 100-, 60-, 30-, 6-, and 3-rev/min viscosities), and we have nine data points (training samples):

$$\text{Training data: } \{X_i, Y_i\}, \quad i = 1, 2, \ldots, N, \tag{1}$$

$$X_i = i\text{th input vector } (n \times 1), \tag{2}$$

$$Y_i = i\text{th output vector } (m \times 1), \tag{3}$$

where N represents the number of experiments or the number of data points. In this study, n, m, and N are 3, 7, and 9, respectively. Our ANN contains four neurons in the input layer (three inputs plus bias term) and seven neurons in the output layer. We also consider a hidden layer, and its number of neurons will be determined by cross validation. The most commonly used ANN is the multilayer perceptron, where the learning (finding network weights) is performed by minimizing the mean square error of the output with a backpropagation-learning algorithm.
As a preprocessing step, it is required to scale input and output variables to have them in a similar range. Input parameters are transformed to be standard normal variables (zero mean and unit variance), and output values are shifted and scaled to a [–1, 1] interval to be consistent with the range of the activation function (tanh function). Provided the training data, the original data transformation and normalization is performed in the following way:

$$X_{sn} = \frac{X - \bar{X}}{\sigma_X}, \qquad Y_s = \frac{2\,(Y - Y_{\min})}{Y_{\max} - Y_{\min}} - 1, \tag{4}$$

where X̄ and σ_X represent input mean and standard deviation (SD), respectively; X_sn is the standard normal input that is being used in the machine-learning algorithms; Y_min and Y_max are minimum and maximum of the output or target values, respectively; and Y_s is the scaled output that will be used to train the network.
After ANN learning and finding the network weights W and U, we can estimate the output for a new set of inputs (where the activation function is tanh and the hidden layer has q neurons):

$$Y' = \tanh\left( U^{T} \begin{Bmatrix} 1 \\ \tanh\left( W^{T} \begin{Bmatrix} 1 \\ X' \end{Bmatrix} \right) \end{Bmatrix} \right), \tag{5}$$

$$W = \text{input-layer to hidden-layer weights } [(n + 1) \times q], \tag{6}$$
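The scaling of Eq. 4 and the forward pass of Eq. 5 can be sketched in plain Python. This is a hedged illustration rather than the authors' code. The function names, the made-up input rows, and the random (untrained) weights are all assumptions. So are two choices the paper does not state: the population SD is used, and the output minimum and maximum are taken per column. W has size (n + 1) × q and U has size (q + 1) × m.

```python
import math
import random

def standardize(X):
    """Eq. 4, inputs: zero mean and unit SD per column (population SD assumed)."""
    n_rows = len(X)
    cols = list(zip(*X))
    mean = [sum(c) / n_rows for c in cols]
    sd = [math.sqrt(sum((v - mu) ** 2 for v in c) / n_rows)
          for c, mu in zip(cols, mean)]
    Xsn = [[(v - mu) / s for v, mu, s in zip(row, mean, sd)] for row in X]
    return Xsn, mean, sd

def scale_outputs(Y):
    """Eq. 4, outputs: shift and scale each column to [-1, 1] (per column assumed)."""
    cols = list(zip(*Y))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    Ys = [[2 * (v - a) / (b - a) - 1 for v, a, b in zip(row, lo, hi)] for row in Y]
    return Ys, lo, hi

def forward(x, W, U):
    """Eq. 5: Y' = tanh(U^T [1; tanh(W^T [1; X'])])."""
    a = [1.0] + list(x)  # bias term plus n inputs
    hidden = [math.tanh(sum(W[i][j] * a[i] for i in range(len(a))))
              for j in range(len(W[0]))]
    b = [1.0] + hidden   # bias term plus q hidden activations
    return [math.tanh(sum(U[i][k] * b[i] for i in range(len(b))))
            for k in range(len(U[0]))]

rng = random.Random(0)
n, q, m = 3, 6, 7
# Untrained placeholder weights; training by backpropagation is not shown.
W = [[rng.uniform(-1, 1) for _ in range(q)] for _ in range(n + 1)]  # (n+1) x q
U = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(q + 1)]  # (q+1) x m

# Made-up input rows: density (lbm/gal), Ingredient A (gal/bbl), temperature (deg F).
X = [[10.0, 1.0, 100.0], [12.0, 2.0, 200.0], [14.0, 4.0, 400.0]]
Xsn, mean, sd = standardize(X)
y_scaled = forward(Xsn[0], W, U)  # seven scaled outputs
print([round(v, 3) for v in y_scaled])
```

A scaled prediction is mapped back to viscosity by inverting Eq. 4, that is, Y = (Y_s + 1)(Y_max − Y_min)/2 + Y_min.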


[Fig. 4 surface plots. (a) 300 rev/min: viscosity (cp) vs. Ingredient A (gal/bbl) and fluid density (lbm/gal). (b) 60 rev/min at constant fluid density: viscosity (cp) vs. temperature (°F) and Ingredient A (gal/bbl).]

Fig. 4—Nonlinear estimator resulting from ANN (pink asterisks show the training data).

$$U = \text{hidden-layer to output-layer weights } [(q + 1) \times m]. \tag{7}$$

The activation function at neurons is chosen to be a hyperbolic tangent function. We used the backpropagation method in sequential mode to train the network and obtain the weight matrices W and U (Bishop 2006; Theodoridis et al. 2010). To start the learning process, we initialize the weights as random numbers. Estimation error for the training data is not a reliable performance measure for ANN because it can hide overfitting, where the constructed regression tool reproduces the training data almost perfectly but fails to properly estimate the new data (Hawkins 2004). The constructed regression tool should fit the training data properly and at the same time should have the generalization power to predict any unknown and new cases. Therefore, to find the optimal ANN, we should split the data into new training and test data; design the ANN by use of the training data; and perform cross validation on the test data. To obtain the best ANN estimator, we tuned the ANN parameters (maximum error and maximum iteration in the backpropagation-learning algorithm and the number of hidden-layer neurons) through leave-one-out cross validation (LOOCV). Each time we leave out one sample of the training data and design the ANN with the remaining data. We repeat this procedure N times (N is the number of training data). After tuning, the best-obtained average relative cross-validation error is 26% and its corresponding training data error is 13%. According to LOOCV, the optimum number of hidden-layer neurons is six.
For simplicity, we plotted the viscosity with respect to only two input parameters and kept the third one constant. Fig. 4a shows the 300-rev/min viscosity with respect to fluid density and Ingredient A content while temperature is constant (80°F). Fig. 4b illustrates the 60-rev/min viscosity with respect to temperature and Ingredient A content while fluid density is constant (12 lbm/gal). The plotted surface is the nonlinear regression resulting from ANN. With this nonlinear-regression function, we can predict the viscosity of the new sample without performing extra laboratory experiments.

Gaussian Process Regression. GPR is a nonlinear-regression or interpolation technique that models the new estimated (interpolated) values as derived from a Gaussian process determined by a covariance function (Williams and Rasmussen 2006). GPR is a statistical machine-learning method that is also known as Kriging and is well-established in geostatistics and computer-science literature (Deutsch and Journel 1992; Shi and Choi 2011). To estimate the corresponding value for the new input, GPR calculates the weighted average (linear combination) of the known values (training data) depending on the correlation (or proximity) of the new input and the training data governed by the covariance function (Rasmussen and Williams 2006). Provided that the training data and the prior assumptions (e.g., the covariance function) are suitable (Deutsch and Journel 1992), GPR or Kriging is the best linear unbiased estimator. In geostatistics literature, the spatial dependencies are represented by a variogram function instead of a covariance function. In signal-processing literature, this estimation procedure is known as Kalman filtering, which has also been applied to history-matching problems (Tarrahi et al. 2013, 2015). GPR by nature does not need any a priori assumption on the form of regression functions. It is a purely data-driven methodology that extracts the relationship between variables from the information provided by the training data.
It is the inherent correlation and relationship of the parameters and measured properties that guide or train the regression tool to follow a specific trend. Data-driven machine-learning methods are trained by a set of measurements (training samples). Therefore, if there is no correlation or dependency between a set of parameters and the measured outputs, then the machine-learning tool cannot extract any meaningful relationship. Consequently, the designed regression tool will be insensitive to that parameter, so it is recommended to remove the independent or less-effective parameters from the parameter set. Sensitivity analysis is a procedure that can be used to identify the most-effective parameters.
GPR has been applied in a variety of applications, such as environmental science, geoscience, mining, and remote sensing, where it is also known as Kriging (Journel and Huijbregts 1978). Other fields of GPR application are hydrology, neuroscience, real-estate appraisal, and image processing. In geostatistics, a subfield of geoscience and mining, GPR or Kriging is used to populate the rock-property measurements (that is, from core at the wells), such as permeability and porosity, over the whole subsurface resource or reservoir extent.
It should be noted that the underlying physical process of the system (i.e., the governing equations, such as the diffusion equation in case of fluid flow in porous media) is not incorporated directly into the GPR formulation. However, GPR infers the interrelationship of the parameters from the provided training data set. In addition, GPR as a data-driven method requires a sample data set in advance. A covariance model that shapes the correlation between parameters also needs to be determined beforehand, which depends on the physical system under investigation. The form of the covariance function is usually predetermined, such as an exponential, spherical, or Gaussian function (Journel and Huijbregts 1978), and its parameters can be inferred from the given data set in Appendix A.

Methodology
In this study, the fluid was weighted up by adding particulates and not salt. Because there are three input parameters and plotting each of the dial readings with respect to three independent input parameters is not very informative, for illustration purposes, we plotted the dial reading with respect to only two input parameters and kept the third one constant. In Figs. 5 and 6, all the output


[Fig. 5 surface plots of viscosity (cp). Panels at constant T (°F), plotted vs. Ingredient A (gal/bbl) and fluid density (lbm/gal): (a) 3 rev/min, (c) 6 rev/min, (e) 30 rev/min, (g) 60 rev/min. Panels at constant fluid density, plotted vs. temperature (°F) and Ingredient A (gal/bbl): (b) 3 rev/min, (d) 6 rev/min, (f) 30 rev/min, (h) 60 rev/min.]

Fig. 5—Nonlinear estimator resulting from GPR (pink asterisks show the training data).


[Fig. 6 surface plots of viscosity (cp). Panels at constant T (°F), plotted vs. Ingredient A (gal/bbl) and fluid density (lbm/gal): (a) 100 rev/min, (c) 200 rev/min, (e) 300 rev/min. Panels at constant fluid density, plotted vs. temperature (°F) and Ingredient A (gal/bbl): (b) 100 rev/min, (d) 200 rev/min, (f) 300 rev/min.]

Fig. 6—The predictor function to estimate viscosity resulting from GPR method (pink asterisks show the experimental data points).

values are plotted vs. two pairs of the input parameters. For instance, Fig. 6e shows the 300-rev/min viscosity with respect to fluid density and Ingredient A (polymer) content while temperature is constant ( °F). The pressure was held constant at 7,500 psi. Pressure affects the rheological properties of oil-based fluids more than those of water-based fluids, and its effects should not be ignored, especially across abnormally pressured zones. The plotted surface is the nonlinear regression resulting from GPR, and, as shown, it goes through the training data (pink asterisks). With this nonlinear-regression function, we can predict the viscosity of the new sample without performing laboratory experiments. Fig. 7 also shows how the predicted viscosity changes vs. Ingredient A at constant rev/min, temperature, and fluid density.
In addition, to show the estimation error or confidence interval of estimation, Fig. 8 plots the nonlinear estimator of 60-rev/min viscosity vs. Ingredient A content (fluid density and temperature are constant) along with the associated 68% confidence interval, corresponding to the estimated value ± SD.

Conclusions
Trial-and-error fluid design is not always a cost-effective option and lacks long-term vision to accumulate knowledge on fluid systems. There is a need for a tool that is empowered by machine-learning methodologies. This paper investigated an intelligent tool that is equipped with flexible machine-learning algorithms.


[Fig. 7 line plot, "At 100 rev/min (Constant Fluid Density and Temperature)": viscosity (cp) vs. Ingredient A (gal/bbl).]

Fig. 7—2D plot of viscosity vs. Ingredient A at constant temperature (200°F) and fluid density (12 lbm/gal) for a fixed rev/min.

[Fig. 8 line plot, "At 60 rev/min (Constant Fluid Density and Constant Temperature)": viscosity (cp) vs. Ingredient A (gal/bbl).]

Fig. 8—Nonlinear-regression function resulting from GPR and the corresponding confidence interval (the highlighted envelope around the nonlinear-regression curve shows the estimation error), with temperature of 200°F and fluid density of 12 lbm/gal.
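The GPR estimate and its ±SD envelope, as plotted in Fig. 8, can be sketched for a single input in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The data points are made up. The Gaussian covariance, its variance and correlation length, and the small diagonal jitter are fixed by hand rather than inferred from data. The names `gaussian_cov`, `solve`, and `gpr_predict` are hypothetical.

```python
import math

def gaussian_cov(a, b, variance, length):
    """Gaussian (squared-exponential) covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gpr_predict(x_train, y_train, x_new, variance, length, jitter=1e-10):
    """Return (mean, SD) of the GPR estimate at x_new for noise-free data."""
    y_bar = sum(y_train) / len(y_train)  # constant mean removed before weighting
    K = [[gaussian_cov(a, b, variance, length) + (jitter if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    k_star = [gaussian_cov(a, x_new, variance, length) for a in x_train]
    weights = solve(K, k_star)  # weights of the linear combination, K^-1 k*
    mean = y_bar + sum(w * (y - y_bar) for w, y in zip(weights, y_train))
    var = variance - sum(w * k for w, k in zip(weights, k_star))
    return mean, math.sqrt(max(var, 0.0))

# Made-up training data: Ingredient A (gal/bbl) vs. viscosity (cp).
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [8.0, 14.0, 22.0, 27.0, 35.0]

for x_new in [2.0, 2.5, 3.0]:
    mean, sd = gpr_predict(x_train, y_train, x_new, variance=100.0, length=1.0)
    # mean - sd to mean + sd is the 68% band described for Fig. 8.
    print(x_new, round(mean, 3), round(mean - sd, 3), round(mean + sd, 3))
```

At a training input the estimate reproduces the measured value and the SD collapses to nearly zero. Between training inputs the SD grows, which gives the envelope its shape.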

ANN is a fast and wide-ranging regression and prediction approach that can be trained and built in real time as the training data become available. This study also introduced GPR as a reliable regression tool that has shown promising applicability in different computational-fluid designs. The presented results, which are derived from experimental training data, demonstrate the advantage of GPR over ANN, one of the most-popular machine-learning methods. GPR, unlike ANN, honors the training data (i.e., GPR-estimated output is exactly equal to the training-sample output, in contrast to the ANN-estimated output). GPR also offers a prior-free-regression implementation (i.e., it does not need any a priori assumption on the form of regression functions). Furthermore, GPR is unique in the sense of providing the estimation error and uncertainty assessment while predicting the fluid characteristics and designing a fluid system. The accuracy of estimated output is a byproduct of the GPR method, unlike with ANN. The developed intelligent prediction and design tool can extract and relate any set of input parameters to output variables depending on the provided training samples, which can be both experiment and simulation results.

Nomenclature
C = covariance matrix
D = training data set
g = transformation matrix
h = transformed correlation length
k = the dimension of function f
L = scaling matrix
m = number of outputs (dimension of output)
n = number of input properties (dimension of input)
N = number of training samples
N(μ, Σ) = Gaussian distribution with mean μ and covariance Σ
p = power parameter
p(f) = probability of estimation function f
q = number of neurons in hidden layer
R = rotation matrix
U = hidden layer weights for neural networks
v = total variance of training-data output
W = input-layer weights for neural networks
X = input vector
X_i = input vector for ith sample
X_sn = normalized input vector
X̄ = input mean
y_i = output value for ith sample
Y = output vector
Y_s = standardized output vector
c = covariance function
θ = rotation angle
σ_X = input SD

Acknowledgment
The authors would like to thank the management of Reservoir Focus for the permission to publish this work.

References
Amani, M., Al-Jubouri, M., and Shadravan, A. 2012. Comparative Study of Using Oil-Based Mud Versus Water-Based Mud in HPHT Fields. Adv. Petrol. Explor. Develop. 4 (2): 18–27. https://doi.org/10.3968/j.aped.1925543820120402.987.
Bishop, C. M. 2006. Pattern Recognition and Machine Learning. New York City: Springer.
Bland, R. G., Clapper, D. K., Fleming, N. M. et al. 1993. Biodegradation and Drilling Fluid Chemicals. Presented at the SPE/IADC Drilling Conference, Amsterdam, 22–25 February. SPE-25754-MS. https://doi.org/10.2118/25754-MS.
Cao, Q., Banerjee, R., Gupta, S. et al. 2016. Data Driven Production Forecasting Using Machine Learning. Presented at the SPE Argentina Exploration and Production of Unconventional Resources Symposium, Buenos Aires, 1–3 June. SPE-180984-MS. https://doi.org/10.2118/180984-MS.
D’Orangeville, C. and Lasenby, A. N. 2003. Geometric Algebra for Physicists. Cambridge, UK: Cambridge University Press.
Deutsch, C. V. and Journel, A. G. 1992. Geostatistical Software Library and User’s Guide. New York City: Oxford University Press.
Diarra, R., Carrasquilla, J., Phelan, D. et al. 2016. New Approach In Lifting Cement In Highly Depleted And Naturally Fractured Formations. Presented at the IADC/SPE Drilling Conference and Exhibition, Fort Worth, Texas, 1–3 March. SPE-178772-MS. https://doi.org/10.2118/178772-MS.
Duda, R. O., Hart, P. E., and Stork, D. G. 2012. Pattern Classification. Hoboken, New Jersey: John Wiley & Sons.
Gidh, Y. K., Purwanto, A., and Ibrahim, H. 2012. Artificial Neural Network Drilling Parameter Optimization System Improves ROP by Predicting/Managing Bit Wear. Presented at the SPE Intelligent Energy International, Utrecht, The Netherlands, 27–29 March. SPE-149801-MS. https://doi.org/10.2118/149801-MS.


Hawkins, D. M. 2004. The Problem of Overfitting. J. Chem. Inf. Comp. Sci. 44 (1): 1–12. https://doi.org/10.1021/ci0342472.
Journel, A. G. and Huijbregts, C. J. 1978. Mining Geostatistics. London: Academic Press.
Lee, J., Shadravan, A., and Young, S. 2012. Rheological Properties of Invert Emulsion Drilling Fluid Under Extreme HPHT Conditions. Presented at the IADC/SPE Drilling Conference and Exhibition, San Diego, California, 6–8 March. SPE-151413-MS. https://doi.org/10.2118/151413-MS.
Mijnarends, R., Frolov, A., Grishko, F. et al. 2015. Advanced Data-Driven Performance Analysis For Mature Waterfloods. Presented at the SPE Annual Technical Conference and Exhibition, Houston, 28–30 September. SPE-174872-MS. https://doi.org/10.2118/174872-MS.
Osman, E. A. and Aggour, M. A. 2003. Determination of Drilling Mud Density Change with Pressure and Temperature Made Simple and Accurate by ANN. Presented at the Middle East Oil Show, Bahrain, 9–12 June. SPE-81422-MS. https://doi.org/10.2118/81422-MS.
Rafieisakhaei, M., Barazandeh, B., and Tarrahi, M. 2016a. Analysis of Supply and Demand Dynamics to Predict Oil Market Trends: A Case Study of 2015 Price Data. Presented at the SPE/IAEE Hydrocarbon Economics and Evaluation Symposium, Houston, 17–18 May. SPE-179976-MS. https://doi.org/10.2118/179976-MS.
Rafieisakhaei, M., Barazandeh, B., Moosavi, A. et al. 2016b. Modeling Dynamics of the Carbon Market: A System Dynamics Approach on the CO2 Emissions and its Connections to the Oil Market. Oral presentation given at the 34th International Conference of the System Dynamics Society, Delft, The Netherlands, 17–21 July.
Rasmussen, C. E. and Williams, C. K. I. 2006. Gaussian Processes for Machine Learning. Cambridge, Massachusetts: MIT Press.
Schakel, M. D. and Mesdag, P. R. 2014. Fully Data-Driven Quantitative Reservoir Characterization by Broadband Seismic. SEG Technical Program Expanded Abstracts 2014: 2502–2506. https://doi.org/10.1190/segam2014-0671.1.
Shadravan, A. and Amani, M. 2012. HPHT 101: What Every Engineer or Geoscientist Should Know about High Pressure High Temperature Wells. Presented at the SPE Kuwait International Petroleum Conference and Exhibition, Kuwait City, Kuwait, 10–12 December. SPE-163376-MS. https://doi.org/10.2118/163376-MS.
Shadravan, A. and Tarrahi, M. 2016. Machine Learning Leads Cost Effective Intelligent Fluid Design: Fluid Engineering Perspective. Presented at the SPE Bergen One Day Seminar, Bergen, Norway, 20 April. SPE-180033-MS. https://doi.org/10.2118/180033-MS.
Shadravan, A., Narvaez, G., Alegria, A. et al. 2015a. Engineering the Mud-Spacer-Cement Rheological Hierarchy Improves Wellbore Integrity. Presented at the SPE E&P Health, Safety, Security and Environmental Conference-Americas, Denver, 16–18 March. SPE-173534-MS. https://doi.org/10.2118/173534-MS.
Shadravan, A., Tarrahi, M., and Amani, M. 2015b. Intelligent Cement Design: Utilizing Machine Learning Algorithms to Assure Effective Long-term Well Integrity. Presented at the Carbon Management Technology Conference, Sugar Land, Texas, 17–19 November. CMTC-440236-MS. https://doi.org/10.7122/440236-MS.
Shadravan, A., Tarrahi, M., and Amani, M. 2016. Agile Data-Driven Fluid Design: Predicting the Properties of Drilling, Spacer and Cement Slurry Fluids. Oral presentation given at the 2016 AADE Fluids Technical Conference and Exhibition, Houston, 12–13 April. AADE-16-FTCE-07.
Shi, J. Q. and Choi, T. 2011. Gaussian Process Regression Analysis for Functional Data. Boca Raton, Florida: CRC Press.
Subrahmanya, N., Xu, P., El-Bakry, A. et al. 2014. Advanced Machine Learning Methods for Production Data Pattern Recognition. Presented at the SPE Intelligent Energy Conference and Exhibition, Utrecht, The Netherlands, 1–3 April. SPE-167839-MS. https://doi.org/10.2118/167839-MS.
Tarrahi, M. and Shadravan, A. 2016a. Inverse Modeling for Fluid System Characterization through Machine Learning Algorithms. Presented at the SPE Bergen One Day Seminar, Bergen, Norway, 20 April. SPE-180034-MS. https://doi.org/10.2118/180034-MS.
Tarrahi, M. and Shadravan, A. 2016b. Advanced Big Data Analytics Improves HSE Management. Presented at the SPE Bergen One Day Seminar, Bergen, Norway, 20 April. SPE-180032-MS. https://doi.org/10.2118/180032-MS.
Tarrahi, M., Jafarpour, B., and Ghassemi, A. 2013. Assimilation of Microseismic Data into Coupled Flow and Geomechanical Reservoir Models with Ensemble Kalman Filter. Presented at the SPE Annual Technical Conference and Exhibition, New Orleans, 30 September–2 October. SPE-166510-MS. https://doi.org/10.2118/166510-MS.
Tarrahi, M., Jafarpour, B., and Ghassemi, A. 2015. Integration of Microseismic Monitoring Data Into Coupled Flow and Geomechanical Models With Ensemble Kalman Filter. Water Resour. Res. 51 (7): 5177–5197. https://doi.org/10.1002/2014WR016264.
Theodoridis, S., Pikrakis, A., Koutroumbas, K. et al. 2010. Introduction to Pattern Recognition: A Matlab Approach. Burlington, Massachusetts: Academic Press.
Williams, C. K. and Rasmussen, C. E. 2006. Gaussian Processes for Machine Learning. Cambridge, Massachusetts: The MIT Press.

Appendix A—GPR

To perform GPR, we consider one output at a time [i.e., we construct seven GPR models for seven dial-reading outputs (Shi and Choi 2011)]. Therefore, the training data for each GPR-model design is

$$ D = \{X_i, y_i\}, \quad i = 1, 2, \ldots, N, \tag{A-1} $$

where $y_i$ is a scalar and represents one of the dial-reading outputs.

Each nonlinear-regression function is a random sample (parametrized by the input variable) drawn from a joint Gaussian probability function given the training data set $D$. Our goal in GPR is to train a function $f$ from data $D$. A Gaussian process is a set of random estimation functions $f$, each with probability $p(f)$. Given a set of training samples $D$, this can be expressed formally in Bayesian form:

$$ p(f \mid D) = \frac{p(f)\,p(D \mid f)}{p(D)}, \tag{A-2} $$

where $p(f \mid D)$ is the posterior probability of all the regression functions given the training data, $p(f)$ is the prior probability of all the possible regression functions, and $p(D \mid f)$ is the likelihood function. In the context of Bayesian inference, $p(D)$ is only a scaling parameter that does not affect the intended posterior probability density function (PDF) $p(f \mid D)$.

The nonlinear-regression function in GPR is a set of random variables indexed by a continuous parameter (e.g., time, space, or temperature), also called a random function $f(X)$. The assumption in GPR is that any set of regression functions has a jointly Gaussian distribution with zero mean:

$$ p(f \mid X) = \mathcal{N}[0, C(X)] = (2\pi)^{-k/2}\,|C(X)|^{-1/2}\,\exp\!\left(-\tfrac{1}{2} f^{T} C(X)^{-1} f\right), \tag{A-3} $$

where $C(X)$ is the covariance matrix and $k$ represents the dimension of function $f$ with respect to the continuous parameter. Estimating the unknown target value $y_0$ given the set of training outputs $y$ yields a conditional-probability function:

$$ p(y_0 \mid y) = \mathcal{N}(y_0, \sigma_{y_0}). \tag{A-4} $$

The GPR estimation value is the mean of the above conditional PDF (Eq. A-4). In other words, GPR proposes a Gaussian PDF for the new estimated value, which is fully characterized by a mean and an SD value.

As a necessary preprocessing step to avoid scaling issues from different input and output ranges, we apply data normalization and scaling to input and output values. Target or output values are assumed to be standard normal variables (resulting from a Gaussian distribution with zero mean and unit variance). Moreover, the input values (input parameters) are shifted and scaled to a [0,1] interval.
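The preprocessing described above can be illustrated with a minimal Python sketch that uses only the standard library. The function names (`scale_inputs`, `standardize_outputs`) and the sample numbers are hypothetical and are not taken from the study. The sketch assumes the population SD for the output standardization, which the text does not specify, and it assumes every input column takes at least two distinct values so that the range is nonzero.

```python
import statistics


def scale_inputs(rows):
    """Shift and scale each input column to the [0, 1] interval."""
    columns = list(zip(*rows))
    col_min = [min(c) for c in columns]
    col_max = [max(c) for c in columns]
    scaled = [
        [(x - lo) / (hi - lo) for x, lo, hi in zip(row, col_min, col_max)]
        for row in rows
    ]
    return scaled, col_min, col_max


def standardize_outputs(values):
    """Center outputs on zero mean and scale them to unit variance."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population SD
    return [(y - mean) / sd for y in values], mean, sd


# Hypothetical samples: three input properties per experiment.
X_raw = [[100.0, 10.0, 0.5], [200.0, 12.0, 1.0], [300.0, 14.0, 2.0]]
# Hypothetical dial readings for the same three experiments.
y_raw = [35.0, 48.0, 61.0]

X_scaled, x_min, x_max = scale_inputs(X_raw)
y_std, y_mean, y_sd = standardize_outputs(y_raw)
print(X_scaled)
print(y_std)
```

The minimum, maximum, mean, and SD are returned alongside the scaled data so that a new input can be scaled the same way and an estimate can be mapped back to dial-reading units.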


$$ X_s = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \tag{A-5} $$

$$ Y_{sn} = \frac{Y - \bar{Y}}{\sigma_Y}. \tag{A-6} $$

To define the spatial relation of the samples in the input space (in this study, a 3D space), a covariance function is established:

$$ C(X_i, X_j) = v \exp(-h^{p}), \tag{A-7} $$

$$ h = \sqrt{g^{T} L^{-1} g}, \tag{A-8} $$

$$ g = R\,(X_i - X_j), \tag{A-9} $$

$$ L = \begin{bmatrix} l_1 & 0 & 0 \\ 0 & l_2 & 0 \\ 0 & 0 & l_3 \end{bmatrix} = \text{scaling matrix}, \tag{A-10} $$

$$ X_i, X_j = \text{experiment input values } (3 \times 1), \tag{A-11} $$

$$ R_1(\theta_1) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_1 & -\sin\theta_1 \\ 0 & \sin\theta_1 & \cos\theta_1 \end{bmatrix}, \tag{A-12} $$

$$ R_2(\theta_2) = \begin{bmatrix} \cos\theta_2 & 0 & \sin\theta_2 \\ 0 & 1 & 0 \\ -\sin\theta_2 & 0 & \cos\theta_2 \end{bmatrix}, \tag{A-13} $$

$$ R_3(\theta_3) = \begin{bmatrix} \cos\theta_3 & -\sin\theta_3 & 0 \\ \sin\theta_3 & \cos\theta_3 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{A-14} $$

and

$$ R = R_3(\theta_3)\,R_2(\theta_2)\,R_1(\theta_1), \tag{A-15} $$

where $l_1$, $l_2$, and $l_3$ are correlation lengths in three input directions; the matrix $R$ represents the rotation matrix in 3D, which can be calculated by use of the rotation angles around different axes (D’Orangeville and Lasenby 2003); and $v$ represents the variance parameter. In the special case $p = 2$, the resulting covariance is the squared-exponential-covariance function (or Gaussian); for $p = 1$, it reduces to the exponential-covariance function. The parameters involved in defining the covariance function are called hyperparameters. Changing the hyperparameters changes the quality of the GPR estimator. In this study, we tune seven hyperparameters (three correlation lengths, three rotation angles, and the power) through LOOCV to obtain the optimal GPR estimator. In LOOCV, we assume one of the training data points is unknown; use the other eight training data points to design the GPR estimator; estimate the unknown data point; and then compare the estimated value with the true value. This procedure is repeated for all the outputs (seven dial readings) and for all the training data (nine samples). In this cross-validation procedure, the performance measure of the GPR estimator is the average relative error.

The following is the relation between the semivariogram (a popular spatial-correlation representation in geostatistics) and the covariance function:

$$ \gamma(X_i, X_j) = v - C(X_i, X_j). \tag{A-16} $$

To estimate the corresponding output $y_0$ of the new input parameter $X_0$, we use the Gaussian process interpolation (also called simple Kriging):

$$ y_0 = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}^{T} C_{nn}^{-1} C_{nu} = \sum_{i=1}^{N} \lambda_i y_i, \tag{A-17} $$

$$ C_{nn} = \begin{bmatrix} C(X_1, X_1) & \cdots & C(X_1, X_N) \\ \vdots & \ddots & \vdots \\ C(X_N, X_1) & \cdots & C(X_N, X_N) \end{bmatrix} = \text{covariance matrix of training data}, \tag{A-18} $$

$$ C_{nu} = \begin{bmatrix} C(X_1, X_0) \\ \vdots \\ C(X_N, X_0) \end{bmatrix} = \text{cross covariance of training data and new input}. \tag{A-19} $$

The estimated output is a linear combination of the training outputs, where the coefficients $\lambda_i$ are determined by the spatial configuration of the input (with respect to the training data) and the covariance-function shape. One of the unique advantages of GPR compared with ANN is the ability to obtain the estimation error. GPR not only provides the estimated output at the new input value, but it also calculates the associated error or variance. With this capability, we can specify how good each estimation is, or determine the confidence of each estimation, and decide whether we need to perform new laboratory experiments.

$$ \text{Estimation variance of } y_0 = v - C_{nu}^{T} C_{nn}^{-1} C_{nu}. \tag{A-20} $$

This provides us with the estimation SD and consequently the confidence interval associated with each estimation. It should be noted that GPR is an absolute estimator; that is, estimated values for the training data are equal to true outputs, and unlike ANN, the estimation error for training data is zero. After tuning by use of LOOCV, the average minimum cross-validation error is 20%. We obtained different hyperparameters (different covariance functions) for different dial-reading outputs. For all the output dial readings, the power parameter $p$ obtained through cross validation is 2, so the preferred covariance function is the squared-exponential-covariance function.

Arash Shadravan is a director at ReservoirFocus in Houston. He manages various projects related to enhancing the economics of upstream operations in the US. Shadravan has previously worked for Schlumberger, Baker Hughes, Occidental Petroleum, and Superior Energy Services. He has been a journal reviewer and technical contributor as a committee member in SPE conferences and workshops. Shadravan has published more than 30 peer-reviewed journal and technical conference papers related to the application of machine learning in production optimization, cementing, drilling fluids, and underbalanced drilling. He has been an SPE member since 2006. Shadravan holds a master’s degree in petroleum engineering from Texas A&M University and a bachelor’s degree in petroleum engineering.

Mohammadali Tarrahi is a senior reservoir engineer in the Subsurface Modeling and Optimization Department at Shell Global Solutions in Houston. He has been part of several research and development groups at Shell, Occidental Petroleum Corporation, and Microseismic. Tarrahi is the author of more than 30 peer-reviewed-journal and technical-conference papers regarding numerical reservoir simulation, geostatistics, reservoir characterization, uncertainty assessment, big-data analytics, and statistical machine learning. He holds a PhD degree in petroleum engineering from Texas A&M University, a master’s degree in petroleum engineering, and bachelor’s degrees in electrical engineering and petroleum engineering.

Mahmood Amani is an associate professor of petroleum engineering at Texas A&M University at Qatar (TAMUQ), and has been on the faculty of Texas A&M University for more than 13 years. He moved from the main campus of Texas A&M University in College Station, Texas, to Qatar to help start the Petroleum Engineering Program at TAMUQ. Amani is the founding


faculty member and the first program coordinator and chair of the Petroleum Engineering Program at TAMUQ. He considers his role in establishing the petroleum-engineering program in Qatar as his most-remarkable professional achievement. Before joining academia as a faculty member focused on petroleum engineering, Amani worked as a research scientist with the Texaco Exploration and Production Technology Department in Houston. He was awarded the 2016 SPE Middle East Regional Distinguished Achievement Award for Petroleum Engineering Faculty as well as the 2013 SPE Middle East Regional Service Award, and he holds two US patents. Amani holds a bachelor’s degree in mechanical engineering from Wichita State University, a master’s degree in natural-gas engineering from Texas A&M University-Kingsville, and a PhD degree in petroleum engineering from Texas A&M University.
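As a supplementary illustration of Appendix A, the simple-Kriging estimate (Eq. A-17), its estimation variance (Eq. A-20), and the LOOCV loop can be sketched in dependency-free Python. This is a minimal sketch under stated assumptions, not the authors' implementation. The rotation matrix $R$ is taken as the identity (all rotation angles zero), the outputs are assumed to be already standardized so that the zero-mean assumption holds, and the LOOCV score is the mean absolute error rather than the average relative error used in the study, because relative error is ill-defined for standardized outputs near zero. All function names, data values, and hyperparameter values are hypothetical.

```python
import math


def covariance(xi, xj, lengths, v=1.0, p=2.0):
    """C = v * exp(-h**p), with h = sqrt(g^T L^-1 g) and R = identity (assumed)."""
    h = math.sqrt(sum((a - b) ** 2 / l for a, b, l in zip(xi, xj, lengths)))
    return v * math.exp(-(h ** p))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gpr_predict(X, y, x0, lengths, v=1.0, p=2.0):
    """Return (estimate, variance) at x0 following Eqs. A-17 and A-20."""
    C_nn = [[covariance(a, b, lengths, v, p) for b in X] for a in X]
    C_nu = [covariance(a, x0, lengths, v, p) for a in X]
    lam = solve(C_nn, C_nu)  # weights lambda = C_nn^-1 C_nu
    estimate = sum(w * yi for w, yi in zip(lam, y))
    variance = v - sum(w * c for w, c in zip(lam, C_nu))
    return estimate, variance


def loocv_error(X, y, lengths, v=1.0, p=2.0):
    """Mean absolute leave-one-out error (assumed metric, see lead-in)."""
    errors = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        est, _ = gpr_predict(X_train, y_train, X[i], lengths, v, p)
        errors.append(abs(est - y[i]))
    return sum(errors) / len(errors)


# Hypothetical inputs already scaled to [0, 1] and standardized outputs.
X = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.5], [0.0, 1.0, 1.0],
     [0.5, 0.5, 0.2], [1.0, 1.0, 0.8]]
y = [-1.2, 0.3, 0.8, -0.4, 0.5]
lengths = [0.5, 0.5, 0.5]  # hypothetical correlation lengths l1, l2, l3

print(gpr_predict(X, y, [0.4, 0.6, 0.5], lengths))
print(loocv_error(X, y, lengths))
```

Hyperparameter tuning as described in the appendix would amount to evaluating `loocv_error` over candidate correlation lengths and powers (and, in the full method, rotation angles) and keeping the combination with the lowest error.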
