URTeC: 343

Cloud Based ROP Prediction and Optimization in Real-Time Using Supervised Machine Learning
Kriti Singh*, Sai Sharan Yalamarty, Mohammadreza Kamyab, and Curtis Cheatham,

Downloaded from http://onepetro.org/URTECONF/proceedings-pdf/19URTC/2-19URTC/D023S041R002/1138566/urtec-2019-343-ms.pdf by University of Michigan user on 15 March 2023


Corva AI.
Copyright 2019, Unconventional Resources Technology Conference (URTeC) DOI 10.15530/urtec-2019-343

This paper was prepared for presentation at the Unconventional Resources Technology Conference held in Denver, Colorado, USA,
22-24 July 2019.

The URTeC Technical Program Committee accepted this presentation on the basis of information contained in an abstract submitted
by the author(s). The contents of this paper have not been reviewed by URTeC and URTeC does not warrant the accuracy, reliability,
or timeliness of any information herein. All information is the responsibility of, and, is subject to corrections by the author(s). Any
person or entity that relies on any information obtained from this paper does so at their own risk. The information herein does not
necessarily reflect any position of URTeC. Any reproduction, distribution, or storage of any part of this paper by anyone other than the
author without the written consent of URTeC is prohibited.

Abstract
A supervised machine learning model for rate of penetration (ROP) prediction was developed that is
efficient for use with real-time data. Once ROP can be predicted with a sufficiently high and consistent
degree of accuracy, drilling parameters such as differential pressure, flow-rate, and rotary speed can be
swept to determine an optimum ROP several times during the drilling of a stand of pipe.

Eight different types of machine learning models were trained using a population of 50 horizontal wells in
the Permian Basin. Fifteen drilling parameters, or input features in machine learning terms, were proposed
for this development. They included surface torque, flow rate, hydraulic parameters and the three mentioned
above. A new technique using lagged features allows data from the well not directly linked to the current
operation to act as a proxy for formation properties. This lagging is believed to account for formation
changes and bit wear. A fully automated pipeline was developed that fetches WITSML data, qualifies it
and then stores it in a structured format in the AWS cloud. The predictive/optimization model is also cloud
based and can be used anywhere the user has internet access.

The various machine learning models were compared using leave-one-out-cross-validation by training on
45 of the wells and blind testing on the remaining five. Mean Absolute Percentage Error (MAPE) was used
to compare the models because it allows a percentage estimate of ROP prediction accuracy. The winning
model was multivariate adaptive spline regression because it is fast enough to keep up with real time
drilling, provides sufficiently accurate predictions, and offers clear interpretability. Average MAPE was
13% and was consistent across wells with widely varying ROPs. This bolstered our confidence in the
machine learning model to predict ROP in real time and optimize the key-controllable parameters.

Introduction
ROP optimization has long been a quest in reducing the cost of drilling wells. This has become more acute
in the long lateral sections of unconventional reservoir wells. The drilling industry is moving towards the
automated optimization of key controllable parameters such as differential pressure, weight-on-bit, rotary
speed, and flow rate. However, simply drilling faster is only part of the solution. There is the often discussed
“sweet spot” where ROP is maximized and lateral, axial, and torsional vibrations are minimized. Directional
control must be maintained to minimize correcting slides or tool command sets. Drilling efficiency
parameters such as Mechanical Specific Energy (MSE), and depth of cut must be kept in check to protect
PDC bit life. Other undesirables such as lost circulation, wellbore instability, and inefficient hole cleaning
should be accounted for to avoid stuck and lost-in-hole situations.

This work presents the first step in achieving that long term vision by predicting and optimizing ROP from
an automated fetch of streaming and static drilling data relying primarily on statistical and machine learning
models.

A literature search shows that several thousand papers have been written on drilling and ROP
optimization. Some of the most notable models are Bingham1 (1964), Bourgoyne and Young2 (1974), and
Mothari3 (2010). About 80 papers were written in the last decade, with about a quarter of them moving from
empirical models to statistical and machine learning approaches. The empirical models are usually designed
for specific drilling conditions and require frequent calibration. This has motivated recent work on
data-driven machine learning and statistical models for ROP optimization.
Hegde et al.4 (2015) proposed using trees, bagging and random forests to predict ROP. They also wrote
follow-up papers on the use of hybrid models. Hegde et al.5,6 (2018) continued with optimization algorithms
for real-time drilling optimization. The use of deep learning networks was proposed by Li and Samuel7
(2019) to predict ROP ahead of the bit in real-time. This work continues in that direction by testing the
effectiveness of eight machine learning techniques for predicting and optimizing ROP in lateral sections of
unconventional horizontal wells.

Permian Basin Unconventional Wells

Figure 1 The trajectory profile of a typical unconventional well. It has four sections: 1) vertical, 2) nudge, 3) build-to-horizontal, and 4) lateral. A
nudge is present in about 60% of the wells in the Permian Basin. When drilling with a PDM, directional objectives are achieved in slide footage
shown in red. The motor is rotated in a more standard drilling sense otherwise, shown in blue. This work considers predicting/optimizing only in
the sections that are rotary drilled.

Roughly 4,000 horizontal wells have been drilled in about half a dozen reservoirs in Texas and New Mexico
in an area collectively called the Permian Basin. These wells are remarkably similar in trajectory design as
compared to other areas. Lateral sections are getting longer as time progresses with 10,000 feet currently
being a good average and some operators pushing to 13,000 feet. This makes the Permian an ideal area to
test new drilling analysis techniques. The trajectory design of these wells (shown in Figure 1) has four
sections:

1) Surface vertical
2) “S” shaped trajectory nudge (in 60% of wells)
3) Build to horizontal – with build rates between 6 and 16 deg/100ft
4) 8,000 to 12,000 ft lateral in a Permian shale group

The wells are drilled with both rotary steerable systems (RSS’s) and positive displacement motors (PDM’s)
with a bent housing. Motors are used in the majority of the runs in the wells but that percentage appears to
be dropping. Directional control is achieved with a PDM by holding the drill string stationary and orienting
the bent housing in a particular direction in what is called slide drilling. Nearly all of the build to horizontal
(section 3) is accomplished with slide drilling while less than 15% slide drilling is typically used in sections
2 and 4. The rest of the drilling in these two sections is rotary drilling where the string is rotated from the
surface with an RPM assist from the PDM to the drill bit. The red coding in Figure 1 shows typical slide
drilling while the blue coding is the rotary drilling. Discriminating between slide and rotary drilling is fairly
simple: slide drilling has zero string rotation, or zero RPM. Lateral sections for this work are defined as well
footage once a well inclination of 88 degrees is achieved.
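
The two rules stated above (zero string RPM indicates sliding, and an inclination of at least 88 degrees marks the lateral) can be sketched as a simple sample filter. This is only an illustration: the field names `srpm` and `inclination_deg` and the small RPM tolerance are assumptions, not part of the original workflow.

```python
# Hypothetical record layout: each sample is a dict holding surface RPM and
# well inclination. Field names and the RPM tolerance are assumptions.
LATERAL_INCLINATION_DEG = 88.0  # lateral threshold stated in the text
RPM_TOLERANCE = 0.5             # assumed noise floor for "zero" RPM


def is_rotary_lateral(sample):
    """Return True for rotary-drilled samples inside the lateral section."""
    rotating = sample["srpm"] > RPM_TOLERANCE
    in_lateral = sample["inclination_deg"] >= LATERAL_INCLINATION_DEG
    return rotating and in_lateral


samples = [
    {"md": 9500.0, "srpm": 0.0, "inclination_deg": 45.0},    # slide, build
    {"md": 11200.0, "srpm": 0.0, "inclination_deg": 89.5},   # slide, lateral
    {"md": 11300.0, "srpm": 80.0, "inclination_deg": 90.1},  # rotary, lateral
]
rotary_lateral = [s for s in samples if is_rotary_lateral(s)]
```

Only the last sample passes both checks, which matches the scope of work described in the next paragraph.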

The scope of work for evaluating the machine learning techniques is the rotary drilling in the laterals
(section 4) of fifty wells. The main objective in lateral drilling is to achieve optimum ROP, while the
objective of slides is directional control. Thus the lateral sections of these Permian unconventional wells
are ideal for this work. It will be useful to extend these techniques to other sections of the well in the future.

Drilling Data Preparations

The drilling data gathered from these wells was a timestamped 16-parameter subset of the standard WITSML data
collected by the electronic data recorder (EDR) providers. These include six direct
measurements: STQ – surface rotary torque, SRPM – surface rotations per minute, SPPA – standpipe
pressure (usually averaged), HKLD – hookload, FLOWI – mud flow rate, and MD – measured depth;
five computational parameters: ROP – rate of penetration, SWOB – surface (calculated) weight-on-bit,
DPRES – differential surface pressure, MSE – mechanical specific energy, and HSI – bit hydraulic
horsepower per square inch; and finally five parameters that vary periodically: MW – mud density or weight,
PV – mud plastic viscosity, YP – mud yield point, BS – bit size, and TFA – bit total flow area.

Both streaming and static data were fetched using WITSML in an automated process from a cloud
environment. Static data typically includes the bit and bottomhole assembly (BHA) descriptions from which
the static parameters are extracted. Streaming drilling data contains various problems or faults. This data
was cleansed by removing missing values, outliers, and spikes, including the ROP
anomalies that occur at pipe connections. The streaming data was then converted from JavaScript
Object Notation (JSON) into Python pandas DataFrames indexed by timestamp.
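
A minimal sketch of this cleansing step, written with only the Python standard library so it stays self-contained. The JSON field names and the spike threshold `MAX_PLAUSIBLE_ROP` are assumptions made for illustration; the paper does not give its actual thresholds or schema.

```python
import json

# Hypothetical WITSML-derived JSON payload; field names are assumptions.
raw = """[
  {"timestamp": 1, "rop": 120.0, "dpres": 500.0},
  {"timestamp": 2, "rop": null, "dpres": 510.0},
  {"timestamp": 3, "rop": 950.0, "dpres": 505.0},
  {"timestamp": 4, "rop": 125.0, "dpres": 515.0}
]"""

MAX_PLAUSIBLE_ROP = 500.0  # assumed spike threshold in ft/hr, not from the paper


def cleanse(records):
    """Drop records with missing values or implausible ROP spikes."""
    clean = []
    for rec in records:
        if any(value is None for value in rec.values()):
            continue  # missing value
        if not 0.0 < rec["rop"] <= MAX_PLAUSIBLE_ROP:
            continue  # spike or non-drilling sample
        clean.append(rec)
    # Keep the records ordered by timestamp, as in the structured store.
    return sorted(clean, key=lambda rec: rec["timestamp"])


clean_records = cleanse(json.loads(raw))
```

In a pandas-based pipeline the same logic would typically be expressed with `dropna` and a boolean mask; the plain-Python form is used here only to keep the example dependency-free.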

Training and Testing Data Sets

Figure 2 Spearman pair-plots for the 15 drilling parameters or predictive features and ROP. Histograms of the features are shown in the diagonal.
It is difficult to determine relationships much less the degree of predictability visually even with this moderate number of features.

One of the fifty wells in this study was selected to determine the relative importance of the other fifteen
data parameters, or predictive features, for ROP. Spearman correlation and pair-plots were
used to determine the sensitivity of the predictive features. Figure 2 shows the complexity in plotting the
15 parameters versus ROP in a roughly 10,000 ft lateral. It is difficult to determine any specific relationships
visually even if the plots are blown up to poster size. One might be able to see that the mud parameters
(MW, PV, and YP) have linear correlations, as they should. Determining a correlation or linear relationship
between the parameters and ROP (the bottom row) is very difficult. The Spearman calculations result in a
rank correlation coefficient, which measures monotonic association and is used here as a measure of
sensitivity to linear correlation. This is shown in Table 1. This will be the
starting point for testing machine learning models. Since both input features and the corresponding
output value are available as training examples, we used only supervised learning algorithms for model
training. In supervised learning, the goal is to approximate the mapping function so well that the output
variable can be predicted for new input data.
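
For readers unfamiliar with the Spearman coefficient, the sketch below computes it as the Pearson correlation of the ranked values, with tied values sharing their average rank. The sample DPRES and ROP numbers are invented for illustration and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # average of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented sample values: ROP rises monotonically but not linearly with DPRES.
dpres = [300.0, 400.0, 500.0, 600.0, 700.0]
rop = [90.0, 110.0, 150.0, 155.0, 240.0]
rho = spearman(dpres, rop)
```

Because the invented ROP values increase with every increase in DPRES, the coefficient here is at its maximum even though the relationship is not a straight line.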

Table 1: The 15 initial drilling parameters ranked in order of sensitivity to linear correlation after Spearman analysis

Drilling Parameter    Sensitivity
DPRES                 0.69
MSE                   0.66
STQ                   0.32
SRPM                  0.27
MD                    0.23
SPPA                  0.15
SWOB                  0.11
FLOWI                 0.09
MW                    0.09
PV                    0.08
HKLD                  0.07
BS                    0.05
YP                    0.04
HSI                   0.02
TFA                   0.01

Additionally, the drilling parameters or predictive features were evaluated to determine if they were
normally distributed. Usually the goal in machine learning and artificial intelligence work is to strive to
organize data in terms of linearity. Processing data to obtain a linear relationship between cause and effect
helps to reduce variance and computational complexities in further work. It helps in handling outliers and
data that carries too much analysis weight called leverage points or influencing points. If “X” (drilling
parameters) is normally distributed and “Y” (ROP) is also normally distributed, one is more likely to fit a
straight line with most of the data centered in the middle of the line rather than at the end points, aka outliers,
leverage, and influencing points. In this case, the distributions for ROP and HSI were skewed. Log
normalization (Figure 3) was used to make the distributions approximately normal and to retain data from the inter-quartile range.
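
A rough sketch of the two steps named above, a log transform followed by retaining the inter-quartile range. The exact rule the authors applied is not given, so keeping values between the first and third quartiles is an assumption, and the ROP values are invented.

```python
import math
import statistics


def log_transform(values):
    """Natural-log transform; non-positive values are dropped."""
    return [math.log(v) for v in values if v > 0]


def keep_interquartile(values):
    """Keep only values between the first and third quartiles (inclusive)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return [v for v in values if q1 <= v <= q3]


# Invented, right-skewed ROP sample in ft/hr.
rop = [20.0, 40.0, 60.0, 80.0, 100.0, 150.0, 400.0, 900.0]
logged = log_transform(rop)
central = keep_interquartile(logged)
```

The long tail of very high values is compressed by the logarithm before the quartiles are computed, so the retained band is less dominated by the extremes.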

Figure 3 The distribution of the ROP vs HSI data is skewed left, as shown in “a”. It is transformed to a log scale to obtain a normal shape and then
centered on the inner quartiles for a more usable normal relationship in “b”.

Testing Machine Learning Models

The relationship between ROP and the different predictive features is complex. Eight machine learning
algorithms were selected for testing. 45 of the 50 wells were picked at random for this work. The remaining
five were reserved for a blind test once the models were operational. Three linear models were chosen: 1)
multivariate linear regression, 2) least absolute shrinkage and selection operator (LASSO) regression, and 3)
ridge regression. Three non-linear models were tested: 1) random forest/decision trees, 2) artificial neural
networks (ANN) / deep learning, and 3) recurrent neural networks (RNN). The remaining algorithms are
Principal Component Analysis, which is a decomposition technique, and multivariate adaptive spline
regression, or spline regression for short. These algorithms were compared using leave-one-out
cross-validation.
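
One way to read leave-one-out cross-validation in this setting is to hold out one well per fold and train on the rest. The sketch below generates such folds; holding out whole wells (rather than individual samples) is an assumption, and the well identifiers are hypothetical.

```python
def leave_one_well_out(well_ids):
    """Yield (training_wells, held_out_well) pairs, one fold per unique well."""
    unique = sorted(set(well_ids))
    for held_out in unique:
        training = [w for w in unique if w != held_out]
        yield training, held_out


# Hypothetical well labels attached to four data samples.
sample_wells = ["W01", "W02", "W03", "W01"]
folds = list(leave_one_well_out(sample_wells))
```

Each candidate model would be fitted on the training wells of a fold and scored on the held-out well, with the scores averaged across folds.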

Mean absolute percentage error (MAPE) was used to judge the effectiveness of the eight models. It is given
by the following formula:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$

where $n$ is the number of fitted data pairs, $A_t$ is the actual value of ROP and $F_t$ is the forecast value.
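
The MAPE definition can be written as a short function that averages the absolute value of each relative error and expresses the result in percent. The sample ROP values are invented; note that the measure is undefined when an actual value is zero, so such samples must be removed beforehand.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Invented values in ft/hr: every forecast is off by 10% of the actual.
actual_rop = [100.0, 200.0, 50.0]
predicted_rop = [90.0, 220.0, 55.0]
error_pct = mape(actual_rop, predicted_rop)
```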

Adding lagged features for formation information

The best model should be repeatable outside this 50 well population. It should work in different formations
and strengths of rock. Formation or log data is not readily available as operational drilling data. Gamma ray
measurements are basically the only petrophysical data gathered in unconventional wells. No correlation
was seen between lagged gamma ray and ROP. This was not surprising in total shale environments. The
next approach was to add engineering features that used lagged operational drilling data which could act as
a proxy for formation information. Several combinations were tested; lagged MSE, SRPM, DPRES and
STQ were found to significantly improve the predictions. This reduced MAPE by approximately 20% for
our winning model.
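
A lagged feature simply attaches an earlier value of a parameter to the current sample. The sketch below shows the idea for MSE; the lag length of two samples and the record layout are assumptions, since the paper does not state how far back (in time or depth) its lagged features look.

```python
def add_lagged(records, feature, lag):
    """Return copies of records with `lagged_<feature>` taken `lag` rows back."""
    lagged_records = []
    for i, rec in enumerate(records):
        if i < lag:
            continue  # no history available for the first `lag` rows
        new_rec = dict(rec)
        new_rec["lagged_" + feature] = records[i - lag][feature]
        lagged_records.append(new_rec)
    return lagged_records


# Invented MSE values at consecutive measured depths.
records = [
    {"md": 10000 + i, "mse": m}
    for i, m in enumerate([30.0, 32.0, 45.0, 44.0])
]
lagged = add_lagged(records, "mse", lag=2)
```

The jump in MSE between the second and third samples, which might reflect a harder formation, becomes visible to the model through the difference between `mse` and `lagged_mse`.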

Results

The performance of the eight machine learning models on the 45 training wells is detailed in Table 2. Decision
Trees and Random Forest models performed well on training data but poorly on blind data due to
overfitting. Deep learning and artificial neural network models gave acceptable accuracy but were rejected
because their black box nature resulted in poor interpretability. Shrinkage methods such as LASSO and
Ridge Regression gave clear interpretability but reduced accuracy because of the assumed linear
relationship. The winning model was multivariate adaptive spline regression because it is computationally
inexpensive and fast enough to keep up with real time drilling, provides the best accuracy, and offers clear
interpretability.
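
The interpretability of multivariate adaptive spline regression comes from its form: a sum of hinge functions, each of which is zero on one side of a knot and linear on the other. The sketch below illustrates that form for a single feature. The knots and coefficients are made up for illustration and are not the fitted model from this work.

```python
def hinge(x, knot):
    """Hinge basis function max(0, x - knot)."""
    return max(0.0, x - knot)


def mirrored_hinge(x, knot):
    """Mirrored hinge basis function max(0, knot - x)."""
    return max(0.0, knot - x)


def predict_rop(dpres):
    """Piecewise-linear ROP response to DPRES with hypothetical terms."""
    return (
        150.0                                   # intercept, ft/hr
        + 0.20 * hinge(dpres, 400.0)            # gain above 400 psi
        - 0.10 * mirrored_hinge(dpres, 400.0)   # loss below 400 psi
        - 0.15 * hinge(dpres, 800.0)            # flattening above 800 psi
    )


rop_at_600 = predict_rop(600.0)
```

Each term can be read directly as "ROP changes by this many ft/hr per psi once DPRES passes this knot", which is the kind of explanation a neural network does not offer.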

Table 2: The effectiveness of the eight machine learning models in developing an accurate ROP prediction with the 45 training well data sets

Machine Learning Model                 Results
Random Forest / Decision Trees         Poor accuracy, over-fitting
ANN / Deep Learning                    Good accuracy, poor interpretability
Recurrent Neural Network (RNN)         Less accurate than deep learning
Multivariate Linear Regression         Poor accuracy, linear assumption
LASSO Regression, Ridge Regression     Improved accuracy, linear
Principal Component Analysis           No significant improvement
Spline Regression                      Best accuracy, good interpretability

The next step was to predict ROP in the remaining five wells in a blind test. Table 3 shows those results.
While there was a wide range in ROP within the wells and in the average ROP for a well, the error was fairly
consistent at about 13%. This suggests stability in the predictive model. Figures 4 and 5 show the predicted
versus actual ROP for sections of two of the blind test wells.

Table 3: The blind test wells have wide ranging ROP but a uniform mean absolute percent error.

Well Number    Average ROP (ft/hr)    MAPE (%)
A              123                    13
B              104                    12
C              203                    15
D              69                     13
E              116                    10

Figure 4 Predicted vs actual ROP for blind test well A.


Figure 5 Predicted vs actual ROP for blind test well B.

Optimizing ROP

Once a reliable and consistent machine learning model is established, the next step is to use it to predict
better values for the forecast feature. In this case ROP optimization occurs when features can be tested to
yield a higher forecast number for ROP. The model develops a weighting function for the input features.
These weights may change with the data circumstances within a range. Table 4 lists the general weights for the
features. DPRES was identified as the most important feature by a significant margin. While this differs
from other studies and recent writings, it is not surprising. Differential pressure caused by the energy
required for a PDM to rotate and cut rock is an indirect measurement of downhole drilling torque at the bit.
Actual strain gauge measurements of bit torque are considered too expensive to utilize in land
unconventional wells. Their use in offshore wells is considered essential in drilling mechanics work and is
well documented. Other models (Hegde et al.4,5,6) have used SWOB as the key controllable parameter in ROP
optimization. SWOB is well documented as a surrogate for torque, but it is a pure surface measurement
with multiple complexities involving the rig. While differential pressure is derived from a surface
measurement, it does not have the same type of issues and is easier to measure accurately. The key is that
it is affected by downhole action, namely the PDM. It has more characteristics of a bit torque. While the
bonus of an indirect downhole torque “measurement” does not occur when directionally drilling with rotary
steerable systems (RSS), operators are finding drilling efficiencies in using a PDM assist with rotary
steerables. The PDM does not have a bend for directional control since the RSS is handling directional
control, but it does yield DPRES values for downhole torque.

Table 4: Identifying the most important features in optimizing ROP

Feature          Weights (%)
DPRES            52.74
STQ              19.05
Lagged MSE       14.19
SWOB             6.76
Lagged SRPM      4.61
Lagged DPRES     0.9
BS               0.67
SRPM             0.23
MD               0.19
MW               0.19
SPPA             0.16
Lagged STQ       0.15

Once the weighting is established for the input features, an optimization sweep can test changes. Figure 6
shows improvements in ROP forecasts as a function of DPRES. Since there is a weighted relationship
among the input features, optimization works through all the parameters to determine the best ROP. There
is usually a “sweet spot” where any further testing of the features does not improve or actually reduces
ROP. It is dimensionally very difficult to plot this process. Figure 7 shows a flow chart for this.
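
The sweep idea can be sketched for a single feature: hold the other inputs fixed, try a range of candidate values, and keep the one with the highest predicted ROP. The `toy_model` function, its peak near 700 psi, and the candidate grid are all hypothetical stand-ins for the trained model; the actual process sweeps several features, as the flow chart describes.

```python
def sweep(model, base_inputs, feature, candidates):
    """Return (best_value, best_rop) for one feature, others held fixed."""
    best_value, best_rop = None, float("-inf")
    for value in candidates:
        trial = dict(base_inputs)
        trial[feature] = value
        rop = model(trial)
        if rop > best_rop:
            best_value, best_rop = value, rop
    return best_value, best_rop


def toy_model(inputs):
    """Hypothetical ROP response with a sweet spot near 700 psi DPRES."""
    dpres = inputs["dpres"]
    return 200.0 - 0.001 * (dpres - 700.0) ** 2 + 0.1 * inputs["srpm"]


current = {"dpres": 450.0, "srpm": 90.0}
best_dpres, best_rop = sweep(toy_model, current, "dpres", range(400, 901, 50))
```

Beyond the sweet spot the toy response falls again, so the sweep stops recommending higher differential pressure, which mirrors the behaviour described in the text.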

[Plot: Diff vs ROP Response Plot. x-axis: Diff (psi), 300 to 900; y-axis: ROP (ft/hr), 120 to 240]

Figure 6 As the model sweeps values of DPRES from 410 to 825 psi, predicted ROP increases from 145 to 217 ft/hr.

Figure 7 The flow of multiple feature adjustment in predicting an optimal ROP.

Internal Live Test

The 50 well population led to the development of this spline regression machine learning model for first
predicting and then recommending optimization for ROP. This was, by definition, a post-mortem data
analysis process. Real-time WITSML data streams have their idiosyncrasies. The previously discussed data
qualifications and normalizing techniques must mesh with that data stream, process lagged features, predict
ROP, and make recommendations for optimization and not fall behind in a real world drilling environment.
The development showed that the processing was fast, and several live well tests have proved this point.
Results must be presented so that drilling decision makers can effectively evaluate all the concerns in a
proposed change to drilling operations. Figures 8 and 9 show two snapshots from real-time screens.

Figure 8 Snapshots of the real-time ROP prediction and optimization metrics with this machine learning model. The operator can determine how
effectively the model is performing.

Figure 9 A snapshot of the current status for 90 minutes of drilling. The model is 91.2% accurate with optimum values for DPRES of 531 psi,
and 99 RPM resulting in 182 ft/hr for ROP.

Conclusions

A lot of work has been done and documented in the quest to optimize rate of penetration. Most of these
models are deterministic and require significant expertise in adjusting parameters for the optimization to
work effectively. This work evaluated eight machine learning techniques and found that a spline regression
model could maintain a consistent prediction error, with a MAPE of about 13%, in the fairly uniform rotary drilling
of unconventional lateral sections of Permian Basin wells. The novelty of this work is that the model does
not require retraining on every new formation. It is a general model that includes the effect of formation
change and has shown good results in different target formations of the Permian Basin.

The stable predictions can be used to recommend changes in drilling operation parameters or input features
that would lead to an optimum ROP. Drilling data, both post-mortem and real-time, has problems. Several
statistical techniques were used to weight and normalize fifteen possible predictors of ROP that greatly
reduced spikes, outliers and skew in the data. Both the data processing and prediction models are self-
learning and do not require significant worker input. The algorithms are fast and work well in the real-time
drilling environment. Differential pressure caused by downhole motor action was found to be the most
significant input feature which was not surprising in the work since it is a proxy for downhole torque,
though this differs from other work in this field. Work continues with these tools in other unconventional
horizons and sections of horizontal wells.

Acknowledgements

The authors would like to thank the various operators and drilling engineers who gave input and suggestions
during the development of this work. We would also like to thank Bill Lesso for his help in editing this
manuscript.

References

1. Bingham, M.G. 1964. “A New Approach to Interpreting Rock Drillability.” The Oil and
Gas Journal.

2. Bourgoyne Jr, A.T. and Young Jr, F.S. 1974. “A Multiple Regression Approach to Optimal
Drilling and Abnormal Pressure Detection.” SPE Journal, 14(4) pgs 371-384.

3. Mothari, H.R., Hareland, G., & James, J.A. 2010. “Improved Drilling Efficiency
Technique Using Integrated PDM and PDC Bit Parameters.” Journal of Canadian
Petroleum Technology 49(10), pgs 45-52.

4. Hegde, C., Wallace, S., & Gray, K. 2015. “Using Trees, Bagging, and Random Forests to
Predict Rate of Penetration during Drilling” SPE 176792 presented at the Middle East
Intelligent Oil & Gas Conference & Exhibition held in Abu Dhabi 15-16 Sept 2015.

5. Hegde, C., Soares, C., & Gray, K. 2018. “Rate of Penetration (ROP) Modeling Using
Hybrid Models: Deterministic and Machine Learning” URTeC 2896522 presented in
Houston 23-25 July 2018.

6. Hegde, C., Daigle, H., & Gray, K. 2018. “Performance Comparison of Algorithms for
Real-Time Rate-of-Penetration Optimization in Drilling Using Data-Driven Models” SPE
191141 SPE Journal, Oct 2018 pgs 1706-1722.

7. Li, Y. & Samuel, R. 2019. “Prediction of Penetration Rate Ahead of the Bit through Real-
Time Updated Machine Learning Models” SPE/IADC 194105 presented at the
SPE/IADC International Drilling Conference and Exhibition held in The Hague 5-7 March
2019.
