
Electrical Power and Energy Systems 90 (2017) 1–9

Contents lists available at ScienceDirect

Electrical Power and Energy Systems


journal homepage: www.elsevier.com/locate/ijepes

Probabilistic cost prediction for submarine power cable projects


Kristen R. Schell a,⇑, João Claro b,c, Seth D. Guikema a
a Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109, United States
b Faculdade de Engenharia da Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465 Porto, Portugal
c INESC TEC, Campus da FEUP, Rua Dr. Roberto Frias, 4200 Porto, Portugal

Article info

Article history:
Received 26 May 2016
Received in revised form 18 November 2016
Accepted 24 January 2017
Available online 6 February 2017

Keywords:
Cost estimation
Submarine power cables
Probabilistic prediction models
Density forecast
Power system planning
Integrated resource planning
Value-at-risk
Conditional-value-at-risk

Abstract

It is estimated that Europe alone will need to add over 250,000 km of transmission capacity by 2050, if it is to meet renewable energy production goals while maintaining security of supply. Estimating the cost of new transmission infrastructure is difficult, but it is crucial to predict these costs as accurately as possible, given their importance to the energy transition. Transmission capacity expansion plans are often founded on optimistic projections of expansion costs. We present probabilistic predictive models of the cost of submarine power cables, which can be used by policymakers, industry, and academia to better approximate the true cost of transmission expansion plans. The models are both generalizable and well-specified for a variety of submarine applications, across a variety of regions. The best performing statistical learning model has slightly more predictive power than a simpler, linear econometric model. The specific decision context will determine whether the extra data gathering effort for the statistical learning model is worth the additional precision. A case study illustrates that incorporating the uncertainty associated with the cost prediction to calculate risk metrics - value-at-risk and conditional-value-at-risk - provides useful information to the decision-maker about cost variability and extremes.

© 2017 Elsevier Ltd. All rights reserved.

⇑ Corresponding author. E-mail address: krschell@umich.edu (K.R. Schell).

1. Introduction

The first submarine power cable used for electricity transmission was commissioned in 1954, connecting the electric grid of Gotland Island to Sweden’s mainland grid. The cable was rated at 20 megawatts (MW), traversing a submarine route length of 98 kilometers (km) [1]. On the opposite end of the spectrum, the proposed EuroAsia Interconnector would connect the electricity grid of Israel to Greece via Cyprus, with a total rated transmission capacity of 2000 MW, traversing a submarine route length of over 1500 km, at a maximum depth of over 2700 m. The most ambitious to date, this submarine cable project has an estimated cost of 1.5 billion euros [2].

Over the past fifty years, submarine power cables have been employed in diverse applications, including: crossing bays, lakes or rivers; providing supply to islands from mainland grids; sharing supply between islands; interconnecting national grids; providing supply to offshore oil and gas rigs; and, most recently, for offshore wind power connection [1].

Both offshore wind power and national-level grid interconnections - in the seas of Northern Europe and the Mediterranean - figure heavily in the European Union’s (EU) plans for achieving ambitious renewable energy goals. In Germany, the North and Baltic seas alone are seeing the construction and operation of 33 offshore wind farms, totaling 13.5 Gigawatts (GW) of capacity [3,4]. The push for renewable production is not limited to Europe, and so the submarine power cable industry is expected to grow by 45% worldwide in the next decade [5].

1.1. Cost estimation techniques

When project cost estimation is conducted in the planning phase of large infrastructure projects, it is usually done through Unit Cost Estimation (UCE) [6]. This method requires a cost estimate for each unit or process being built, as well as knowledge of the unit’s depreciation rate, salvage value, expected lifetime, and expected repair and maintenance costs. An informative example of this method of cost estimation is illustrated in [7]. As in most engineering economic models, these cost estimates are based on the expected values of the costs of many individual components. This is problematic because it does not account for the uncertainty surrounding each individual input cost, or how the costs relate to each other; positively correlated costs compound uncertainty, but negatively correlated costs can reduce uncertainty. Thus, using expected value inputs does not guarantee an expected value output of a UCE model.
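The effect of correlation between input costs can be illustrated with two standard identities: Var(A + B) = Var(A) + Var(B) + 2ρσ_Aσ_B for a sum of two cost items, and E[QP] = E[Q]E[P] + Cov(Q, P) for a quantity multiplied by a unit price. The following is a minimal Python sketch; the function names and all numbers are hypothetical and are not taken from the paper or from any UCE reference.

```python
import math


def total_cost_sd(sd_a: float, sd_b: float, rho: float) -> float:
    """Standard deviation of the sum of two cost items with correlation rho."""
    variance = sd_a ** 2 + sd_b ** 2 + 2.0 * rho * sd_a * sd_b
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


def expected_product(mean_q: float, mean_p: float,
                     sd_q: float, sd_p: float, rho: float) -> float:
    """Expected value of quantity * unit price: E[Q]E[P] + Cov(Q, P)."""
    return mean_q * mean_p + rho * sd_q * sd_p


if __name__ == "__main__":
    # Illustrative (made-up) standard deviations of two cost items, in M$.
    for rho in (-1.0, 0.0, 1.0):
        print(f"rho = {rho:+.1f}: sd of total = {total_cost_sd(3.0, 4.0, rho):.2f}")

    # Multiplying expected inputs ignores the covariance term.
    naive = 10.0 * 2.0
    correlated = expected_product(10.0, 2.0, 1.0, 0.5, 0.8)
    print(f"product of means = {naive:.2f}, expected product = {correlated:.2f}")
```

With positive correlation the spread of the total grows and the expected product exceeds the product of the expected inputs, which is the sense in which expected value inputs need not yield an expected value output.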

http://dx.doi.org/10.1016/j.ijepes.2017.01.017
0142-0615/© 2017 Elsevier Ltd. All rights reserved.

Industry has attempted to minimize the uncertainty in input values by annually updating unit cost reference books, such as the RS Means [8]. These volumes catalog the material, labor (number of crew needed, daily output expected, labor-hours, etc.), and equipment needs for specific sub-projects. The project engineering team must determine what sub-project tasks will make up the entire project cost (e.g. for large electrical infrastructure projects, such sub-projects could include generation equipment installation, inverter installation, transmission line installation, etc.). Despite the great detail of such cost reference books, these reference volumes are of limited use, as they are proprietary, region-specific point-estimates that are usually developed only for specific deployment options. For example, while RS Means publishes a yearly review of electrical cost data with estimates for underground and overhead transmission line construction work, estimates for the costs associated with submarine power cable projects are not included [8].

Because detailed, and presumably more accurate, data is too often proprietary, researchers have recently studied how to apply statistical methods to infrastructure project cost estimation. With more sophisticated mathematical models, a reasonably accurate cost estimate could be made with less detailed input data.

1.2. Early cost prediction for infrastructure planning

Infrastructure planning is a major undertaking, with just the planning phase typically spanning years. To determine the potential feasibility of an infrastructure project, an estimate of the project cost is needed fairly early in the planning stage, when specific project details are not fully known. However, it is in the early planning stages that management decides whether or not to proceed with a project. Thus, it is imperative to have the cost estimated as early and as accurately as possible.

To this end, several types of infrastructure projects have utilized methods in statistical learning for early cost prediction. These methods include linear regressions, classification trees and artificial neural networks, applied to various infrastructure projects such as metro network planning [9], bridge construction [10], highway projects [11], and road reconstruction [12].

The statistical methods used in these studies have been applied to either small data sets of projects (n = 12–18) [9,11], or to data sets within a specific region [10,12,13]. The results of model-fit from such data sets can seem excellent (with $R^2$ values of greater than 0.9), but are usually too optimistic, as such a model is not generalizable to many other cases.

In this paper, we develop probabilistic models to support early cost prediction for submarine power cable projects. The final models presented in Section 3 are based on a global database of 61 submarine cable projects. This makes the models both generalizable and well-specified for a variety of applications (i.e. submarine power cable projects for island supply, offshore wind farm connection, and grid interconnection, inter alia), across a variety of regions.

1.3. Paper structure

The structure of the paper is as follows. Section 2 describes the global submarine power cable project database. Section 3 elaborates on the statistical learning methods applied to the data set. Section 4 details the predictive accuracy of the final models. Section 5 applies the final models to a case study on submarine power cable replacement for Vancouver Island, Canada.

2. Data

The data is based on a privately maintained submarine power cable project database [14]. At the time of this study, the database contained a record of 296 projects, with each record comprised of various project features. Data collected included project attributes like the power (MW) and voltage (kV) of the submarine cable, manufacturer, armoring material, and insulation type. Of the 36 project attributes sought, 22 were reported with sufficient frequency to enable collection for a large number of projects. The contract cost of the submarine power cable project was also collected for 106 projects.

The data was verified through a significant effort of cross-referencing sources of project details: from company press releases to industry technical reports and presentations. When not reported in the company press release, the maximum depth of the cable route was obtained from bathymetry maps. After the verification of the 296 project records, it was determined that the data for only 61 projects could be reliably substantiated. To reduce the variability in the cost data, only costs reported in press releases from manufacturers were used (e.g. [15]).

2.1. Project attributes

There are many features of a project that can affect its cost. For submarine power cable projects, materials costs, such as the cost of copper or aluminum used in the conductor, are thought to be a large contributor to project cost. Thus, project attributes that represent material cost were collected, such as: the number of conductor cores in each cable (one core for direct current (DC) and three cores for alternating current (AC)); the cross-sectional area of the conductor in square-millimeters; the type of current (AC or DC); the number of cables; the length of the submarine route of the cable(s); the type of conductor (copper, Cu, or aluminum, Al); the voltage (kV) and power (MW) of the cable; and the market price of copper.

Project attributes aimed at approximating the equipment cost of a submarine power cable project included: the cable laying vessel used; the maximum depth along the submarine route; and the application for which the cable will be used (island supply; grid interconnection; offshore wind power; bay/lake/river crossing; or oil and gas offshore platform power supply).

Market conditions for labor costs were approximated by the following project attributes: country of project; manufacturer of the submarine cable; cable customer; contract year; and estimated project length in years.

2.2. Data transformation and variable selection

Finally, the contract cost for each submarine power cable project was converted to real values in 2012 USD [16]. The natural logarithm of the cost is used as the dependent variable in all models presented in Section 3, due to its normality. Modeling the cost data as a Gamma distribution did not improve predictive performance.

As described in Section 3, many different statistical models were tested with different combinations of the 21 aforementioned project attributes. Table 1 details the project attributes, the inclusion of which resulted in the best prediction of project cost. The most useful attributes from this perspective were eight continuous variables and three categorical variables.

3. Model development and selection

The primary research question of this work is to determine the best statistical model for submarine power cable cost prediction. Industry insight on predictors was obtained through conversations

with industry representatives to determine which variables they believe affect the cost of submarine cable projects. Using this insight, along with insights gained from exploratory data analysis, various statistical models with different variable combinations, were fit to the database. The statistical models initially explored included linear models, generalized linear models (GLM) with a gamma cost distribution, principal component regression, generalized additive models (GAM), GAMs with model-based boosting (mboost [17]) for optimized variable selection, bagged regression trees, random forests, and multivariate adaptive regression splines (MARS). All models were trained and tested in the R statistical programming environment, using the packages stats, mgcv, mboost, randomForest, rpart, gbm and earth [18].

Models that performed well based on standard goodness-of-fit statistics and limited predictive tests were selected for further study. The best performing models were then subjected to predictive accuracy tests via Leave-One-Out-Cross-Validation (LOOCV) [19]. The final models are assessed via their predictive errors: mean absolute error (MAE); and mean percent error (MPE).

Table 1
Submarine power cable project database.

Independent variables, continuous (X_i)       Mean (μ_i)   Minimum   Maximum
Submarine cable route [km]                    94.1         2.20      425
Maximum depth [m]                             176          10.0      1,620
Number of cables                              2.4          1.0       9.0
Cumulative length, worldwide [km]             5,672        61.0      11,144
Market price, copper [$2012 USD/ton]          10,576       2,471     13,983
Voltage [kV]                                  253          52.0      600
Project length [years]                        3.43         1.00      6.00
Contract year [year]                          2009         1998      2015

Independent variables, categorical (X_i)      Number of levels   Least frequent                 Most frequent
Country                                       27                 Bahrain + 15 (1) a             Norway + 1 (8)
Application                                   5                  Oil & Gas Power Supply (6)     Island Supply (27)
AC/DC                                         3                  AC/DC (1)                      AC (39)

Dependent variable (Y_i)                      Mean (μ_i)   Minimum   Maximum
Cost [M$2012 USD]                             216.8        15.00     1,240
Ln(Y_i)                                       18.75        16.52     20.94

a The number in parentheses represents the number of times the categorical level (or levels, where indicated by "+" some number) appear(s) in the data.

3.1. Linear models

Three linear models were studied for use as baseline comparison models (Eqs. (1), (2) and (4)). While model interpretability is not the focus of this study, it is essential to compare less complex models with more complex models. If the less complex model can perform almost as well as the more complex, the less complex may suffice in certain decision contexts. This could be especially true in the planning/feasibility phase of a project, when not all the technical project details are known, such as insulation choice, current type, or conductor size. Thus, a model that can make accurate predictions based on the least number of inputs is desirable. Such a model would also be advantageous for researchers and policy analysts, as these two groups do not typically have access to detailed input data.

3.1.1. Null model

The null model is a linear model (Eq. (1)) with an intercept and a normally distributed error term, $\varepsilon$, with zero mean and finite variance, $\sigma^2$, as described by $N(0, \sigma^2)$. With no predictors, the intercept is the unconditional expected mean of the response; as such, it is often used as a baseline comparison to test whether input variables truly improve the predictive accuracy of higher order models.

$$Y = \beta_0 + \varepsilon \qquad (1)$$

If higher order models do not perform better than the null model, then a simple mean cost estimate could be used as the predicted cost of all future submarine power cable projects. However, Table 6 shows that the best predictive models outperform the null.

3.1.2. Linear model

Due to the lack of public data on submarine power cable projects, several consulting and industry agencies have attempted to use limited project data to predict cost solely by submarine route length [20,21]. As these models are based on only a limited number of projects (16 [20]), the idea is tested here with a larger sample size (n = 61). Eq. (2) represents the linear regression of submarine power cable cost based on $X_1$, the length of the submarine route (km), and the error term, $\varepsilon$ ($N(0, \sigma^2)$).

$$Y = \beta_0 + \beta_1 X_1 + \varepsilon \qquad (2)$$

Table 6 shows that this model does not predict submarine power cable costs well.

3.1.3. Econometric learning curve model

A model of submarine power cable cost based on the theory of technological learning curves was also explored. The basic idea behind learning curves is that implementing the project brings valuable lessons-learned, which reduce the cost of the project. A secondary effect is that, as learning helps a firm improve performance and reduce cost, the firm becomes more competitive in the market, in turn increasing overall competition, which itself decreases cost [22,23].

Learning curve models are routinely utilized within larger energy system models in the United States (US) [24] and the European Union (EU) [25], as well as in climate change integrated-assessment models (IAM) [26]. The assumptions of both learning curve specifications, as well as exogenously utilized learning rates, can dramatically affect overall model results. For example, in [25], fast-learning assumptions result in almost five times the GDP gain in the EU when compared to no-learning. It is imperative to use accurately specified learning curve models, as the overall results of dependent bottom-up energy and climate change models are often the basis of both US and European Union-level policy designs.

If an accurate learning curve model could be specified for submarine power cables, the endogenous energy investment decisions in energy system models would be better described. For example, the decisions of when, where and how much capacity of offshore wind governments should invest in would be better informed, having an accurate estimate of the corresponding transmission costs. Similarly, the trade-off of investment in more renewable energy generation capacity versus the investment in an interconnection to a region naturally endowed with renewable energy (such as Norway) would be able to be assessed in large energy system models.

The learning curve model developed for submarine power cables is specified in Eq. (3), and is adapted from the most commonly used specification of the learning curve in energy modeling [27]. The project cost, $C_t$, is based on the historical data for the cumulative length, $CL_t$, of submarine power cable that had been

laid up to year, $t$. Using the database described in Section 2, cumulative length was calculated based on the years 1998 to 2015. In Eq. (3), $\delta_L$ is the shape of the curve representing the learning rate, $\delta_0$ is the cost of the cable at a specific cumulative length, $N_t$ is the number of cables laid in the project, and $L_t$ is the submarine route length. Economies of scale effects can be included using the exponents $\delta_1$ and $\delta_2$; however, it was found that for this data set, the best cost prediction occurs with $\delta_1$ and $\delta_2$ set equal to one. By taking the natural logarithm of Eq. (3), an estimate of the learning rate can be calculated (Eq. (4)), with the error term, $\varepsilon_t$ ($N(0, \sigma_t^2)$).

$$C_t = \delta_0 \, N_t^{\delta_1} \, L_t^{\delta_2} \, CL_t^{\delta_L} \qquad (3)$$

$$\ln C_t = \ln \delta_0 + \delta_1 \ln N_t + \delta_2 \ln L_t + \delta_L \ln CL_t + \varepsilon_t \qquad (4)$$

The standard representation of a learning rate is defined as $LR = 1 - 2^{\delta_L}$, which gives the change in cost after a doubling of cumulative cable length [28]. Under the theoretical assumption that learning-by-doing leads to cost reductions, the learning rate, LR, should be positive [27]. Modeling using Eq. (4) and the submarine power cable database results in $\delta_L = -0.073$, which represents an LR equal to 4.96%. That is, the cost of a submarine power cable project will decrease by 4.96%, per doubling of cumulative length laid.

A well-specified, econometric learning curve model for submarine power cable costs would be advantageous for energy system modelers and policy-makers. Soderholm and Sundqvist [23] show that the learning rate heavily depends on the structure of the econometric model chosen. Models for the cost of wind power with the best fit statistics ($R^2$ values of 0.96) required extensive economic input data, such as country-level "electricity prices, age, structure of existing coal-fired power plants, coal prices, etc." [23]. The data gathering effort for such models often exceeds the time available to decision-makers in the planning/feasibility phase of project consideration. In addition, the learning rate output from such models is often not generalizable enough¹ to be used efficiently by energy system/climate change analysts in their own models. Thus, the most basic learning curve model is analyzed here.

¹ The best-fitting econometric models are country-specific, and do not offer global learning rates [23,28].

3.2. Multivariate adaptive regression splines (MARS) model

The desire for an accurate predictive model based on readily accessible data drives the model development search to statistical learning models. The sophisticated algorithms behind statistical learning models allow for less demanding data gathering efforts, as such models can exploit potential non-linear relationships between the predictors and the dependent variable. Though a well-performing predictive model still depends on highly relevant data, these models typically require less ancillary data than econometric models, such as market price data. Higher-order statistical learning models are oftentimes advantageous in this respect, when such market data is commonly proprietary, and unavailable to energy system modelers.

While many different statistical learning models were tested (see Section 3), the best performing model was a MARS model [29]. The generic MARS model is formulated as in Eq. (5), where $X$ is a vector of predictor inputs, $X_j$ for $j = 1, 2, \ldots, p$; $h_m(X)$ are basis functions dependent on the predictors’ discovered relationship with $Y$; and $\varepsilon$ is the error term ($N(0, \sigma^2)$).

$$Y = \beta_0 + \sum_{m=1}^{M} \beta_m h_m(X) + \varepsilon \qquad (5)$$

Basis functions allow for non-linear relationships between predictors and the dependent variable. They take the general form $h_m(X) = (X - c)_+$, where the subscript $+$ denotes the positive part of the linear basis function (and zero otherwise), and $c$ is the hinge point of the basis function, or the product of two functions, when variable interactions are allowed.

The collection of possible basis functions is shown in Eq. (6), for each input $X_j$, with knots, $c$, possible at each observed value of that input, $x_{ij}$ [19].

$$C = \left\{ (X_j - c)_+ ,\ (c - X_j)_+ \right\}_{c \in \{x_{1j}, x_{2j}, \ldots, x_{Nj}\}} \qquad (6)$$

The best performing MARS model in terms of prediction contained the variables representing: the number of cables in the submarine power cable project, NumCables; the submarine route length, SubRoute; the cumulative length of submarine power cable projects installed to date, CumulLen; the cable voltage, Voltage; expected project duration, ProjLen; market price of copper, CuPrice; contract year, ConYear; and, current type, AC/DC.

$$\begin{aligned}
Y = {} & 19.41 - 0.0983\,(40 - \mathrm{SubRoute})_+ \\
& + 0.0137\,(\mathrm{SubRoute} - 40)_+ \\
& - 0.0004\,(\mathrm{CumulLen} - 3135)_+ \\
& - 0.0035\,(300 - \mathrm{Voltage})_+ \\
& + 0.4967\,(\mathrm{ConYear} - 2007)_+ \\
& - 0.0001\,(3 - \mathrm{NumCables})_+ \times \mathrm{CuPrice} \\
& + 0.0206\,(40 - \mathrm{SubRoute})_+ \times \mathrm{ProjLen} \\
& - 0.0094\,(\mathrm{SubRoute} - 40)_+ \times \mathrm{DC}
\end{aligned} \qquad (7)$$

The model allowing three degrees of variable interaction performed best in LOOCV testing, with 8 out of 9 predictors utilized. The final model is specified in Eq. (7). The last three basis functions are multiplied by the linear predictors, CuPrice, ProjLen, and DC cable type, respectively. This model requires some technical specifications that may not be known in the early planning stages of a project, such as Voltage, Number of Cables and whether the project will be AC or DC. However, it is likely not too burdensome to assume a small range of possible values for these parameters and run a sensitivity analysis for a given project. Running the MARS model using the small range of possible values will give the decision-maker an idea of the possible array of project costs she might face.

4. Results - model predictive accuracy

4.1. Mean prediction

The models in Section 3 were tested for predictive accuracy using LOOCV. This allows the distribution of model errors to be evaluated, the results of which are presented in Table 6. Both the absolute error, AE, and the absolute percent error, APE, are calculated as the absolute value of the actual cost of the submarine power cable, $y$, subtracted from the predicted cost, $\hat{y}$. The APE metric is additionally divided by the absolute value of $y$ and multiplied by 100.

The full distribution of errors is better visualized in Fig. 1. Repeatedly testing the models for each observation in the data set allows for an approximation of the distribution of errors per predictive model. Fig. 1 shows that the MARS model has the tightest distribution of errors, with the most frequent errors occurring near zero. This result is most easily seen in the graphs measuring the direct output of the models, Fig. 1a and b. When the model predictions are transformed back to dollar values, as seen in Fig. 1c and d, the error distributions show the same skew of costs represented in the original database. This is because the models predict

[Figure 1: density plots of the prediction error distributions for the MARS, Econometric, Linear, and Null models, with density on the vertical axis. Panels: (a) AE by model output, in Ln[$2012 USD]; (b) APE by model output, in %; (c) AE of exponentially transformed model output, in $2012 USD; (d) APE of exponentially transformed model output, in %.]

Fig. 1. Comparison of model error results, in terms of direct comparison of model output [Ln(ConCost)], and the exponentially transformed model output, which gives cost in familiar units of $2012 USD. The distribution plots are the result of smoothing via Gaussian kernel density estimates [30].
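The error distributions summarized in Fig. 1 come from leave-one-out cross-validation, with AE and APE computed as defined in Section 4.1. The following Python sketch shows that procedure for a one-predictor linear model of the form of Eq. (2), on both the log scale and the back-transformed dollar scale. The data values and function names are hypothetical; the paper's own models were fit in R, and this is not the authors' code.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def loocv_predictions(xs, ys):
    """Predict each observation from a model fit on all the other observations."""
    predictions = []
    for i in range(len(xs)):
        b0, b1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(b0 + b1 * xs[i])
    return predictions


def ae_ape(actual, predicted):
    """Absolute error and absolute percent error for each observation."""
    ae = [abs(p - a) for a, p in zip(actual, predicted)]
    ape = [100.0 * e / abs(a) for e, a in zip(ae, actual)]
    return ae, ape


# Hypothetical projects: submarine route length [km] and contract cost [USD].
route_km = [5.0, 20.0, 40.0, 95.0, 150.0, 300.0, 425.0]
cost_usd = [2.0e7, 4.5e7, 6.0e7, 1.8e8, 2.5e8, 6.0e8, 9.0e8]

# Errors on the model-output scale (natural log of cost).
ln_cost = [math.log(c) for c in cost_usd]
ln_pred = loocv_predictions(route_km, ln_cost)
ae_log, ape_log = ae_ape(ln_cost, ln_pred)

# Errors after transforming the predictions back to dollars.
usd_pred = [math.exp(p) for p in ln_pred]
ae_usd, ape_usd = ae_ape(cost_usd, usd_pred)

print("MAE on log scale:", sum(ae_log) / len(ae_log))
print("Mean APE in dollars [%]:", sum(ape_usd) / len(ape_usd))
```

Because the exponential stretches errors for expensive projects, the dollar-scale errors are more skewed than the log-scale errors, which matches the qualitative difference between panels (a)-(b) and (c)-(d).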

the logarithm of the cost, in order to abide by the normality assumption. Even after transformation back to dollars, the result is that the MARS model has the lowest APE. The lowest AE, however, occurs with the Econometric model (see Table 2); thus, decision-makers should utilize the model that minimizes their preferred error metric.

4.2. Probabilistic prediction

While the distribution of absolute and absolute percent errors from testing gives a sense of the predictive power of the model, it does not give any information about the uncertainty associated with a specific point prediction. The models presented in Section 3 output mean-value predictions. These predictions represent an expected value considering a probability density function for the error, which has been assumed to be Gaussian, of the form $Y \sim N(0, \sigma^2)$.

Using the vector of residuals obtained from testing the models on the data set, a normal distribution is fit, with the standard deviation determined from the fitted curve. The standard deviation from the residual curve is then applied to the mean-value estimates from the prediction model, giving the full uncertainty distribution around the prediction. This method can be applied to any statistical learning model from which residuals can be calculated.

To compare the errors between models that output probability density predictions, the normal methods of MAE and MAPE do not apply. One applicable method is the continuous ranked probability score (CRPS) [31]. The CRPS compares the probability distribution of the prediction to the probability distribution of the observed data value, as in Eq. (8), via the cumulative distribution functions of each (Eq. (9)).

$$\mathrm{CRPS} = \mathrm{CRPS}(F, x_a) = \int_{-\infty}^{\infty} \left[ F(x) - F_a(x) \right]^2 dx \qquad (8)$$

and where, $F$ and $F_a$ are cumulative distribution functions:

Table 2
Prediction error by model.

              AE [Ln($2012 USD)]                 APE [%]
Model         μ      σ      min     max          μ      σ      min     max
Null          0.81   0.61   0.022   2.3          4.34   3.36   0.12    13.7
Linear        0.67   0.45   0.037   1.8          3.63   2.54   0.19    10.9
Econometric   0.55   0.43   0.025   2.2          3.00   2.33   0.13    11.4
MARS          0.54   0.40   0.008   1.4          2.89   2.13   0.043   7.42

              AE [Million $2012 USD]             APE [%]
Model         μ      σ      min     max          μ      σ      min     max
Null          142    193    2.45    1106         117    166    2.16    864
Linear        121    131    4.78    529          85.9   103    3.63    509
Econometric   106    134    2.83    623          62.5   61.9   2.57    254
MARS          111    170    0.982   933          59.0   58.6   0.80    249
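The probabilistic prediction procedure of Section 4.2 can be sketched in Python as follows: a Gaussian is fit to the model residuals, its standard deviation is attached to a point prediction, and the resulting distribution is scored with the CRPS. For a Gaussian forecast the integral in Eq. (8) has a well-known closed form, which is compared here against a direct numerical integration using a step function for the observation. All names and numbers are hypothetical, and the residual fit uses Python's sample mean and standard deviation, which is an assumption about how the normal curve might be fit rather than the paper's stated method.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def predictive_distribution(point_prediction, residuals):
    """Centre a Gaussian on the point prediction, using the residuals' spread."""
    sigma = NormalDist.from_samples(residuals).stdev
    return NormalDist(mu=point_prediction, sigma=sigma)


def crps_normal(mu, sigma, observed):
    """Closed-form CRPS of N(mu, sigma^2) against a single observed value."""
    z = (observed - mu) / sigma
    return sigma * (z * (2.0 * STD_NORMAL.cdf(z) - 1.0)
                    + 2.0 * STD_NORMAL.pdf(z)
                    - 1.0 / math.sqrt(math.pi))


def crps_numeric(dist, observed, steps=100_000):
    """Midpoint-rule integral of [F(x) - H(x - observed)]^2 over a wide window."""
    lo = min(dist.mean, observed) - 10.0 * dist.stdev
    hi = max(dist.mean, observed) + 10.0 * dist.stdev
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        step = 1.0 if x >= observed else 0.0  # CDF of the single observation
        total += (dist.cdf(x) - step) ** 2
    return total * width


if __name__ == "__main__":
    # Hypothetical log-scale residuals and point prediction, Ln($2012 USD).
    residuals = [-0.62, -0.35, -0.10, 0.05, 0.20, 0.41, 0.58]
    forecast = predictive_distribution(18.9, residuals)
    observed = 19.3

    print("closed-form CRPS:", crps_normal(forecast.mean, forecast.stdev, observed))
    print("numerical CRPS:  ", crps_numeric(forecast, observed))

    # Back-transform the 5% and 95% quantiles to dollars.
    low, high = (math.exp(forecast.inv_cdf(q)) for q in (0.05, 0.95))
    print(f"90% interval in dollars: {low:,.0f} to {high:,.0f}")
```

As the forecast standard deviation shrinks toward zero, the CRPS approaches the absolute error of the point prediction, which is why it can be read as a generalization of the AE to probabilistic forecasts.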

$$F(x) = \int_{-\infty}^{x} \rho(y)\,dy, \qquad F_a(x) = H(x - x_a), \qquad (9)$$

where Eq. (10) is the Heaviside function,

$$H(x) = \begin{cases} 0 & \text{for } x < 0 \\ 1 & \text{for } x \ge 0 \end{cases} \qquad (10)$$

Thus, the CRPS measures the difference between the predicted and actual CDFs. This is true even for the case when the actual observation is a single value, where the CDF is represented as a single step function from zero to one at the observed value.

Fig. 2a shows a clear shift left of the CRPS for the MARS model, illustrating a better distribution of probabilistic errors, compared to the Linear and Null models. Compared to the Econometric model, the CRPS for the MARS has a much shorter right tail. Fig. 2b shows the same for the dollar values of the errors, with the maximum MARS CRPS an order of magnitude smaller than the other models. This order of magnitude decrease in error is extremely valuable for decision-makers.

4.3. Model limitations

While the best performing model, MARS, clearly outperforms the baseline Linear and Null models, the error distributions could be closer to zero. Hundreds of statistical learning models were tested with the data set on-hand, which leads the authors to recognize that other, more predictively powerful explanatory variables for submarine power cable cost may exist, beyond the scope of this data set. One variable that could not be collected for a sufficient number of projects was the cross-sectional area (mm²) of the conductor core per cable. In conversations with industry representatives, it emerged this might be a valuable piece of information to approximate material cost. The best data that could be acquired from the public domain was the global market price of copper. However, this variable might be neither the only, nor the best, to aid in predicting cost. As with all large infrastructure projects, there are many factors that contribute to the uncertainty of the final cost. It is hypothesized that any single new variable will bring only a modest decrease in model errors.

5. Case study

Both the best-performing MARS model, as well as the second-best econometric learning curve model, are applied to a case study described below. The analysis shows that the probabilistic predic-

5.1. Problem description

The submarine power cable system that connects Vancouver Island, British Columbia, Canada to the mainland was chosen for the case study. In 2007, Li et al. developed a risk-based approach to assess different cable replacement strategies. It probabilistically assessed the risks of cable failure to the power system, by calculating the expected energy not supplied (EENS) [32], which is one of the most important reliability indices in transmission expansion planning [33]. However, uncertainty in the cost estimate of the replacement cable is not considered. The cost of replacing the submarine power cable is estimated as $8 million CAN 2006, which is $9,298,653 USD 2012. Tables 4 and 5 replicate the results reported in Li et al.’s Tables IX and X [32], and update the cost data to 2012 real values in USD [16].

The final two columns in Table 4 represent the value of the benefit to the system; column three with the original 2006 cost data, while column four is updated to 2012 USD real values. Li et al. divide this benefit value by the estimated cost of the cable replacement, $8 million CAN 2006, to get the benefit/cost ratio presented in Table 5. If the benefits outweigh the costs, i.e., the benefit/cost ratio is greater than one, then the advised strategy is cable replacement. The direct benefit of cable replacement decreases from 2006 to 2010 (Table 3), because other system components are being upgraded and replaced, so the load on the aging cable is gradually being phased out (see Table I in [32]). The main contributor to easing the load on the cable under study is the commissioning of a new AC cable, expected in 2008.

The dynamics of system upgrades make this case study particularly relevant. While the overall risk to the system is predicted to decrease over time due to other system upgrades, it is never predicted to be nil. Even this relatively short (5 km) system component could play a big role in reducing system risk. The ultimate decision of whether or not to reduce system risk even further, by replacing the cable, is equally dependent on the cost estimate of the cable replacement, as it is on the estimate of risk reduction. Therefore, emphasis on careful study into cost estimation is just as important as the analysis of system risk.

5.2. Probabilistic model application

The cost of the Vancouver Island cable replacement is probabilistically estimated using both the MARS model and the Econometric Learning Curve model described in Section 3. The MARS model requires the data presented in Table 6, for the predictors: number of cables (NumCables); submarine route length (SubRoute); worldwide cumulative length of installed submarine power cables (CumulativeLength); voltage (Voltage); project duration (ProjLen);
tion gives more valuable information to the decision-maker than market price of copper (CuPrice); contract year (ConYear); and clas-
the single, mean-value point estimate does. As seen in Section 5.3, sification of current (AC/DC). The Econometric Learning Curve
this additional information can fundamentally affect the decision model requires data from the first three predictors in Table 6.
made on a project. In this case study, the loss of precision using The data used for prediction is from the British Columbia Transmis-
the cost estimates from the econometric model, as opposed to sion Company (BCTC) [34], with all predictor values falling within
the more precise MARS model, does not affect the decision the range of the training data used to develop the model (see
analysis. Table 1).

[Figure 2: two kernel density plots, each titled "CRPS distribution per prediction model", comparing the MARS, Econometric, Linear and Null models; y-axis: Density. Panel (a) CRPS by Model Output, x-axis: Ln[$2012 USD]. Panel (b) CRPS of Exponentially Transformed Model Output, x-axis: $2012 USD.]
Fig. 2. Comparison of probabilistic error results by CRPS, in terms of direct comparison of model output (Ln(ConCost)), and the exponentially transformed model output,
which gives cost in familiar units of $2012 USD. The distribution plots are the result of smoothing via Gaussian kernel density estimates [30].
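
As a concrete illustration of the kind of score compared in Fig. 2, the following minimal Python sketch evaluates the CRPS of a single observation against a normal predictive distribution on the log-cost scale. It assumes a Gaussian forecast and uses the standard closed-form CRPS for that case; the function name `crps_normal` and the example numbers are hypothetical and are not taken from the authors' analysis.

```python
import math
from statistics import NormalDist


def crps_normal(mu: float, sigma: float, observed: float) -> float:
    """Closed-form CRPS of a N(mu, sigma^2) forecast against one observation.

    Lower is better; the score is in the same units as the observation.
    """
    std_normal = NormalDist()
    z = (observed - mu) / sigma
    return sigma * (
        z * (2.0 * std_normal.cdf(z) - 1.0)
        + 2.0 * std_normal.pdf(z)
        - 1.0 / math.sqrt(math.pi)
    )


# Hypothetical example: two forecasts of ln(cost) for a project whose
# observed ln(cost) is 16.0. The sharper, better-centred forecast scores lower.
score_sharp = crps_normal(mu=15.9, sigma=0.3, observed=16.0)
score_vague = crps_normal(mu=15.0, sigma=1.0, observed=16.0)
```

Collecting such scores over a test set and smoothing them with a kernel density estimate would give curves of the kind shown in Fig. 2a, under the Gaussian assumption stated above.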

Assuming a normal distribution, the cost estimate (\(\mu_y\)) and the standard deviation (\(\sigma_y\)) (see Table 7) have been derived from the testing residuals of the respective model, as discussed in Section 4.2.

5.3. Uncertainty analysis and risk measures

The decision-making framework for cable replacement, as presented by [32], is a benefit/cost analysis. The cable cost estimate can inform decision-makers in either a deterministic or probabilistic way. The decision analysis with deterministic information, i.e. when only the mean cost estimate is presented to the decision-maker, is shown in Table 8. Using Li et al.'s calculations of the benefit per year (see Table 4), the benefit/cost ratio is calculated. Given only a point estimate of cost, a decision-maker would choose to replace the cable if the benefit/cost ratio is greater than one. As seen in Table 8, this occurs only in the first year, 2006.

Because the cost of the cable investment is large and irreversible, in such a setting it is natural to consider the variability in cost, in addition to the deterministic, average cost; namely, through mean-risk formulations. These formulations have two important benefits: they require only two moments, which can be estimated, and they provide useful recommendations [35].

We illustrate these analysis possibilities with three different types of risk measures: a probability - the probability that the cost is higher than the benefit, i.e., that the benefit/cost ratio is lower than one; a quantile - the 90% value-at-risk (VaR), i.e., the minimum benefit/cost ratio likely to happen with a 90% probability; and a tail expectation - the 90% conditional-value-at-risk (CVaR), i.e., the expected value of the 10% worst benefit/cost ratios.

Table 3
Total EENS (MW h) in the 5 year period [32].

| Failure year of cable | Replace cable after failure | Never replace cable |
|---|---|---|
| 2006 | 15,686 | 17,643 |
| 2007 | 15,678 | 16,396 |
| 2008 | 14,720 | 15,170 |
| 2009 | 14,690 | 14,904 |
| 2010 | 14,671 | 14,671 |

Table 4
Reduction of EENS (MW h) and risk cost [M$] due to replacing the cable [32].

| Failure year of cable | Reduction of EENS (MW h) | Cost of risk reduction [M$2006 CAN] | Cost of risk reduction, \(b_y\) [M$2012 USD] |
|---|---|---|---|
| 2006 | 1,957 | 6.008 | 6.986 |
| 2007 | 718 | 2.204 | 2.563 |
| 2008 | 450 | 1.382 | 1.607 |
| 2009 | 214 | 0.657 | 0.763 |
| 2010 | 0 | 0.000 | 0 |

Table 5
Benefit/cost ratios for replacement of the cable [32].

| Failure year of cable | Benefit/cost ratio |
|---|---|
| 2006 | 0.751 |
| 2007 | 0.276 |
| 2008 | 0.173 |
| 2009 | 0.082 |
| 2010 | 0 |

Table 6
Vancouver Island cable cost prediction data.

Predictors

| Year | NumCables | SubRoute [km] | CumulativeLength [km] | Voltage [kV] | ProjLen [Years] | CuPrice [$2012 USD/ton] | ConYear | AC/DC |
|---|---|---|---|---|---|---|---|---|
| 2006 | 1 | 5 | 2,088 | 300 | 1 | 10,653 | 2006 | DC |
| 2007 | 1 | 5 | 2,588 | 300 | 1 | 11,281 | 2007 | DC |
| 2008 | 1 | 5 | 3,135 | 300 | 1 | 11,024 | 2008 | DC |
| 2009 | 1 | 5 | 3,625 | 300 | 1 | 8,127 | 2009 | DC |
| 2010 | 1 | 5 | 5,629 | 300 | 1 | 11,944 | 2010 | DC |
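
The deterministic screening reported in Tables 4 and 5 is simple enough to check directly. The Python sketch below divides each year's cost of risk reduction in M$2006 CAN (Table 4) by the $8 million CAN 2006 replacement cost used by Li et al., then applies the replace-if-greater-than-one rule. The variable and function names are illustrative only.

```python
# Cost of risk reduction per failure year, in M$2006 CAN (Table 4).
BENEFIT_MCAD_2006 = {2006: 6.008, 2007: 2.204, 2008: 1.382, 2009: 0.657, 2010: 0.000}

# Point estimate of the cable replacement cost, in M$2006 CAN.
REPLACEMENT_COST_MCAD_2006 = 8.0


def benefit_cost_ratio(benefit: float, cost: float) -> float:
    """Benefit/cost ratio; replacement is advised when the ratio exceeds one."""
    return benefit / cost


ratios = {
    year: benefit_cost_ratio(benefit, REPLACEMENT_COST_MCAD_2006)
    for year, benefit in BENEFIT_MCAD_2006.items()
}
replace_cable = {year: ratio > 1.0 for year, ratio in ratios.items()}
```

With the $8 million point estimate, no failure year clears the threshold of one, which is consistent with the ratios listed in Table 5.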

Table 7
Probabilistic cost prediction of Vancouver Island cable.

| Year | MARS mean, \(\mu_y\), Ln([$2012 USD]) | MARS \(\sigma_y\) | Econometric mean, \(\mu_y\), Ln([$2012 USD]) | Econometric \(\sigma_y\) |
|---|---|---|---|---|
| 2006 | 15.56 | 0.6657 | 15.75 | 0.6975 |
| 2007 | 15.49 | 0.6657 | 15.73 | 0.6975 |
| 2008 | 16.01 | 0.6657 | 15.72 | 0.6975 |
| 2009 | 16.62 | 0.6657 | 15.71 | 0.6975 |
| 2010 | 15.91 | 0.6657 | 15.68 | 0.6975 |

Table 8
Deterministic and probabilistic analyses. In each sub-table, column two is the deterministic analysis; columns three to five are the uncertainty analysis and probabilistic risk measures.

MARS

| Failure year of cable, \(y\) | Benefit/Cost ratio, \(BC_y\) | \(P(BC_y \ge 1)\) | \(BC_y\) VaR | \(BC_y\) CVaR |
|---|---|---|---|---|
| 2006 | 1.22 | 61.7% | 0.52 | 0.36 |
| 2007 | 0.48 | 13.4% | 0.20 | 0.14 |
| 2008 | 0.18 | 0.47% | 0.07 | 0.05 |
| 2009 | 0.05 | 0.0002% | 0.02 | 0.01 |
| 2010 | 0 | 0% | 0 | 0 |

Econometric

| Failure year of cable, \(y\) | Benefit/Cost ratio, \(BC_y\) | \(P(BC_y \ge 1)\) | \(BC_y\) VaR | \(BC_y\) CVaR |
|---|---|---|---|---|
| 2006 | 1.01 | 50.6% | 0.41 | 0.28 |
| 2007 | 0.38 | 8.08% | 0.15 | 0.11 |
| 2008 | 0.24 | 2.02% | 0.10 | 0.07 |
| 2009 | 0.11 | 0.096% | 0.05 | 0.03 |
| 2010 | 0 | 0% | 0 | 0 |
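
The probabilistic columns of Table 8 can be approximately reproduced from the Table 7 parameters. The following Python sketch assumes that cost is log-normal, that the Table 7 means and standard deviations are on the natural-log scale of cost in 2012 USD (an assumption that recovers the reported deterministic ratio of 1.22 for MARS in 2006), and that the benefit is the Table 4 value in 2012 USD. The function names are illustrative and this is not the authors' code; the expressions are the standard closed forms for a log-normal probability, quantile and upper-tail expectation.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_ratio_at_least_one(mu: float, sigma: float, benefit: float) -> float:
    """P(cost <= benefit) for log-normal cost, i.e. P(benefit/cost >= 1)."""
    return STD_NORMAL.cdf((math.log(benefit) - mu) / sigma)


def cost_var(mu: float, sigma: float, q: float = 0.90) -> float:
    """q-quantile of the log-normal cost (Value-at-Risk of cost)."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(q))


def cost_cvar(mu: float, sigma: float, q: float = 0.90) -> float:
    """Expected cost over the highest (1 - q) share of outcomes (CVaR of cost)."""
    tail = 1.0 - STD_NORMAL.cdf(STD_NORMAL.inv_cdf(q) - sigma)
    return math.exp(mu + sigma**2 / 2.0) * tail / (1.0 - q)


# MARS, failure year 2006: parameters from Table 7, benefit from Table 4.
mu, sigma = 15.56, 0.6657
benefit_usd = 6.986e6

bc_point = benefit_usd / math.exp(mu)                   # deterministic ratio
p_ok = prob_ratio_at_least_one(mu, sigma, benefit_usd)  # P(BC >= 1)
bc_var = benefit_usd / cost_var(mu, sigma)              # ratio at 90% cost VaR
bc_cvar = benefit_usd / cost_cvar(mu, sigma)            # ratio at 90% cost CVaR
```

For the MARS 2006 row this gives values close to 1.22, 62%, 0.52 and 0.36. Small differences from Table 8 are to be expected because the published parameters are rounded.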

Using the full distribution of the cable cost estimate, the probability that the benefit/cost ratio, \(BC_y\), will be at least one is given in Eq. (11). This is determined by calculating the probability that the cost is less than or equal to the benefit value in that year, \(b_y\), where \(y\) is the year of cable failure. This follows from the cumulative distribution function (Eq. (11)) of the log-normal random variable cost estimate, \(X_y\), where \(\mu_y\) is the mean cost estimate, \(\sigma_y\) is the standard deviation of the cost estimate, and \(b_y\) is the value of the benefit. This calculation gives the probability that the cost does not exceed the benefit, making the benefit/cost ratio at least one. The results are shown in column three (\(P(BC_y \ge 1)\)) of Table 8.

\[
P(X_y \le b_y) = \int_{0}^{b_y} \frac{1}{\sqrt{2\pi}\,\sigma_y x}\, e^{-(\ln(x) - \mu_y)^2 / 2\sigma_y^2}\, dx \quad \forall y \tag{11}
\]

A quantile risk measure often used by decision-makers, termed Value-at-Risk (VaR) and given by Eq. (12), calculates the value of the random variable (i.e. the submarine cable cost estimate) at the desired q-quantile, where \(\Phi\) is the standard normal cumulative distribution function.

\[
\mathrm{VaR}(q) = \exp\left(\mu_y + \sigma_y \Phi^{-1}(q)\right) \quad \forall y \tag{12}
\]

The results shown in column four of Table 8 were calculated at the 90th percentile of the cost. These results tell the decision-maker that, for example, according to the MARS model in the year 2006, with 90% probability the benefit/cost ratio will be higher than 0.52. Thus, there is only a 10% risk of a benefit/cost ratio below 0.52. Using the VaR risk metric gives the decision-maker not only the value of a worse-than-expected benefit/cost ratio, but also an indication of the variability of the cost distribution. A very risk-averse decision-maker may decide that, even though the expected benefit/cost ratio is greater than one, the VaR of 0.52 is too far from the expected value, and too low, to go ahead with the investment. It is left to the decision-maker to assess what level of risk they are willing to take on.

A third risk metric, a tail expectation termed the Conditional-Value-at-Risk (CVaR), is also calculated. CVaR measures the expected value of the cost random variable at the specified tail of the distribution. The \(BC_y\) CVaR results given in column five of Table 8 are again at the q = 90th percentile of cost.

\[
\mathrm{CVaR}(q) = \frac{e^{\mu_y + \sigma_y^2/2}}{1 - q}\left(1 - \Phi\left(\Phi^{-1}(q) - \sigma_y\right)\right) \quad \forall y \tag{13}
\]

As shown in Eq. (13), CVaR calculates the expected value of the 10% worst cost estimates, which are, in terms of this case study, the 10% highest costs. Thus, the CVaR calculated here gives the risk of a significantly lower-than-expected benefit/cost ratio. For example, in 2006, the Econometric model gives a 90% \(BC_y\) CVaR equal to 0.28; an extremely risk-averse decision-maker may find the risk of such a low benefit/cost ratio unacceptable, even though the mean benefit/cost ratio indicates that the benefit outweighs the cost.

Both VaR and CVaR present the decision-maker with information about the tail of the probability distribution, or what might happen in an extreme case. Along with the probability of the benefit/cost ratio being at least one (\(P(BC_y \ge 1)\)), information about the uncertainty associated with the cost estimate provides added value to the decision-maker. Risk metrics are such an important tool in decision-making that recent research is bringing them directly into the optimization problem [36,37].

6. Conclusion

A well-performing model for early cost prediction of submarine cable projects has been developed. While the model framework, MARS, is a complex statistical learning model, the data input needed to make a prediction is publicly available. Where decision contexts do not demand the precision given by the MARS model, the Econometric learning curve model, with less input data, may suffice as reasonably accurate.

Both models output the uncertainty around the predicted cost value, giving decision-makers the ability to calculate risks and assess investment decisions based on those risks. The cost prediction models developed give valuable information to decision-makers in industry, policy analysis and academia, when cost estimation is an integral component of alternatives assessment.

Acknowledgment

The authors would like to thank Thomas Worzyk for leveraging his extensive industry expertise in giving insightful comments on factors affecting cost prediction for submarine power cable projects.

The authors would also like to acknowledge financial support for this work from the FCT - Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) through the Carnegie Mellon Portugal Program under Grant SFRH/BD/51563/2011. This work is also financed by the ERDF European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme, and by National Funds through the FCT within project POCI-01-0145-FEDER-006961.

References

[1] Worzyk T. Submarine power cables: design, installation, repair, environmental aspects. Springer; 2009.
[2] Euroasia interconnector; 2012. <www.euroasia-interconnector.com/>.
[3] TenneT. Tennet – the largest investor in the energy transition; March 2015. <http://www.tennet.eu/de/en/grid-projects/offshore-projects.html>.
[4] 50Hertz Transmission. The energy supply of the future; 2015. <http://www.50hertz.com/en/Offshore/Wind-farms>.
[5] Navigant Research. Submarine electricity transmission: HVDC and HVAC submarine power cables: demand drivers, technology issues, prominent projects, key industry players and global market forecasts; 2015. <https://www.navigantresearch.com/research/submarine-electricity-transmission>.
[6] Hendrickson C, Au T. Project management for construction: fundamental concepts for owners, engineers, architects and builders. 2nd ed. Prentice Hall; 2008.
[7] Jaafari A. Probabilistic unit cost estimation for project configuration optimization. Project Manage 1988:226–34.
[8] Charest AC, editor. Electrical cost data: 38th annual edition, vol. 38. Norwell, MA: RSMeans; 2015.
[9] Gunduz M, Ugur LO, Ozturk E. Parametric cost estimation system for light rail transit and metro trackworks. Expert Syst Appl 2011;38:2873–7.
[10] Chou J-S, Lin C-W, Pham A-D, Shao J-Y. Optimized artificial intelligence models for predicting project award price. Autom Constr 2015;54:106–15.
[11] Hegazy T, Ayed A. Neural network model for parametric cost estimation of highway projects. J Constr Eng Manage 1998;124:210–8.
[12] Cirilovic J, Vajdic N, Mladenovic G, Queiroz C. Developing cost estimation models for road rehabilitation and reconstruction: case study of projects in Europe and Central Asia. J Constr Eng Manage 2013;140.
[13] Soo Kim B, Hong T. Revised case-based reasoning model development based on multiple regression analysis for railroad bridge construction. J Constr Eng Manage 2011;138:154–62.
[14] Worzyk T. Subsea cable database; 2014. <http://www.worzyk.com/index1.html>.
[15] Nexans. Nexans wins an 18 million euros contract in Thailand; December 2005. <http://www.nexans.us/eservice/US-en_US/navigatepub_158895_-3678/Nexans_wins_an_18_Million_Euros_contract_in_Thaila.html>.
[16] OECD.Stat. Producer prices; September 2015. <http://stats.oecd.org/index.aspx?DatasetCode=MEI_PRICES_PPI>.
[17] Hothorn T, Buhlmann P, Kneib T, Schmid M, Hofner B. Model-based boosting 2.0. J Mach Learn Res 2010;11:2109–13.
[18] Available CRAN packages by name. <https://cran.r-project.org/web/packages/available_packages_by_name.html>.
[19] Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference and prediction. 2nd ed. Springer; 2009.
[20] Nexant. Caribbean regional electricity generation, interconnection and fuels supply strategy. Tech rep. World Bank; March 2010.
[21] R. Group North Sea for the NSCOGI North Seas Countries' Offshore Grid Initiative. Offshore transmission technology. Tech rep. ENTSO-E (European Network of Transmission System Operators for Electricity); November 2011.
[22] Arrow K. The economic implications of learning by doing. Rev Econ Stud 1962;29(3):155–73.
[23] Söderholm P, Sundqvist T. Empirical challenges in the use of learning curves for assessing the economic prospects of renewable energy technologies. Renew Energy 2007;32:2559–78.
[24] Gumerman E, Marnay C. Learning and cost reductions for generating technologies in the national energy modeling system (NEMS). Tech. Rep. LBNL-52559. Ernest Orlando Lawrence Berkeley National Laboratory; 2004.
[25] Capros P, Mantzos L. Endogenous learning in european post-kyoto scenarios: results from applying the market equilibrium model primes. Int J Global Energy Issues 2000;14(1–4):249–61.
[26] P.N.N. Laboratory. The global change assessment model (GCAM); August 2012. <https://wiki.umd.edu/gcam/index.php/Main_Page>.
[27] Rubin ES, Azevedo IM, Jaramillo P, Yeh S. A review of learning rates for electricity supply technologies. Energy Policy 2015;86:198–218.
[28] Junginger M, Faaij A, Turkenburg W. Global experience curves for wind farms. Energy Policy 2005;33(2):133–50.
[29] Friedman JH. Multivariate adaptive regression splines. Ann Stat 1991;19:1–67.
[30] R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016.
[31] Matheson JE, Winkler RL. Scoring rules for continuous probability distributions. Manage Sci 1976;22(10):1087–96.
[32] Li W, Choudhury P, Gillespie D, Jue J. A risk evaluation based approach to replacement strategy of aged HVDC components and its application at BCTC. IEEE Trans Power Deliv 2007;22(3):1834–40.
[33] Hemmati R, Hooshmand R-A, Khodabakhshian A. State-of-the-art of transmission expansion planning: comprehensive review. Renew Sust Energy Rev 2013;23:312–9.
[34] Cherukupalli S, MacPhail A. Forty years operating experience with 300 kV DC submarine cable systems; March 2010. <http://www.pesicc.org/iccwebsite/subcommittees/E/E04/2010/2010Spring-40yearsExperiencewith300kVDCSub-marine-Cherukupalli.pdf>.
[35] Mieghem JAV. Capacity management, investment and hedging: review and recent developments. Manuf Serv Oper Manage 2003;5(4):269–302.
[36] Morales JM, Pinson P, Madsen H. A transmission-cost-based model to estimate the amount of market-integrable wind resources. IEEE Trans Power Syst 2012;27(2):1060–9.
[37] Delgado D, Claro J. Transmission network expansion planning under demand uncertainty and risk aversion. Electr Power Energy Syst 2013;44:696–702.
