

Computers & Industrial Engineering 175 (2023) 108875


Parallel DEA-Dantzig-Wolfe algorithm for massive data applications


Jingjing Ding a, Shengqing Chang a, Ruifeng Wang b, Chenpeng Feng a,*, Liang Liang a

a School of Management, Hefei University of Technology, No. 193 Tunxi Road, Hefei, Anhui Province 230009, PR China
b Hefei Comprehensive Center, SPD Bank, No. 1271 Guangxi Road, Hefei, Anhui Province 230092, PR China

ARTICLE INFO

Keywords: Data envelopment analysis (DEA); Large-scale datasets; Dantzig-Wolfe decomposition; Data privacy

ABSTRACT

The application of data envelopment analysis (DEA) to large-scale datasets raises computational concerns, and many novel algorithms have been proposed. However, limitations of the existing algorithms, such as computational difficulties due to data volume and privacy issues, remain when the datasets under evaluation are massive and have a high-density feature. The existing algorithms also do not address the potential conflict between the requirement of full data for implementation and the reality that data privacy may prevent a full-data application. To address these issues, we integrate DEA and the Dantzig-Wolfe (DW) decomposition algorithm and propose a parallel DEA-DW algorithm to facilitate the computation of efficiency scores. Furthermore, the computing time of the algorithm is analyzed. Finally, we perform numerical experiments on different datasets to demonstrate the feasibility and effectiveness of the proposed algorithm, and analyze the interactions of the master problem (MP) and the sub-problems (SPs) of the algorithm.

1. Introduction

Data envelopment analysis (DEA), originally proposed by Charnes et al. (1978), is a non-parametric estimation method for measuring the relative efficiencies of a set of homogeneous decision-making units (DMUs). In an era of expanding data scale, there is a need to boost computing efficiency by exploiting the structural properties of DEA models when facing a great amount of data in real-life applications.

The standard or 'naive' approach for assessing the efficiency of n DMUs is to solve n specialized linear programs (LPs), one per DMU. Repeatedly solving many similar LPs is computationally intensive, if not infeasible, in large-scale data applications. To address the issue, one needs high-performance computer hardware equipped with a stack of computational strategies, a high-performance LP solver and other appropriate software packages. Although the last two factors play key roles (Dulá, 2008), this paper focuses on a computational strategy that takes advantage of features such as decomposability and parallelism while treating the LP solver as a 'black box'.

Many computational strategies have been put forward in the literature for large-scale datasets. Ali (1993, 1994) proposes two solution enhancement techniques: restricted basis entry (RBE) and early identification of efficient DMUs (EIE); the former reduces the LP's size by removing inefficient DMUs, and the latter reduces the number of LPs solved by identifying efficient DMUs early. Barr and Durchholz (1997) design an algorithm that quickly finds efficient DMUs to serve as benchmarks and then completes the evaluation of all DMUs. Similar work can be seen in Korhonen and Siitari (2007, 2009), Dulá (2011), Zhu et al. (2018), Khezrimotlagh et al. (2019), Khezrimotlagh and Zhu (2020), Jie (2020), Khezrimotlagh (2021), Yu et al. (2021) and Dellnitz (2022), just to name a few. Chen and Cho (2009) and Chen and Lai (2017) propose methods that evaluate all DMUs by identifying a few "similar" key DMUs for each evaluated DMU and solving small-size LPs. Dulá and López (2009) summarize five pre-processing methods that can be used to quickly determine part of the efficient or inefficient DMUs. More studies on DEA computation for large-scale datasets can be found in Dulá and Thrall (2001), Dulá and López (2002), Dulá and López (2013), and so on.

* Corresponding author.
E-mail addresses: jingding@hfut.edu.cn (J. Ding), 2020110822@mail.hfut.edu.cn (S. Chang), wrf960117@mail.hfut.edu.cn (R. Wang), cpfeng@hfut.edu.cn
(C. Feng), lliang@hfut.edu.cn (L. Liang).

https://doi.org/10.1016/j.cie.2022.108875

Available online 5 December 2022



Dulá (2008) introduces three important factors that affect DEA computing times, namely the number of DMUs (cardinality), the number of inputs and outputs (dimension), and the proportion of efficient DMUs (density). Unfortunately, the methods mentioned above are particularly suitable for low-density situations; when the density is high, the cost outweighs the benefit of removing inefficient DMUs. For example, the hierarchical decomposition (HD) of Barr and Durchholz (1997), the build hull (BH) algorithm of Dulá (2011) and the framework of Khezrimotlagh et al. (2019) all identify the efficient DMUs first, but they have limitations because the remaining set of efficient DMUs is still too large in high-density situations.

Regarding the high-density cases, Chen and Cho (2009) identify a few "similar" critical DMUs as a reference set to compute each DMU's efficiency value; simulation results show that the accelerating procedure can reduce the computational time drastically. Chen and Lai (2017) propose a "Trial and Error" (TE) procedure that controls the size of the individual LPs while still maintaining optimality. These methods share a prominent feature: the size of the main working DEA model is unchanged. However, to verify whether a solution is optimal, they need to check an optimality condition on the whole dataset, which renders the algorithms impractical when the size of the dataset exceeds the capacity limit of a computer, because frequent hard-disk operations are then required.

To sum up, what if the final number of efficient DMUs is too big to fit into the RAM (Random Access Memory)? When the size of a dataset is larger than the RAM of a computer, the computer has to interact frequently with the comparatively low-speed hard disk to swap the data generated in the computing process in and out. This is impractical and time-consuming. A question that arises is: how can we take advantage of the structural features of a DEA model if we cannot avoid running a large-scale DEA model that exceeds the capacity limit of a computer? The need to overcome the difficulties of solving a DEA model on large-scale datasets with a high-density feature is one motivation for this work.

Aside from the computing difficulty, data privacy has not been discussed in addressing the computing issue in the current literature. Obviously, the existing DEA algorithms require the full data of all DMUs to be implemented. This requirement poses a hidden risk to data sharing. In reality, the full dataset might belong to different owners. In addition, data are valuable assets, and this ownership very likely hinders applications that require full data, due to privacy concerns. Lacking a high-powered incentive mechanism, it is very hard, if not impossible, to request that data owners exchange their data with others. Considering this reality, we argue that an algorithm that enables the application of DEA models while maintaining data confidentiality would be of great value. Resolving the conflict between DEA applications and data privacy among data owners is the other motivation for this work.

Motivated by the computing difficulties and data privacy issues, this paper first combines DEA with the Dantzig-Wolfe (DW) decomposition. The DW decomposition, proposed by Dantzig and Wolfe (1960), can decompose a decomposable large-scale LP into two types of problems: a master problem (MP) and several sub-problems (SPs). Each SP contains a subset of the variables, and the MP coordinates all SPs; each SP only needs to interact with the MP. The optimal solution of the original large-scale LP is obtained by a finite number of iterations of the column generation algorithm. In this paper, we use the DW decomposition to split a DEA model into several SPs and construct an MP to coordinate the SPs. Based on this combination, a parallel DEA algorithm (the parallel DEA-DW algorithm) is proposed. On the one hand, each SP only needs to deal with a small part of the full dataset, while the size of the MP depends on the number of SPs; the MP is therefore extremely small compared with a standard DEA model applied to the full dataset, and the size of the interactions between the MP and the SPs is small. As a result, the computational feasibility of DEA models on large-scale datasets is guaranteed theoretically. On the other hand, as the SPs do not interact with each other, data confidentiality among DMUs is maintained.

The main contributions of this paper are as follows. (1) We first combine DEA with the DW decomposition, which can decompose an extremely large DEA problem into very small SPs. (2) A new parallel DEA-DW algorithm is proposed to solve the problem of insufficient RAM caused by large-scale datasets, and a formula for the estimated computing time is provided. (3) The proposed algorithm helps maintain data confidentiality among different data owners, which broadens the range of applications by satisfying the data privacy requirement.

The rest of the paper is organized as follows. Section 2 describes the underlying DEA model used in this paper and how to split the model by the DW decomposition. In Section 3, we describe the proposed parallel DEA-DW algorithm and demonstrate how it works by an example; in addition, a formula is provided to estimate the computing time. Section 4 performs numerical experiments and analyzes the interactions of the MP and SPs. Section 5 concludes the paper.

2. Preliminaries

2.1. Basic BCC model

Suppose that there are n DMUs with m inputs and s outputs. DMUj (j = 1, 2, ..., n) consumes the inputs x_{ij} (i = 1, ..., m) to produce the outputs y_{rj} (r = 1, ..., s). DMU0 denotes the DMU under evaluation. A popular DEA model for evaluating the efficiency score is the BCC (Banker–Charnes–Cooper) model (Banker et al., 1984). The input-oriented envelopment model is as follows.

$$
\begin{aligned}
E_0^{\mathrm{input}} = \min\ & \theta_0 \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta_0 x_{i0}, \quad i = 1, \dots, m; \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{r0}, \quad r = 1, \dots, s; \\
& \sum_{j=1}^{n} \lambda_j = 1; \\
& \lambda_j \ge 0,\ j = 1, \dots, n; \quad \theta_0\ \text{free in sign.}
\end{aligned}
\tag{1}
$$

In the above model, the unit attempts to proportionately shrink its inputs to ($\theta_0 \times 100\,\%$) of their current level while keeping the outputs at the same level. Each LP contains m + s + 1 constraints and n + 1 variables. The reader is referred to Banker et al. (1984) for more information about the BCC model.

Assuming that θ* and λ* are the optimal solution of Model (1), the point $(\sum_{j=1}^{n}\lambda_j^* x_{ij}, \sum_{j=1}^{n}\lambda_j^* y_{rj})$, termed the virtual DMU for DMU0, is composed of two types of DMUs. One type is the DMUs with $\lambda_j^* \neq 0$; we call them the reference points, or simply the reference, of DMU0, and these DMUs are efficient DMUs in the production possibility set (PPS). The other type is the DMUs with $\lambda_j^* = 0$; they contribute nothing to the virtual DMU and do not affect the efficiency of DMU0.

In the following exposition, we use the above-mentioned BCC model for illustration; the proposed methods are also applicable to the CCR envelopment model and other models of the same type. For the multiplier DEA models, one can obtain all the needed information by extracting the dual information from the computation results of the envelopment models, which we do not discuss in the sequel.

2.2. DEA-DW decomposition

In this subsection, we provide an approach to split a DEA model into p SPs and construct an MP to coordinate the SPs by using the DW decomposition. Before introducing our approach, we need to transform Model (1) into Model (2-A), which is an LP with a block angular structure.
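Before presenting the transformation, it is useful to see the dense baseline that the decomposition is designed to replace. The snippet below is a minimal illustrative sketch (not the authors' code) of solving Model (1) directly for a single DMU with gurobipy, assuming the full dataset is held centrally as NumPy arrays X (m × n inputs) and Y (s × n outputs); the function name and data layout are assumptions made only for this illustration.

```python
# Illustrative sketch only: the 'naive' per-DMU BCC envelopment LP of Model (1).
import numpy as np
import gurobipy as gp
from gurobipy import GRB

def bcc_input_efficiency(X, Y, j0):
    """Input-oriented BCC efficiency of DMU j0 evaluated against all n DMUs."""
    m, n = X.shape
    s = Y.shape[0]
    model = gp.Model("BCC")
    model.Params.OutputFlag = 0
    theta = model.addVar(lb=-GRB.INFINITY, name="theta")   # free in sign
    lam = model.addVars(n, lb=0.0, name="lambda")
    model.setObjective(theta, GRB.MINIMIZE)
    for i in range(m):                                      # input constraints
        model.addConstr(gp.quicksum(float(X[i, j]) * lam[j] for j in range(n))
                        <= float(X[i, j0]) * theta)
    for r in range(s):                                      # output constraints
        model.addConstr(gp.quicksum(float(Y[r, j]) * lam[j] for j in range(n))
                        >= float(Y[r, j0]))
    model.addConstr(lam.sum() == 1)                         # convexity (VRS)
    model.optimize()
    return theta.X

# The naive approach repeats this n-variable LP once per DMU:
# scores = [bcc_input_efficiency(X, Y, j0) for j0 in range(n)]
```

It is precisely this monolithic, full-data LP, repeated n times, whose memory footprint and data-centralization requirement the decomposition below is designed to avoid.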


$$
\begin{aligned}
\min\ & \sum_{k=1}^{p} \theta_1^k \\
\text{s.t.}\ & \sum_{j=1}^{\lfloor n/p \rfloor} \lambda_j x_{ij} \le \theta_i^1 x_{i0}, \quad \sum_{j=1}^{\lfloor n/p \rfloor} \lambda_j y_{rj} \ge y_{r0}^1, \quad i = 1, \dots, m,\ r = 1, \dots, s; \\
& \sum_{j=\lfloor n/p \rfloor + 1}^{\lfloor 2n/p \rfloor} \lambda_j x_{ij} \le \theta_i^2 x_{i0}, \quad \sum_{j=\lfloor n/p \rfloor + 1}^{\lfloor 2n/p \rfloor} \lambda_j y_{rj} \ge y_{r0}^2, \quad i = 1, \dots, m,\ r = 1, \dots, s; \\
& \qquad \vdots \\
& \sum_{j=\lfloor (p-1)n/p \rfloor + 1}^{n} \lambda_j x_{ij} \le \theta_i^p x_{i0}, \quad \sum_{j=\lfloor (p-1)n/p \rfloor + 1}^{n} \lambda_j y_{rj} \ge y_{r0}^p, \quad i = 1, \dots, m,\ r = 1, \dots, s; \\
& \sum_{k=1}^{p} \theta_1^k = \sum_{k=1}^{p} \theta_2^k = \dots = \sum_{k=1}^{p} \theta_m^k, \quad 1 \ge \theta_i^k \ge 0; \\
& \sum_{k=1}^{p} y_{r0}^k = y_{r0}, \quad r = 1, \dots, s; \\
& \sum_{j=1}^{n} \lambda_j = 1, \quad \lambda_j \ge 0,\ j = 1, \dots, n.
\end{aligned}
\tag{2-A}
$$

As shown in Model (2-A), we divide the left-hand side terms of the first m + s constraints of Model (1) into p parts, each with its own right-hand side. We decompose the right-hand side variable θ0 of the m input constraints into p variables ($\theta_i^k$, k = 1, ..., p), and the right-hand side term $y_{r0}$ of the s output constraints into p variables ($y_{r0}^k$, k = 1, ..., p). Property 1 below ensures the equivalence of the two models.

Property 1. Model (1) is equivalent to Model (2-A).

Proof: Assume that the optimal solution of Model (2-A) is ($\theta_i^{k*}$ (i = 1, ..., m, k = 1, ..., p), $\lambda_j^*$ (j = 1, ..., n), $y_{r0}^{k*}$ (r = 1, ..., s, k = 1, ..., p)). Substituting ($\lambda_j^*$, j = 1, ..., n) into Model (1), the left-hand side of each input constraint is less than or equal to $(\theta_i^{1*} + \theta_i^{2*} + \dots + \theta_i^{p*}) x_{i0}$. Letting $\theta_{i0} = \theta_i^{1*} + \theta_i^{2*} + \dots + \theta_i^{p*}$, the objective function of Model (1) coincides with that of Model (2-A), so the optimal objective value of Model (1) is less than or equal to that of Model (2-A). Conversely, any optimal solution of Model (1) yields a feasible solution of Model (2-A) with the same objective value, so the optimal value of Model (2-A) is less than or equal to that of Model (1). Together, the two models have an identical optimal value. Hence the proof.

To simplify the exposition, we illustrate Model (2-A) through a two-SP example. Without loss of generality, let m = 2, s = 2, p = 2 and n = 10.

$$
\begin{aligned}
\min\ E =\ & \theta_1^1 + \theta_1^2 \\
\text{s.t.}\ & \sum_{j=1}^{5} \lambda_j + \sum_{j=6}^{10} \lambda_j = 1, \quad
\theta_1^1 - \theta_2^1 + \theta_1^2 - \theta_2^2 = 0, \quad
y_{10}^1 + y_{10}^2 = y_{10}, \quad
y_{20}^1 + y_{20}^2 = y_{20}; && (3.1) \\
& \sum_{j=1}^{5} \lambda_j x_{ij} + s_i^{1-} - \theta_i^1 x_{i0} = 0,\ i = 1, 2; \quad
\sum_{j=1}^{5} \lambda_j y_{rj} - s_r^{1+} - y_{r0}^1 = 0,\ r = 1, 2; && (3.2) \\
& \sum_{j=6}^{10} \lambda_j x_{ij} + s_i^{2-} - \theta_i^2 x_{i0} = 0,\ i = 1, 2; \quad
\sum_{j=6}^{10} \lambda_j y_{rj} - s_r^{2+} - y_{r0}^2 = 0,\ r = 1, 2. && (3.3)
\end{aligned}
\tag{3-A}
$$

In Model (3-A), $(s_i^{1-}, s_r^{1+}, s_i^{2-}, s_r^{2+})$ denote slack variables. For brevity, we provide a matrix form of Model (3-A) in Model (3-B).

$$
\min\ E = C^{T} X, \qquad
\text{s.t.}\ \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} X =
\begin{bmatrix} D_1 & D_2 \\ F_1 & 0 \\ 0 & F_2 \end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} =
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
\tag{3-B}
$$

In Model (3-B), $C^T$ represents the coefficients of the objective function, $(D_1\ D_2)$ represents the coefficients of Constraints (3.1), and $(F_1\ 0)$, $(0\ F_2)$ correspond to Constraints (3.2) and (3.3), respectively. We divide all variables X into two parts, $X_1 = (\lambda_j,\ j = 1, \dots, 5;\ \theta_1^1, \theta_2^1;\ y_{10}^1, y_{20}^1;\ s_i^{1-}, s_r^{1+})^T$ and $X_2 = (\lambda_j,\ j = 6, \dots, 10;\ \theta_1^2, \theta_2^2;\ y_{10}^2, y_{20}^2;\ s_i^{2-}, s_r^{2+})^T$. Here $b_1 = (1, 0, y_{10}, y_{20})^T$ represents the right-hand side of Constraints (3.1), and $b_2 = (0, 0, 0, 0)^T$ represents the right-hand side of Constraints (3.2) and (3.3). With this matrix form, we are able to rewrite the constraints of Model (2-A) as Model (2-B).

$$
\min\ E = C^{T} X, \qquad
\text{s.t.}\ \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} X =
\begin{bmatrix}
D_1 & D_2 & \cdots & D_{p-1} & D_p \\
F_1 & 0 & \cdots & 0 & 0 \\
0 & F_2 & & \vdots & \vdots \\
\vdots & & \ddots & \vdots & \vdots \\
\vdots & \cdots & \cdots & F_{p-1} & 0 \\
0 & \cdots & \cdots & 0 & F_p
\end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} =
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
\tag{2-B}
$$

In Model (2-B), $A_1 = [\,D_1\ D_2\ \cdots\ D_p\,]$, and the remaining part of the coefficient matrix is $A_2$. Clearly, the variables corresponding to $D_k$ (k = 1, ..., p) are the same as the variables corresponding to $F_k$ (k = 1, ..., p). Below we introduce the MP based on $A_1$ and the SPs based on $F_k$ (k = 1, ..., p).

In the DW decomposition, the variables X in Model (2-B) are replaced by a convex combination of extreme points of the feasible domains of the SPs, i.e., X = EW, where E denotes the extreme points of the SPs and W denotes the convex multipliers. To avoid confusion, we note in passing that each column of E is an extreme point and that the number of columns of E is a parameter determined at run time. With this reformulation of the decision variables in place, we transform Model (2-B) into Model (2-C).

$$
\min\ E = C^{T} E W, \qquad
\text{s.t.}\ \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} E W =
\begin{bmatrix}
D_1 & D_2 & \cdots & D_{p-1} & D_p \\
F_1 & 0 & \cdots & 0 & 0 \\
0 & F_2 & & \vdots & \vdots \\
\vdots & & \ddots & \vdots & \vdots \\
\vdots & \cdots & \cdots & F_{p-1} & 0 \\
0 & \cdots & \cdots & 0 & F_p
\end{bmatrix}
\begin{bmatrix} E_1 W_1 \\ E_2 W_2 \\ \vdots \\ E_p W_p \end{bmatrix} =
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
\tag{2-C}
$$

In Model (2-C), $E_k W_k$ (k = 1, 2, ..., p) represents the convex combination of extreme points of the kth SP. Based on Model (2-C) and the above illustration of the DW decomposition, we decompose Model (2-A) into the MP (Model (4-A)) and p SPs (Model (5-A)). Though Model (4-A) looks daunting, its constraints and objective function are, respectively, $A_1 E W = b_1$ and $C^T E W$.
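The blocks that define the SPs in Models (2-A)–(2-C), and below in Model (5-A), are simply contiguous index ranges over the DMUs. The short sketch below (an illustration only, not part of the paper) computes the ranges $\lfloor (k-1)n/p \rfloor + 1, \dots, \lfloor kn/p \rfloor$ so that each SP, and hence each data owner, only ever touches its own slice of the input–output data.

```python
# Illustrative only: contiguous DMU index blocks used by Model (2-A) / the SPs.
import numpy as np

def partition_blocks(n, p):
    """Return the p index ranges [floor((k-1)n/p), floor(kn/p)), k = 1, ..., p (0-based)."""
    bounds = [k * n // p for k in range(p + 1)]
    return [range(bounds[k], bounds[k + 1]) for k in range(p)]

# Example: n = 10 DMUs and p = 2 SPs gives blocks {0,...,4} and {5,...,9},
# i.e. the lambda_1..lambda_5 and lambda_6..lambda_10 groups of the two-SP example.
blocks = partition_blocks(10, 2)
# Each SP (data owner) keeps only its slice of the data, e.g. X[:, blocks[k]], Y[:, blocks[k]].
```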


$$
\begin{aligned}
\min\ z =\ & \sum_{k=1}^{p} \sum_{l \in L_k} \omega_{(l)}^k \theta_{1(l)}^k \\
\text{s.t.}\ & \sum_{k=1}^{p} \sum_{l \in L_k} \omega_{(l)}^k \theta_{1(l)}^k = \sum_{k=1}^{p} \sum_{l \in L_k} \omega_{(l)}^k \theta_{2(l)}^k = \dots = \sum_{k=1}^{p} \sum_{l \in L_k} \omega_{(l)}^k \theta_{m(l)}^k, \quad 1 \ge \theta_i^k \ge 0; \\
& \sum_{j=1}^{n} \sum_{l \in L_k} \omega_{(l)}^k \lambda_{j(l)} = 1, \quad \lambda_j \ge 0; \qquad (\text{dual variables: } \pi) \\
& \sum_{k=1}^{p} \sum_{l \in L_k} \omega_{(l)}^k y_{r0(l)}^k = y_{r0}, \quad r = 1, \dots, s; \\
& \sum_{l \in L_k} \omega_{(l)}^k = 1, \quad k = 1, \dots, p; \qquad (\text{dual variables: } \alpha_k) \\
& \omega_{(l)}^k \ge 0.
\end{aligned}
\tag{4-A}
$$

$$
\begin{aligned}
\min\ z_k =\ & \left( C_k^{T} - \pi^{T} D_k \right) X_k - \alpha_k \\
\text{s.t.}\ & \sum_{j=\lfloor (k-1)n/p \rfloor + 1}^{\lfloor kn/p \rfloor} \lambda_j x_{ij} + s_i^{k-} - \theta_i^k x_{i0} = 0, \quad i = 1, \dots, m; \\
& \sum_{j=\lfloor (k-1)n/p \rfloor + 1}^{\lfloor kn/p \rfloor} \lambda_j y_{rj} - s_r^{k+} - y_{r0}^k = 0, \quad r = 1, \dots, s; \\
& \lambda_j,\ s_i^{k-},\ s_r^{k+} \ge 0, \quad y_{r0} \ge y_{r0}^k \ge 0.
\end{aligned}
\tag{5-A}
$$

Note that z is the optimal objective function value of the MP, $z_k$ (k = 1, ..., p) is the optimal objective function value of the kth SP, π denotes the dual variables associated with the first three rows of constraints, and $\alpha_k$ (k = 1, ..., p) denotes the dual variables of the convexity constraints; all these variables will be used extensively in the sequel. In Model (4-A), $[\theta_{1(l)}^k, \theta_{2(l)}^k, \dots, \theta_{m(l)}^k,\ y_{10(l)}^k, y_{20(l)}^k, \dots, y_{s0(l)}^k,\ \lambda_{j(l)}\ (j = \lfloor (k-1)n/p \rfloor + 1, \dots, \lfloor kn/p \rfloor)]$, $l \in L_k$, denote the extreme points of the kth SP, and $\omega_{(l)}^k$ (k = 1, 2, ..., p; $l \in L_k$) are the convex multipliers associated with the extreme points of the kth SP. In Model (5-A), it is worth stating that $C_k^T$ is the vector of objective function coefficients in the MP corresponding to the variables of the kth SP. Models (4-A) and (5-A) have the same structures as the respective matrix forms in Models (4-B) and (5-B) below.

$$
\min\ E = C^{T} E W, \qquad
\text{s.t.}\ A_1 E W = [\, D_1\ \cdots\ D_{p-1}\ D_p \,]
\begin{bmatrix} E_1 W_1 \\ E_2 W_2 \\ \vdots \\ E_p W_p \end{bmatrix} = b_1
\tag{4-B}
$$

$$
\min\ z_k = \left( C_k^{T} - \pi^{T} D_k \right) X_k - \alpha_k, \qquad
\text{s.t.}\ F_k X_k = b_2^k
\tag{5-B}
$$

3. The parallel DEA-DW algorithm

3.1. Description of the algorithm

Based on the above discussion, we provide the procedure referred to as the parallel DEA-DW algorithm in Algorithm 1.

Algorithm 1: Parallel DEA-DW
Step 1: Transform the standard BCC model (Model (1)) into a block angular structure (Model (2-A)). Partition the transformed program into one MP (Model (4-A)) with n_ini [1] variables and p SPs (Model (5-A)) with m + s + (n/p) variables each. Assign the MP to one computer and the p SPs to p computers.
Step 2: Identify an initial basic feasible solution for the MP by the Big M method. Initialize I = 0 [2].
Step 3: Solve the MP (Model (4-A)) and store z; pass π and α_k to all SPs.
Step 4: Set the objective functions of the SPs and solve the SPs (Model (5-A)).
Step 5: Gather the optimal objectives z_k (k = 1, ..., p) of all SPs; if all z_k are positive, go to Step 6, otherwise go to Step 7.
Step 6: The resulting optimal value is z* = z; the algorithm terminates.
Step 7: Let z_s = min{z_k : z_k ≤ 0}, s ∈ {1, ..., p}. Based on the optimal solution X_s* of the sth SP, generate the corresponding coefficient column with a new variable and add it to the MP; let I = I + 1. Go to Step 3.

[1] n_ini indicates the initial number of variables in the MP.
[2] I represents the number of iterations in the parallel DEA-DW algorithm, i.e., the number of columns generated.

To help interpret Algorithm 1, we provide a schematic diagram in Fig. 1, which shows the interaction mechanism between the MP and the SPs.

Fig. 1. Schematic diagram of the parallel DEA-DW algorithm.

It is worth noting that in Step 2 we use the Big M method to obtain an initial feasible solution, which initiates the iteration process of the algorithm. As described in Steps 3 and 4, the MP sends the computed parameters π and α_k (k = 1, 2, ..., p) to each SP; each SP sets its objective function with the received values and solves Model (5-A). Then each SP returns z_k and X_k* (k = 1, 2, ..., p) to the MP. When the MP has collected all z_k and X_k*, it solves Model (4-A) again, and so on, until all z_k (k = 1, ..., p) are positive.
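The interaction mechanism just described maps naturally onto a message-passing implementation. The skeleton below is a structural sketch only (not the authors' code): rank 0 plays the role of the MP and every other MPI process owns one SP; the helper functions marked "hypothetical" stand in for the problem-specific construction of Models (4-A) and (5-A) (Big-M initialization, dual extraction, and assembly of the generated coefficient column) and are not spelled out here.

```python
# Structural sketch of Algorithm 1 for one DMU evaluation (not the authors' implementation).
# Launch, e.g.:  mpiexec -n 3 python parallel_dea_dw.py   (1 MP process + p = 2 SP processes)
from mpi4py import MPI
import gurobipy as gp

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
p = comm.Get_size() - 1          # number of SPs
TOL = 1e-9

if rank == 0:
    # Steps 1-2: restricted MP (Model (4-A)) with Big-M artificial variables.
    mp = build_initial_master()                      # hypothetical helper
    while True:
        mp.optimize()                                # Step 3: solve the MP
        duals = extract_duals(mp)                    # hypothetical: (pi, alpha_1..alpha_p)
        comm.bcast(duals, root=0)                    # pass (pi, alpha_k) to all SPs
        results = comm.gather(None, root=0)          # Step 5: gather (z_k, X_k*, k)
        results = [r for r in results if r is not None]
        z_min, x_star, k_min = min(results, key=lambda r: r[0])
        if z_min > -TOL:                             # Step 6: all z_k non-negative -> done
            comm.bcast("stop", root=0)
            print("efficiency score:", mp.ObjVal)
            break
        comm.bcast("continue", root=0)
        # Step 7: column generation - add one variable with the coefficient column
        # (C_k^T X_k*, D_k X_k*, 0, ..., 1)^T generated from SP k_min.
        obj_coef, col_coefs = column_from_sp_solution(k_min, x_star)   # hypothetical
        mp.addVar(obj=obj_coef, lb=0.0,
                  column=gp.Column(col_coefs, mp.getConstrs()))
else:
    k = rank                                          # this process owns SP k
    sp = build_subproblem(k)                          # hypothetical: constraints of Model (5-A)
    while True:
        duals = comm.bcast(None, root=0)              # receive (pi, alpha_k) from the MP
        set_sp_objective(sp, duals, k)                # hypothetical: (C_k^T - pi^T D_k) X_k - alpha_k
        sp.optimize()                                 # Step 4: solve the SP locally
        comm.gather((sp.ObjVal, sp_solution(sp), k), root=0)   # hypothetical sp_solution
        if comm.bcast(None, root=0) == "stop":
            break
```

Note that, in such a layout, the only data crossing process boundaries are the MP's dual prices and the SPs' optimal objectives and solutions; the raw input–output data of each block never leave the process (or the data owner) that holds them.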
In the following, let us return to the two-SP example in Section 2.2. We can easily observe that the feasible domains represented by Constraints (3.2) and (3.3) are bounded from above.

In fact, $(\theta_1^1, \theta_1^2, \theta_2^1, \theta_2^2)$ are bounded above by 1, and $(y_{10}^1, y_{10}^2)$ and $(y_{20}^1, y_{20}^2)$ are bounded above by $y_{10}$ and $y_{20}$, respectively. Thus, it is equivalent to add these upper bounds to SP 1 and SP 2 so that the feasible domains of the SPs become explicitly bounded polyhedra. As a result, we only need the extreme points of Constraints (3.2) and (3.3) together with these upper bounds in the column generation process of the DW decomposition. Because the extreme points of the two SPs are sometimes difficult to obtain, we use the Big M method to initiate the procedure. Note that the MP has six constraints (four are shown in Constraints (3.1) and the other two are the convexity constraints corresponding to the two SPs). Thus, we attach six artificial variables to these six constraints and associate six big M's with them as coefficients in the objective function to construct a basic feasible solution for Steps 1 and 2.

Next, we obtain an initial dual price vector (π, α_1, α_2) for constructing the objective functions of the two SPs. Taking SP 1 as an example, its objective function is $\min z_1 = (C_1^T - \pi^T D_1) X_1 - \alpha_1$. Naturally, we may start two computers to solve the SPs in parallel. Once z_1 and z_2 of the two SPs are both positive, the original problem's optimal objective value z* equals the optimal value of the current MP, z; that is, we obtain the efficiency value of the DMU under evaluation. Otherwise, we choose the optimal solution of the SP with the most negative value to generate a column to enter the MP. For example, if z_2 < z_1 and z_2 < 0, we use the optimal solution of SP 2 to generate a new variable with the coefficient column $(C_2^T X_2, D_2 X_2, 0, 1)^T$ for the MP, where $C_2^T X_2$ is added to the objective function and $(D_2 X_2, 0, 1)^T$ enters the constraints. The remaining steps repeat the column generation process iteratively until the procedure converges to an optimal solution.

To sum up, two features of the proposed algorithm can be discerned. (1) Scalability. The data needed for the interactions are very small and the majority of the data are stored and maintained locally in the SPs; also note that the SPs are independent of one another and only need to interact with the MP. (2) Confidentiality. The owners of parts of the DMUs' information can construct their own SPs and carry out the computation with their private partial data, while the MP is constructed and solved by a central evaluator. The efficiency evaluation of a DMU can then be completed by the interaction mechanism depicted in Fig. 1 with the necessary data exchanges, without disclosing the information to other data owners.

3.2. Computing time analysis of the algorithm

In this subsection, we discuss the performance of the proposed parallel DEA-DW algorithm from the perspective of computing time.

The current paper assumes that the underlying algorithm used to solve the DEA model is the simplex algorithm. Dulá (2008) demonstrates that the simplex algorithm is more efficient than alternatives such as interior-point methods in DEA computation, and provides the approximate relationship $T = C_{(m,s,d)}\, n$, where T is the time required to solve one standard DEA problem, the factor $C_{(m,s,d)}$ is an inherent data attribute depending on the dimension (m, s) and the density d, and n is the cardinality. It also means that the total time required to obtain the efficiencies of all n DMUs is $T = C_{(m,s,d)}\, n^2$. Based on this, we can analyze the computing time of the parallel DEA-DW algorithm; Property 2 summarizes the main result.

Property 2. Assume p computers compute the p SPs in parallel. The computing time required to generate the efficiencies of all n DMUs by the parallel DEA-DW algorithm is given by Formula (6), where $T_{int}$ indicates the interaction time overhead for passing parameters to all SPs and gathering the optimal solutions from the SPs in one iteration.

$$
T_1 = \left( C^{MP}_{(m,s,d)}\, \frac{(2 n_{ini} + I)(I + 1)}{2} + \left( C^{SP}_{(m,s,d)}\, \frac{n}{p} + T_{int} \right) I \right) n
\tag{6}
$$

Proof: For the parallel DEA-DW algorithm, the computing time of a single DMU is spent on two tasks. One is the MP computation, whose size grows with the iteration count I; the other is the parallel computation of the SPs. The initial MP is an LP with $n_{ini}$ variables, and every time a column is generated by the SPs the size of the MP grows by one. So the first term of the formula represents the time consumed by the MP, which is the sum of an arithmetic series from $C^{MP}_{(m,s,d)}\, n_{ini}$ to $C^{MP}_{(m,s,d)}\, (n_{ini} + I)$. The rest is the time consumed by the SPs and the interaction. Since each SP is an LP with n/p variables, its solution time is $C^{SP}_{(m,s,d)}\, n/p$, to which the interaction time $T_{int}$ for passing the MP's parameters to all SPs and gathering their optimal solutions for comparison is added (the interaction overhead is small because only a few parameters of the MP need to be passed to the SPs and only the SPs' optimal solutions need to be gathered). Since every iteration involves one such pass, this term is multiplied by I. Finally, all n DMUs need to be evaluated, so the outermost part of Formula (6) is multiplied by n. Hence the proof.
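For convenience, Formula (6) can be evaluated with a few lines of code. The helper below is an illustrative sketch; the factors C^MP and C^SP are empirical constants in the spirit of Dulá (2008) and must be calibrated on the target machine, so the estimate it returns is only as good as those inputs.

```python
def estimated_total_time(n, p, I, n_ini, c_mp, c_sp, t_int):
    """Estimated time of the parallel DEA-DW algorithm for all n DMUs, Formula (6).

    c_mp, c_sp : empirical per-variable time factors C^MP_(m,s,d) and C^SP_(m,s,d)
    t_int      : interaction overhead per iteration
    """
    mp_time = c_mp * (2 * n_ini + I) * (I + 1) / 2   # arithmetic series n_ini, ..., n_ini + I
    per_iteration = c_sp * n / p + t_int             # parallel SP solve + message passing
    return (mp_time + per_iteration * I) * n
```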

4. Numerical study

In this section, we use different datasets to demonstrate the feasibility and validity of the proposed parallel DEA-DW algorithm. We run the parallel DEA-DW algorithm on two sources of datasets: randomly generated datasets and datasets from Dulá (2011). The first source is computer-generated random data: twelve n-by-(m + s) data matrices are randomly generated, with (m, s) = (2, 2), (3, 3), (4, 4), (5, 5) and n = 10000, 25000, 50000; the datasets can be viewed at https://github.com/1660622007/datasets-Parallel_DEA_DW. The other source is Dulá (2011), from which we use 16 datasets with dimension (m, s) = (7, 8), (10, 10), cardinality n = 25000, 50000, 75000, 100000 and density ρ = 0.10, 0.25.
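The paper does not state the sampling distribution of the randomly generated matrices, so the sketch below is only a plausible way to produce datasets of the stated shapes (the uniform sampling and its range are assumptions made here, not the authors' generator).

```python
# Illustrative sketch only: random n-by-(m+s) DEA data matrices of the stated shapes.
import numpy as np

def random_dea_dataset(n, m, s, seed=0):
    """Return an n-by-(m+s) data matrix: m input columns followed by s output columns."""
    rng = np.random.default_rng(seed)
    return rng.uniform(10.0, 100.0, size=(n, m + s))

# The twelve experimental settings behind Table 1:
datasets = {(m, s, n): random_dea_dataset(n, m, s)
            for (m, s) in [(2, 2), (3, 3), (4, 4), (5, 5)]
            for n in [10_000, 25_000, 50_000]}
```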

In the following experiments, we use the following tools and software: Gurobi, Python, mpi4py and a personal laptop (Gurobi 9.1.2, Gurobi Optimization, LLC, "Gurobi Optimizer Reference Manual", 2021, available at https://www.gurobi.com; Python 3.9; mpi4py, https://mpi4py.readthedocs.io/en/stable/; a personal laptop with an Intel Core i5-7300HQ CPU @ 2.50 GHz and 8 GB of memory). We use one computer running multiple processes to simulate multiple computers, where one process solves the MP and each of the remaining processes solves an SP; the process-to-process interaction is implemented with the support of mpi4py.

The performance results of the parallel DEA-DW algorithm are shown in Table 1 (3 processes) for our generated datasets and in Table 2 (3 processes) for the datasets from Dulá (2011).

Table 1
The performance of the proposed algorithm in randomly generated datasets with 3 processes.

Dimension | Cardinality | Density | n_ini | Avg. I | Avg. final MP size | SPs size | Avg. computing time (s)
2i2o | 10,000 | 0.62 % | 6 | 20.62 | 26.62 | 5,004 | 4.21
2i2o | 25,000 | 0.38 % | 6 | 23.65 | 29.65 | 12,504 | 11.96
2i2o | 50,000 | 0.19 % | 6 | 23.34 | 29.34 | 25,004 | 23.49
3i3o | 10,000 | 2.88 % | 8 | 33.29 | 41.29 | 5,006 | 6.66
3i3o | 25,000 | 1.50 % | 8 | 34.35 | 42.35 | 12,506 | 17.02
3i3o | 50,000 | 0.98 % | 8 | 33.06 | 41.06 | 25,006 | 32.78
4i4o | 10,000 | 6.99 % | 10 | 40.63 | 50.63 | 5,008 | 8.04
4i4o | 25,000 | 2.05 % | 10 | 44.58 | 54.58 | 12,508 | 21.98
4i4o | 50,000 | 3.00 % | 10 | 44.77 | 54.77 | 25,008 | 44.46
5i5o | 10,000 | 13.28 % | 12 | 53.37 | 65.37 | 5,010 | 10.62
5i5o | 25,000 | 8.36 % | 12 | 53.98 | 65.98 | 12,510 | 26.74
5i5o | 50,000 | 5.65 % | 12 | 56.02 | 68.02 | 25,010 | 55.41

Table 2
The performance of the proposed algorithm in datasets of Dulá (2011) with 3 processes.

Dimension | Cardinality | Density | n_ini | Avg. I | Avg. final MP size | SPs size | Avg. computing time (s)
2i3o | 25,000 | 10 % | 7 | 30.4 | 42.4 | 12,510 | 13.74
2i3o | 25,000 | 25 % | 7 | 31.3 | 43.3 | 12,510 | 12.32
2i3o | 50,000 | 10 % | 7 | 27 | 39 | 25,010 | 24.13
2i3o | 50,000 | 25 % | 7 | 31.4 | 43.4 | 25,010 | 24.23
5i5o | 25,000 | 10 % | 12 | 60.9 | 67.9 | 12,514 | 27.43
5i5o | 25,000 | 25 % | 12 | 63.3 | 80.3 | 12,514 | 25.32
5i5o | 50,000 | 10 % | 12 | 60.3 | 77.3 | 25,014 | 54.72
5i5o | 50,000 | 25 % | 12 | 62.6 | 79.6 | 25,014 | 55.46
7i8o | 25,000 | 10 % | 17 | 97.7 | 119.7 | 12,520 | 42.05
7i8o | 25,000 | 25 % | 17 | 100.2 | 122.2 | 12,520 | 46.36
7i8o | 50,000 | 10 % | 17 | 94.9 | 116.9 | 25,020 | 88.81
7i8o | 50,000 | 25 % | 17 | 101.7 | 123.7 | 25,020 | 90.2
10i10o | 25,000 | 10 % | 22 | 148.5 | 155.5 | 12,504 | 65.98
10i10o | 25,000 | 25 % | 22 | 147.3 | 170.54 | 12,504 | 64.24
10i10o | 50,000 | 10 % | 22 | 151.8 | 158.8 | 25,004 | 133.13
10i10o | 50,000 | 25 % | 22 | 163.1 | 170.1 | 25,004 | 134.97

In Tables 1 and 2, we give three attributes of each dataset: dimension, cardinality and density. For example, dimension = 2i2o, cardinality = 10000 and density = 0.62 % mean that the dataset has 10,000 DMUs with 2 inputs and 2 outputs, and that the proportion of efficient DMUs is 0.62 %. As mentioned before, n_ini indicates the initial number of variables in the MP. Avg. I, Avg. final MP size and SPs size indicate the average number of iterations, the average number of variables in the final MP and the number of variables in each SP, respectively. Avg. computing time is the average computing time (in seconds) required to compute one DMU's efficiency.

As shown in Table 1, in the first dataset (2i2o, 10000, 0.62 %) the number of variables in each SP is 5004, comprising 5000 for the λ_j, 2 for the θ_i^k and 2 for the y_{r0}^k. In other words, the number of variables in each SP is determined by the number of DMUs and the number of dimensions assigned to the SP, while the number of constraints per SP is 4, which is determined by the number of dimensions. The average final number of variables in the MP is 26.62, of which the initial MP has 6 variables (i.e., n_ini) and 20.62 variables are added through column generation (equal to Avg. I).

It is worth stating that Avg. I equals the number of information interactions required between the MP and each SP for each DMU, and the number of times the MP and each SP are solved. It can therefore be regarded as the most critical index of our algorithm, since it largely determines the size of the MP and the number of MP and SP solves, i.e., the main computational cost.

To further show the feasibility of the proposed algorithm, we simulate more computers, i.e., more SPs, and the results for Avg. I are shown in Figs. 2–5.

The following observations can be made based on Figs. 2–5. (1) As the number of SPs increases, Avg. I also increases, regardless of the dimension: Figs. 2 and 3 show an unambiguous upward trend. It should be noted, however, that each SP then becomes smaller, so the computational cost of each SP is reduced. (2) As the dimension increases, Avg. I also increases, regardless of the number of SPs: in Figs. 2 and 3 the line for a lower dimension lies below the line for a higher dimension, and in Figs. 4 and 5 the heights of the columns for different dimensions differ markedly. The more dimensions, the more variables are involved in the MP and the SPs and, as a result, the more iterations are needed to find an optimal solution. (3) The cardinality has a small effect on Avg. I: Figs. 4 and 5 show that, within the same dimension setting, the effect of cardinality is less prominent. In summary, the critical factors affecting Avg. I are the dimension and the number of SPs, while the cardinality has only a weak effect.

Fig. 2. Relations between the number of SPs and the average iterations in different cases of cardinality in randomly generated datasets.

Fig. 3. Relations between the number of SPs and the average iterations in different cases of cardinality in datasets of Dulá (2011).


Fig. 4. Relations between the dimension and the average iterations in different cases of SPs in randomly generated datasets.

Fig. 5. Relations between the dimension and the average iterations in different cases of SPs in datasets of Dulá (2011).


5. Conclusion

As data become more and more available, the difficulty of handling large-scale datasets in DEA applications has attracted the attention of many researchers. Beyond this issue, the data privacy issue has not been discussed in designing efficient algorithms in DEA research. There is a potential conflict between the requirement of full data for implementing an algorithm and the stark reality that the data are owned by different owners who are unwilling to transfer them to a central location.

In order to address these problems, we first study the combination of the DEA model and the DW decomposition, and then propose the parallel DEA-DW algorithm. In addition, we analyze the computing time of the proposed algorithm. Subsequently, we demonstrate the feasibility of our algorithm through simulations. We find that it is feasible to use the DW decomposition to decompose large-scale DEA problems, with the size of each SP depending on the number of SPs into which the problem is decomposed. The parallel DEA-DW algorithm can handle both the DEA computation problem for large-scale datasets and data confidentiality. The key index, the average number of iterations, is tightly related to the number of dimensions and the number of SPs, while the cardinality has only a small effect on it.

Our proposed algorithm provides a solution strategy for a practical implementation of DEA computations when large-scale datasets exceed the capacity of RAM, and the amount of information exchanged in the computation process is small. Moreover, the data privacy issue has attracted widespread public attention and has been discussed by many scholars (Horvitz and Mulligan, 2015; Landau, 2015). Our proposed method provides a solution that avoids disclosing DMU information and thereby achieves data confidentiality. By respecting data confidentiality, our method broadens the scope of applications of DEA models. Vaccine distribution is a potential case in point, where countries could obtain distribution strategies through DEA methods without revealing sensitive information.

Finally, there are still some limitations left for future development. First, regarding the initial feasible solution, we use the Big M method; iterating from this solution, the MP and the SPs may require many interactions, and a better method for obtaining the initial feasible solution may reduce the number of interactions. Second, the proposed algorithm may be applied to more scenarios, such as the case where the dataset is stored in a distributed form.

CRediT authorship contribution statement

Jingjing Ding: Conceptualization, Methodology, Writing – original draft, Writing – review & editing, Funding acquisition. Shengqing Chang: Data curation, Software, Visualization, Writing – original draft, Writing – review & editing. Ruifeng Wang: Data curation, Software, Writing – original draft. Chenpeng Feng: Writing – review & editing, Supervision, Project administration, Funding acquisition. Liang Liang: Validation, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgment

The authors would like to thank the guest editor and the anonymous reviewers for their constructive comments and invaluable suggestions. Jingjing DING and Shengqing CHANG are joint first authors. This research is supported by the National Natural Science Foundation of China (Nos. 71771074, 71971072, 72188101, 71971074).

References

Ali, A. I. (1993). Streamlined computation for data envelopment analysis. European Journal of Operational Research, 64(1), 61–67.
Ali, A. I. (1994). Computational aspects of DEA. In Data envelopment analysis: Theory, methodology, and applications (pp. 63–88). Netherlands: Springer.
Banker, R. D., Charnes, A., & Cooper, W. W. (1984). Some models for estimating technical and scale inefficiencies in data envelopment analysis. Management Science, 30(9), 1078–1092.
Barr, R. S., & Durchholz, M. L. (1997). Parallel and hierarchical decomposition approaches for solving large-scale data envelopment analysis models. Annals of Operations Research, 73(1), 339–372.
Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring the efficiency of decision making units. European Journal of Operational Research, 2(6), 429–444.
Chen, W. C., & Cho, W. J. (2009). A procedure for large-scale DEA computations. Computers & Operations Research, 36(6), 1813–1824.
Chen, W. C., & Lai, S. Y. (2017). Determining radial efficiency with a large data set by solving small-size linear programs. Annals of Operations Research, 250(1), 147–166.
Dantzig, G. B., & Wolfe, P. (1960). Decomposition principle for linear programs. Operations Research, 8(1), 101–111.
Dellnitz, A. (2022). Big data efficiency analysis: Improved algorithms for data envelopment analysis involving large datasets. Computers & Operations Research, 137, Article 105553.
Dulá, J. H. (2008). A computational study of DEA with massive data sets. Computers & Operations Research, 35(4), 1191–1203.
Dulá, J. H. (2011). An algorithm for data envelopment analysis. INFORMS Journal on Computing, 23(2), 284–296.
Dulá, J. H., & López, F. J. (2002). Data envelopment analysis (DEA) in massive data sets. In Handbook of massive data sets (pp. 419–437). Boston, MA: Springer.
Dulá, J. H., & López, F. J. (2009). Preprocessing DEA. Computers & Operations Research, 36(4), 1204–1220.
Dulá, J. H., & López, F. J. (2013). DEA with streaming data. Omega, 41(1), 41–47.
Dulá, J. H., & Thrall, R. M. (2001). A computational framework for accelerating DEA. Journal of Productivity Analysis, 16(1), 63–78.
Horvitz, E., & Mulligan, D. (2015). Data, privacy, and the greater good. Science, 349, 253–255.
Jie, T. (2020). Parallel processing of the build hull algorithm to address the large-scale DEA problem. Annals of Operations Research, 295, 453–481.
Khezrimotlagh, D. (2021). Parallel processing and large-scale datasets in data envelopment analysis. In Data-enabled analytics. International Series in Operations Research & Management Science (pp. 159–198). Cham: Springer.
Khezrimotlagh, D., & Zhu, J. (2020). Data envelopment analysis and big data: Revisit with a faster method. In Data science and productivity analytics (pp. 1–34). Boston, MA: Springer.
Khezrimotlagh, D., Zhu, J., Cook, W. D., & Toloo, M. (2019). Data envelopment analysis and big data. European Journal of Operational Research, 274(3), 1047–1054.
Korhonen, P. J., & Siitari, P. A. (2007). Using lexicographic parametric programming for identifying efficient units in DEA. Computers & Operations Research, 34(7), 2177–2190.
Korhonen, P. J., & Siitari, P. A. (2009). A dimensional decomposition approach to identifying efficient units in large-scale DEA models. Computers & Operations Research, 36(1), 234–244.
Landau, S. (2015). Control use of data to protect privacy. Science, 347, 504–506.
Yu, A., Shi, Y., & Zhu, J. (2021). Acceleration of large-scale DEA computations using random forest classification. In Data-enabled analytics. International Series in Operations Research & Management Science (pp. 31–50). Cham: Springer.
Zhu, Q., Wu, J., & Song, M. (2018). Efficiency evaluation based on data envelopment analysis in the big data context. Computers & Operations Research, 98, 291–300.