
Accepted Article

Accepted Manuscript
Title: Big Data Analytics in Operations Management

Authors: Tsan-Ming Choi, Stein W. Wallace, Yulan Wang

DOI: https://doi.org/doi:10.1111/poms.12838
Reference: POMS 12838
To appear in: Production and Operations Management

Please cite this article as: Choi Tsan-Ming., et al., Big Data
Analytics in Operations Management. Production and Operations Management (2017),
https://doi.org/doi:10.1111/poms.12838

This article has been accepted for publication and undergone full peer review but has not
been through the copyediting, typesetting, pagination and proofreading process, which may
lead to differences between this version and the Version of Record. Please cite this article as
doi: 10.1111/poms.12838
Article Type: Original Article

Big Data Analytics in Operations Management

Tsan-Ming Choi1

Business Division, Institute of Textiles and Clothing, Faculty of Applied Science and Textiles,

The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong.

Email: jason.choi@polyu.edu.hk; Phone: 852-27666450.

Stein W. Wallace

Department of Business and Management Science,

NHH Norwegian School of Economics, NO-5045 Bergen, Norway

Email: Stein.Wallace@nhh.no; Phone: +47 55 95 93 84.

Yulan Wang

Faculty of Business, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong

Email: yulan.wang@polyu.edu.hk; Phone: 852-27664683.

1
For correspondence.


This article is protected by copyright. All rights reserved.


Final version for Production and Operations Management.
Big Data Analytics in Operations Management

Abstract: Big data analytics is critical in modern operations management (OM). In this paper, we first
examine the existing big data related analytics techniques, and identify their strengths, weaknesses
as well as major functionalities. We then discuss various big data analytics strategies to overcome
the respective computational and data challenges. After that, we examine the literature and discuss
how different types of big data methods (techniques, strategies and architectures) can be applied to
different OM topical areas, namely forecasting, inventory management, revenue management and
marketing, transportation management, supply chain management, and risk analysis. We also
investigate real world applications of big data analytics in top branded enterprises. Finally, we
conclude the paper with a discussion of a future research agenda.

Key Words: Big data analytics, big data methods, operations management, data-driven optimization,
applications.

History: Received: 8 December 2017; accepted: 20 December 2017 by Kalyan Singhal after one
revision.

1. Introduction

1.1. Background

We are now in the big data era. Internet of things (IoTs), cloud computing (Passacantando et al.
2016), wireless sensor networks (Takaishi et al. 2014; Ding et al. 2016), and social media are all
commonly used terminologies related to big data in our everyday lives. Big data here refers to the
situation when the dataset exhibits several characteristics, such as high volume, high variety, and
high “required” data processing velocity2. There is no doubt that future socio-economic
developments will rely heavily on big data and the related information technologies and methods.
Operations management (OM) is commonly known as the discipline which employs scientifically
sound analytical methods to help make optimal (or near optimal) decisions for organizations3. It is
inherently related to the use of data. To solve OM problems, computing algorithms based on
statistical and mathematical models are needed. Thus, big data analytics4 is in fact closely related to
OM and should be regarded as one of the most prominent recent developments in the field.

In the big data era, new challenges emerge regarding the computing requirements and
strategies to conduct OM analysis. In particular, we observe that more and more companies and
organizations are employing big data related technologies such as information and communication
technology (ICT), enterprise resources planning (ERP) systems, cloud computing, IoTs, and social
media in their operations. All these sensor and computing systems store and manipulate a massive
amount of data which is highly heterogeneous (including both structured and unstructured data
points) and diversified (Drosou et al. 2017), and requires very speedy processing. This requirement
leads to the rapid development of big data analytics, which motivates us to develop this paper.

To be specific, in this paper, we first search the OM literature and review existing big data
analytics techniques and strategies. We then provide a concise review of the literature in different
important OM topical areas. After that, we discuss how big data methods (techniques, strategies and
architectures) can be applied in different topical areas; namely forecasting, inventory management,
revenue management and marketing, transportation management, supply chain management, and
risk analysis. Some real world applications of big data analytics in top enterprises are also examined.
Finally, we conclude the paper with a discussion of a future research agenda.

1.2. Methodology

We do not intend to report an exhaustive review of the topic. Instead, we search the Web of
Science for papers published in SCI/SSCI journals in the operations research and management
science category. We supplement this with Google Scholar searches using primary keywords such as
“big data”, “data driven”, and “data analytics”, together with secondary keywords like
“operations”, “operational”, “management”, “marketing”, and “optimization”.

2 Notice that there are other “V”s which relate to big data. Refer to Choi, Gao, et al. (2017).
3 In this paper, OM includes management science and operations research with an emphasis on employing
analytical and scientific methods.
4 In this paper, “big data analytics” is treated as a singular term.

Some additional papers were found based on our own experience in the area and
recommendations by others5. All major searching was done in July-August 2017. Case study
materials were added afterwards with searches in September 2017. As a remark, there are prior
review and discussion papers related to big data analytics and/or their applications in operations.
For example, Chen et al. (2012) examine the business intelligence and analytics literature and
highlight the impacts brought by big data. Chen and Zhang (2014) review the literature and present
the big data related techniques and methods from an information science technical perspective. Hu
et al. (2014) present an overview of big data, from its definition, history to paradigms. They also
investigate the big data value chain which covers the entire big data cycle from data generation,
data acquisition, data storage to data analysis. From a system engineering perspective, Wang and He
(2016) discuss the main challenges and fundamental strategies on big data computing. The authors
also highlight the challenges and importance of uncertainty in big data analytics. Other reviews and
discussion papers related to big data analytics include an editorial discussion on research issues
around big data in information systems research (Agarwal and Dhar 2014), a systematic review on
the business value of big data (Wamba et al. 2015), a review on the role played by cloud computing
in big data analytics (Hashem et al. 2015), an update of the development in big data analytics for
business applications and risk analysis (Choi, Chan and Yue 2017), a taxonomy of the literature on
IoTs-supported big data analytics (Ahmed et al. 2017) and an examination of big data analytics for
risk analysis (Choi and Lambert 2017).

Most recently, Guha and Kumar (2017) discuss the emergence of big data research and they
examine the topic from the following three perspectives: information systems, operations and
supply chain management, and healthcare. Feng and Shanthikumar (2017) propose analytical
models for probable future research in manufacturing and demand management in the big data era.
Fisher and Raman (2017) explore the use of big data in retail operations. Despite having a number of
review papers around big data analytics, to the best of our knowledge, none of them explicitly highlights the
OM studies with big data analytics methods (techniques, strategies and architectures), or discusses
how big data analytics maps into OM and the respective applications. This paper hence bridges this
important gap and positions itself as the pioneering review on the topic.

5 We thank Professor Kalyan Singhal for recommending a few important related papers to us.



2. Big Data Analytics Techniques

Big data analytics involves the processing of data from different sources in different formats. For
example, data can come from the web, social media, ERP systems, and cloud platforms, and they can
be given in text, graphic, audio and video formats. This hence creates terminologies such as web
analytics, social analytics, network analytics, text analytics, and multi-media analytics (see Chen et al.
(2012) and Hu et al. (2014)). In addition, data processing schemes can be split into three types,
namely batch processing, real time (or near real time) stream processing, and interactive processing
(with human interactive inputs-outputs). There are technological supporting platforms for each type
of processing (Chen and Zhang 2014). For instance, Apache Hadoop6 is probably the most famous
batch processing software platform (it implements Map/Reduce, a computational paradigm that
follows the divide and conquer strategy). Dryad and Pentaho Business Analytics are other
examples of batch processing platforms. For real time stream processing, SAP Hana is a software
platform. Storm and S4 are also well-established real time streaming systems which support big data
analytics. For interactive processing systems, Dremel by Google and Apache Drill are examples. As
the technical details behind these schemes are beyond the scope of this review, we refer interested
readers to Chen and Zhang (2014) for more details. In the following, we examine several commonly
used techniques for big data analytics. Note that these techniques are not mutually exclusive and
they naturally overlap to some extent.

Statistics: Statistics is a well-established area and it aims to provide a scientific framework to
collect (e.g., by sampling) and analyze data, and to draw inferences and conclusions (e.g., by
statistical testing). Many statistical methods have been developed to highlight relationships (e.g.,
correlations), and statistical regression is commonly used in practice (Huang and Chaovalitwongse
2015). Multivariate statistical analysis is also a powerful tool for business analytics. Statistics is
known to be fast and hence can be used to meet the speedy computation requirements related to big
data analytics. However,
standard statistical methods (such as the ones in standard business statistics) are usually not
versatile enough to fit the other requirements of big data analytics such as the need to deal with
heterogeneous and unstructured data sets (Chen and Zhang 2014).

6 Hadoop has been improved, e.g., by an integration with “R” to enable parallel processing, as well as other
extensions including Hadoop-ML (Wu et al. 2014).



Machine learning: In artificial intelligence, well-established methods such as neural networks,
support vector machines, and statistical machine learning are all classified as machine learning. In
fact, machine learning provides algorithms for computers to discover knowledge and make decisions
by first learning from the given data. In big data analytics, machine learning methods have to be
improved, both for the supervised and unsupervised learning schemes. Deep machine learning,
parallel support vector machines, fast learning (Sun et al. 2017), distributed machine learning (Xing
et al. 2015), ontology learning (Lau et al. 2015) and models like Map/Reduce are all machine learning
techniques for dealing with big data problems.

Data mining: Data mining is a process of extracting insights from a given data set. It is the
cornerstone for business intelligence and big data analytics (Choi et al. 2017). It can be used in areas
such as market segmentation, collaborative processes (Fan et al. 2017), classification, clustering
(Fahad et al. 2014) and regression. Presently, data mining is highly specialized with a lot of different
functional areas and approaches. For instance, we have sequential and temporal mining, spatial
mining, process mining, privacy-preserving mining, network mining, web mining, etc., all of which
are associated with big data analytics. Usually, data mining models are developed based on machine
learning and statistics. For some challenges associated with data mining with big data, including the
multi-source data mining mechanism and the dynamic data mining methods, see Wu et al. (2014).

Optimization: Optimization is a standard analytical approach to finding the optimal (or near
optimal) solutions in quantitative decision-making problems. In business applications, methods like
genetic algorithms (Kershenbaum 1997), simulated annealing, particle filters, and many other
evolutionary algorithms (Potvin 2009) are well-developed ways to find solutions in a reasonably
short time. In big data analytics, computational optimization methods face challenges on memory
and computational time, convergence and identification of globally optimal solutions, and the need
for real-time optimization (Huang and Chaovalitwongse 2015).
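
As an illustration of one of the methods named above, the following is a minimal sketch of simulated annealing on a one-dimensional cost function. The function `bumpy`, the cooling schedule, and all parameter values are assumptions chosen for illustration; they are not taken from the cited studies.

```python
import math
import random
from typing import Callable, Tuple


def simulated_annealing(cost: Callable[[float], float], x0: float, rng: random.Random,
                        steps: int = 5000, t0: float = 1.0, cooling: float = 0.999,
                        step_size: float = 0.5) -> Tuple[float, float]:
    """Return the best point found and its cost (illustrative parameter defaults)."""
    x, fx = x0, cost(x0)
    best_x, best_fx = x, fx
    temperature = t0
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        fc = cost(candidate)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if fc <= fx or rng.random() < math.exp((fx - fc) / temperature):
            x, fx = candidate, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best_x, best_fx


def bumpy(x: float) -> float:
    """Hypothetical cost function with many local minima."""
    return x * x + 3.0 * math.sin(5.0 * x)


start = 4.0
best_x, best_value = simulated_annealing(bumpy, start, random.Random(42))
```

Because the routine tracks the best point seen so far, the returned cost is never worse than the cost at the starting point, although global optimality is not guaranteed.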

Others: In addition to the above four mainstream and major big data analytics techniques,
other techniques such as social network analysis (Banerjee et al. 2016), clustering algorithm analysis
(Fahad et al. 2014), data envelopment analysis (Zhu et al. 2017), and visualization analysis (Strehl
and Ghosh 2003) are known to be useful for big data analytics.

Table 2.1 summarizes the strengths and weaknesses of the four major techniques, and the
corresponding development areas to cope with the big data challenges. Table 2.2 further shows the
major functionalities of the four major techniques in big data analytics.



Table 2.1. Strengths, weaknesses, and the needed extensions of the major big data analytics
techniques

Statistics
  Strengths: Well-established, fast, and analytically tractable.
  Weaknesses: Not rigorous and versatile enough to deal with big data challenges such as
  heterogeneous data types.
  Focused development areas for big data analytics: 1. Parallel processing. 2. Statistical computing
  and learning. 3. Hybrid methods.

Machine Learning
  Strengths: Versatile and flexible in making use of data to capture complex behaviors.
  Weaknesses: Time consuming in training.
  Focused development areas for big data analytics: 1. Deep machine learning. 2. Scaling up machine
  learning. 3. Fast learning algorithms. 4. Parallel support vector machine. 5. Parallel processing.

Data Mining
  Strengths: Combining statistical and machine learning models which make it versatile to deal with
  different types of data.
  Weaknesses: Suffering the weaknesses of the underlying models.
  Focused development areas for big data analytics: 1. Clustering techniques. 2. Distributed and
  parallel processing. 3. Multi-media processing.

Optimization
  Strengths: Well-established and analytically tractable.
  Weaknesses: Traditional optimization methods may fail to satisfy the big data requirements such as
  speedy processing time.
  Focused development areas for big data analytics: 1. Real time optimization. 2. Data reduction.
  3. Parallel processing. 4. Large-scale optimization.

From Table 2.1, it is obvious that different big data analytics techniques have their respective
strengths and weaknesses. Thus, recent research is exploring how to better utilize them for different
kinds of applications.



Table 2.2. Major functionalities and current trends of the major big data analytics techniques

Statistics
  1. Determine correlations and data patterns, and identify the data relationship (e.g., by regression)
  in a quick manner.
  2. Show whether a sample can be used to denote the population, which helps to reduce data
  requirements and computational time.
  3. Serve as a basic technique to support other techniques such as data mining.

Machine Learning
  1. By intelligence, learn, evolve and capture behaviors of the systems under study.
  2. Can capture complex relationships in the systems but require substantial training time and memory.
  3. Flexible and able to support the processing of different data types by image processing, pattern
  recognition, text recognition, etc.
  4. Serve as a basic technique to support other techniques such as data mining.

Data Mining
  1. Extract useful information from data by employing statistical and machine learning models.
  2. Clustering analysis, segmentation analysis, dynamic data mining with multi-source huge datasets.

Optimization
  1. Find the optimal solution of quantitative analytical models.
  2. Require extensions to deal with real time processing, parallel processing, etc., in large scale
  optimization.

3. Big Data Analytics Strategies

Big data analytics faces various challenges which make it different from typical data analytics.
From the data side, these challenges include having a massive amount of data points (big data
volume, high dimension), the presence of complex data (high variety of data with different classes
and types), and the existence of high uncertainty. From the computing side, many existing methods
are not flexible or scalable enough to adapt to the requirements of big data. They also suffer from
the curse of dimensionality, in which they cannot cope with huge-dimensional problems. To
overcome these challenges, there are a few critical strategies in the literature7. We examine them
one by one as follows.
Divide and Conquer: By its nature, big data is huge in size and beyond the computational
power of the existing information systems available to the organization. Thus, an intuitive and
fundamental method is to break down the big data into multiple pieces and make them small
enough to be solved one by one. We can then obtain the final analysis by combining the separate
results. This is the general idea behind the “divide and conquer” strategy, which is very commonly
adopted in computing, e.g., in super-computer systems with huge databases, even before the
emergence of “big data analytics”. Note that recent literature has discovered that granular
computing, which employs granules like subsets, classes or clusters, is able to establish computing
models to analyze big data. As granular computing also aims to explore smaller-sized problems in
granules from the original big problem, it is also a special kind of “divide and conquer” strategy.
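
The divide, solve, and combine steps described above can be sketched as follows. This is a minimal illustration, assuming the quantity of interest (here the mean and population variance) can be rebuilt from small per-chunk summaries; the function names and the chunk size are hypothetical.

```python
from typing import Iterable, List, Tuple


def split_into_chunks(data: List[float], chunk_size: int) -> Iterable[List[float]]:
    """Divide: yield consecutive slices of at most chunk_size items."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def summarize(chunk: List[float]) -> Tuple[int, float, float]:
    """Conquer: reduce one chunk to (count, sum, sum of squares)."""
    return len(chunk), sum(chunk), sum(x * x for x in chunk)


def combine(parts: Iterable[Tuple[int, float, float]]) -> Tuple[float, float]:
    """Combine: merge the partial summaries into the overall mean and population variance."""
    n, total, total_sq = 0, 0.0, 0.0
    for count, s, sq in parts:
        n += count
        total += s
        total_sq += sq
    mean = total / n
    variance = total_sq / n - mean * mean
    return mean, variance


data = [float(x) for x in range(1, 101)]  # illustrative data: 1.0 .. 100.0
mean, variance = combine(summarize(c) for c in split_into_chunks(data, chunk_size=7))
```

Only one chunk needs to be held in memory at a time, and the combined result matches what a single pass over the full dataset would give.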

Distributed and Parallel Processing (DPP): Facing a big dataset, one may also process the data
by multiple parallel and distributed computing systems. This concept is consistent with the divide
and conquer strategy. However, distributed and parallel processing focuses on the importance of
having parallel processing so that the big dataset is being analyzed at the same time by multiple
distributed processors. It has a high degree of flexibility. Recent research also highlights the
importance of distributed machine learning (see, e.g., Xing et al. 2015).
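
A minimal sketch of the same partition-and-combine pattern with the partitions handed to a pool of workers is given below. It uses Python's standard `concurrent.futures` module; the helper names are hypothetical. A thread pool is used here only for portability. For CPU-bound work, a process pool or a cluster framework would be the more realistic choice.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List


def count_words(partition: List[str]) -> Counter:
    """Work done independently by one worker on its own partition."""
    counts: Counter = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines: List[str], n_workers: int = 4) -> Counter:
    # Round-robin partitioning: worker i receives lines i, i + n, i + 2n, ...
    partitions = [lines[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    # Merge the partial results produced by the workers.
    total: Counter = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


lines = [
    "big data analytics",
    "data driven operations",
    "big data in operations management",
]
totals = parallel_word_count(lines, n_workers=2)
```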

Incremental Learning using New Cases (ILNC): In machine learning, the training process
requires time and when we are given many data points, the training time becomes even more
substantial. This is a hurdle to big data analytics because we need to have quick processing and even
want to achieve real time processing. The ILNC approach aims to incrementally improve the
machine learning algorithms by using the new cases, i.e., new data blocks. This approach requires
the presence of good computing memory so that the knowledge discovered by the trained data sets
will be well-stored.
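
The idea of updating a model with new data blocks, instead of retraining on everything, can be illustrated with a deliberately simple stand-in: a one-variable linear regression whose stored running sums play the role of the "memory" mentioned above. The class name and the data are hypothetical, and real ILNC methods for complex learners are considerably more involved.

```python
from typing import Iterable, Tuple


class IncrementalLinearModel:
    """Simple linear regression y = a + b * x, updated block by block."""

    def __init__(self) -> None:
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, block: Iterable[Tuple[float, float]]) -> None:
        """Fold a new block of (x, y) cases into the stored sums."""
        for x, y in block:
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def coefficients(self) -> Tuple[float, float]:
        """Return (intercept, slope) from the sums alone, without revisiting old data."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


model = IncrementalLinearModel()
model.update([(0.0, 1.0), (1.0, 3.0)])  # first data block
model.update([(2.0, 5.0), (3.0, 7.0)])  # new cases arriving later
intercept, slope = model.coefficients()
```

Each update costs time proportional to the size of the new block only, which is the property that makes incremental schemes attractive when quick processing is required.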

Statistical Inference: Statistical inference includes statistical sampling and relationship
establishment. In big data analytics, the idea is to make use of a statistical approach to learn about
the relationship between samples and the population. This helps to justify whether it is sufficient to
process a smaller sample from the big data population and overcomes the respective challenge.

7 The big data analytics field is developing rapidly. These strategies are not meant to be exhaustive but they do
represent the commonly seen “mainstream” methods to deal with big data challenges, e.g., see Chen et al.
(2014), Hu et al. (2014), and Wang and He (2016).
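
As a sketch of the sampling idea in the statistical inference strategy, the snippet below estimates a population mean from a random sample and reports the half-width of an approximate 95% confidence interval. The synthetic population, the sample sizes, and the normal approximation (with no finite-population correction) are all assumptions made for illustration.

```python
import math
import random
import statistics
from typing import List, Tuple


def estimate_mean(population: List[float], sample_size: int,
                  rng: random.Random) -> Tuple[float, float]:
    """Return (sample mean, half-width of an approximate 95% confidence interval)."""
    sample = rng.sample(population, sample_size)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return mean, 1.96 * std_err


rng = random.Random(0)
# Synthetic "big" population, used only for illustration.
population = [float(rng.randint(0, 999)) for _ in range(100_000)]
small_mean, small_hw = estimate_mean(population, 100, rng)
large_mean, large_hw = estimate_mean(population, 10_000, rng)
```

Comparing the two half-widths shows how the analyst can trade sample size against precision and decide when a sample is large enough to stand in for the full dataset.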
Feature Selection: To reduce dimensionality, the feature selection strategy is commonly
adopted. Its idea is to determine a subset from the big dataset which is good enough to represent its
core features. This is a critical strategy to overcome the curse of dimensionality faced by many
optimization methods.

Addressing Uncertainty with Learning (AUL): Big data analytics faces the
low veracity problem in which the data sets may have missing data in the data collection process.
This gives rise to the data uncertainty problem and there are recent studies proposing the
uncertainty-based learning strategy for big data analytics which employs methods like
fuzziness-based learning (Wang and He 2016).
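
To make the feature selection strategy described above concrete, here is a minimal sketch of one simple filter approach: rank candidate features by the absolute value of their correlation with a target and keep the top k. The feature names and values are hypothetical, and correlation ranking is only one of many possible selection criteria.

```python
import math
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(vx * vy)


def select_features(features: Dict[str, List[float]], target: List[float],
                    k: int) -> List[str]:
    """Keep the k feature names with the largest absolute correlation to the target."""
    ranked = sorted(features,
                    key=lambda name: abs(pearson(features[name], target)),
                    reverse=True)
    return ranked[:k]


target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "price": [5.0, 4.0, 3.0, 2.0, 1.0],     # perfectly negatively correlated
    "clicks": [1.0, 2.0, 2.0, 4.0, 5.0],    # strongly correlated
    "store_id": [3.0, 1.0, 3.0, 1.0, 3.0],  # uncorrelated with the target
    "constant": [7.0, 7.0, 7.0, 7.0, 7.0],  # no variation at all
}
selected = select_features(features, target, k=2)
```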

Scalability: If the computing systems (e.g., the analytical models or optimization methods) are
scalable, they are more versatile to cope with the need of big data analytics. As a result, it is
important to develop versatile computing systems which are flexible and scalable with respect to
computational power so that they can fulfill the requirements of big data analytics.

Heuristics: In the standard OM literature, for many problems which are difficult to solve in a
reasonable time (e.g., NP hard problems), we develop heuristics to try to find near-optimal solutions
by numerical methods, and then identify bounds. This approach is still applicable in big data
analytics to address the computational time issue.
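
The pairing of a heuristic solution with a bound can be sketched on the classical 0-1 knapsack problem, which is NP-hard. A greedy rule gives a feasible (not necessarily optimal) solution, and the fractional relaxation gives an upper bound on the optimum. The item data are hypothetical and item weights are assumed to be positive.

```python
from typing import List, Tuple

Item = Tuple[str, float, float]  # (name, value, weight); weights assumed positive


def greedy_knapsack(items: List[Item], capacity: float) -> Tuple[List[str], float]:
    """Heuristic: scan items by value/weight ratio and take each one that still fits."""
    chosen, value, remaining = [], 0.0, capacity
    for name, v, w in sorted(items, key=lambda it: it[1] / it[2], reverse=True):
        if w <= remaining:
            chosen.append(name)
            value += v
            remaining -= w
    return chosen, value


def fractional_bound(items: List[Item], capacity: float) -> float:
    """Upper bound: same ordering, but the first item that does not fit is split."""
    bound, remaining = 0.0, capacity
    for _, v, w in sorted(items, key=lambda it: it[1] / it[2], reverse=True):
        if w <= remaining:
            bound += v
            remaining -= w
        else:
            bound += v * remaining / w
            break
    return bound


items = [("A", 60.0, 10.0), ("B", 100.0, 20.0), ("C", 120.0, 30.0)]
chosen, heuristic_value = greedy_knapsack(items, capacity=50.0)
upper_bound = fractional_bound(items, capacity=50.0)
gap = (upper_bound - heuristic_value) / upper_bound  # worst-case relative gap
```

The optimal value always lies between the heuristic value and the bound, so the gap tells the analyst how far from optimal the heuristic could be without solving the problem exactly.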



Table 3.1. A summary of the major big data analytics strategies

Divide and Conquer: Break down the big data into multiple pieces and make them small enough to be
solved one by one, including the granular computing method.

DPP: Process data by multiple parallel and distributed computing in multiple processors.

ILNC: Improve the machine learning algorithms incrementally by using the new cases.

Statistical Inference: Learn about the relationship between samples and the population, and save
computation effort by processing samples instead of population.

Feature Selection: Select a subset from the big dataset to represent its core features.

AUL: Deal with missing data by fuzziness methods.

Scalability: Ensure the computing system is scalable to deal with big data challenges as “no size
fits all”.

Heuristics: Determine the near-optimal solutions and identify the bounds within time and memory
constraints.

4. Big Data in OM Studies

In this section, we review the OM studies related to big data. We classify this review into various OM
topical areas, based on the papers collected.

Forecasting: Among all OM topical areas supported by big data analytics, if we plan to choose
one to start with, “forecasting” is probably the most intuitive and direct one. Traditionally,
forecasting relies heavily on historical data, expert advice and market information. In the big data
era, we have more and more available sources of information, which potentially can enhance the
performance of forecasting8. In the literature, Baughman et al. (2016) report the IBM Global
Technology Services (GTS) team’s research in forecasting web traffic patterns. To be specific, at that
time, IBM’s current practice in terms of cloud platform resources allocation required the
participation of humans so as to meet the demand. The GTS team aims to make it automatic by
employing multiple analytical techniques such as simulation and numerical analysis tools. They
develop a system which can forecast the web traffic demand in near real time. Applying their system
to golf and tennis tournaments, the authors find that their system can reduce the cloud computing
resource by 50% and also save labor costs (because of the automatic nature of the system).

8 Notice that it is still controversial whether forecasting with big data really matters significantly as there is also
a high cost associated with big data computation and processing. See Nikolopoulos and Petropoulos (2017) for
a recent study.

Ferreira
et al. (2016) conduct a case study on an online retailer called Rue La La for its demand forecasting
using data-driven machine learning techniques. By having accurate demand forecasting, the
company can also optimize its pricing decisions, especially for the newly launched products that
have never been sold before. The authors also conduct a field experiment and illustrate that the
proposed demand forecasting system can help improve revenue by 9.7%. Liu et al. (2016) explore
how data (including texts, videos, images, audios, and numbers) from social media such as Twitter
can be used for forecasting. The authors combine techniques which include machine learning, data
mining and cloud computing in their analysis. They conduct experiments by using 400 billion
Wikipedia pages and two billion Tweets. They show that the information content and the timeliness
of the data sources (e.g., Twitter) are most critical to the forecasting performance. See-To and Ngai
(2016) make use of customer review data to conduct timely forecasting of demand distributions, i.e.
“nowcasting”. Based on data sets from the fashion industry, they test the proposed method. They
find that the proposed method can help visualize the key features of the demand distribution as well
as uncover how online customer comments can be used for demand forecasting. Chong et al. (2017)
develop a system to help predict consumer demand by using neural networks with the use of online
reviews and marketing data. Using data from Amazon.com, the authors investigate how the number
of reviews, volume of reviews, online discounts, free deliveries, etc. affect the product demand.
They develop a big data system and use Node.JS agents to examine the Amazon.com webpages, and
employ the obtained data for neural networks analysis. They provide scientific evidence to prove
that the incorporation of online reviews and other promotional factors can help predict product
demand. Cui et al. (2017) explore the daily sales forecasting for an online retailer with the use of big
data (including social media data). By using various machine learning methods, the authors show
that social media data can lead to a significant improvement in forecasting accuracy. They
highlight the importance of using social media information in sales forecasting. Lau et al. (2017)
study the design of big data analytics methodology with the goal of improving sales forecasting. The
authors propose the use of a parallel aspect-oriented sentiment analysis method for mining the
customer online comments on products. They also evaluate the sentiment-enhanced sales
forecasting approach using the co-evolutionary extreme learning machines. They test the proposed
methods using real big datasets and confirm that sales forecasting performance is improved. Most
recently, Sagaert et al. (2017) study the tactical sales forecasting by using “temporal big data”. By
conducting a case study on a supplier in the tire industry, the authors develop a new method which
can automate the identification of critical factors which are related to sales. By aggregating data
across markets and regions, the new method can yield more accurate forecasts compared to the
existing method in the case study company.

Inventory Management: Inventory control is a critical topic in OM. In the literature, Huang and
Van Mieghem (2014) adopt the statistical approach to explore clickstream data in inventory control
problems. The authors explore a problem in which the retailers feature products online but they
take orders in stores (i.e. offline). By analyzing the empirical click and order data, the authors
develop via dynamic programming a decision support model. They also show empirically that the
clickstream data is statistically significant to predict the timing and amount of orders offline. They
report a computational study showing that their proposed decision support model can reduce
inventory holding cost by 3% and inventory backordering cost by 5%. Van Jaarsveld and Scheller-Wolf
(2015) develop a stochastic programming based algorithm for inventory management in an
industrial-scale assemble-to-order system. Due to the problem’s large-scale nature, it is a big data
related optimization problem in inventory control. The authors consider a continuous time model
and derive the optimal base-stock policies. They reveal that the first-come first-served policy in
component allocation performs reasonably well, and further demonstrate that the no-holdback
allocation policies outperform the first-come first-served policy. Recently, Bertsimas et al. (2016)
employ a data-driven optimization technique called conditional stochastic optimization to explore
inventory control with big data. The authors make use of four-year point-of-sales and inventory data
across the retail network in multiple locations of a retail company. They use Google Geocoding API
to obtain the specific coordinates of store locations and employ the search engine Google’s search
query volume to understand the market attention paid to different items. Altogether, by combining
all sources of information, they decide the optimal inventory management scheme for the retail
network.

Revenue Management and Marketing: Big data is important in marketing, revenue
management and some service operations. In the literature, a couple of insightful discussion papers
have been published in recent years. For instance, Rust and Huang (2014) discuss how big data would
revolutionize service research and transform marketing science studies. Aloysius et al. (2016) discuss
the applications of big data in retail environments. Tarvin et al. (2017) report a real case study on
how a company faces big data challenges. For technical research, by reviewing analytical models on
the “passenger name record” of booking cancellations, Morales and Wang (2010) explore via data
mining a way to improve service-booking revenue management. The authors identify a set of

This article is protected by copyright. All rights reserved.


variables that can help describe the booking cancellation behaviors of customers. This helps
managers better understand why some customers choose to cancel bookings. Noting that peer
influence is critical in marketing for exploring product demand and promoting the right social change,
Aral and Walker (2014) conduct experiments on social media (Facebook) by randomly manipulating
messages sent by users of a certain function. The authors identify the effects of tie strengths and
embeddedness. Their findings illustrate how social analytics can be employed to improve marketing
operations. Lu et al. (2016) propose a video-based automated recommender system for shoppers of
garment products. The system is scalable and flexible. The authors employ video-based data
collection and computer vision technologies to gain insights into consumer preferences. To be
specific, the system keeps track of shoppers’ behavior in stores and compares it with focal
customer information, which is known and stored in the database (including preferences, purchasing
behavior, etc.). The authors conduct an empirical study to show the application of the proposed
video-based automated recommender system and find that it is highly applicable. Culotta and Cutler
(2016) develop a method to automatically generate brand perception ratings by mining data from
Twitter. The authors conduct their analysis using over 200 brands and various brand perception
attributes, and compare the inference results with survey data. Their proposed
method is also flexible and scalable enough to cope with big data challenges. Xue et al. (2016) investigate
the optimal pricing for personalized bundles. The authors employ historical sales data to estimate
utility functions for different market segments. They identify the optimal pricing policy and test it
using empirical data from an information technology service provider. The authors find that the
data-driven pricing policy is highly effective and can improve personalized bundle pricing
significantly. Finally, Mukherjee and Sinha (2017) study product recall decisions for medical
devices. The authors examine large volumes of highly unstructured data from user-generated reports on
the negative events associated with medical devices. They find that negative user feedback
tends to reflect over-reaction. They identify the sources of judgment bias in medical device recall
decisions and propose ways to improve these decisions by using a big data analytics approach.

Transportation: Transportation, including traffic control, is a pertinent topic in OM. In the
literature, Lv et al. (2015) employ a deep learning approach to study traffic flow predictions with big
traffic flow data. The authors derive a new method to consider the temporal and spatial correlations
together, and develop a stacked auto-encoder, trained by a greedy layer-wise approach, to learn the
traffic flow features. They report promising experimental results. Shang et al. (2017) explore the
cargo logistics risk (CLR), defined as the deviation of the actual arrival time from the planned arrival
time, by using Bayesian statistics. The authors focus on a flexible estimation of the conditional
density function of the CLR by making use of big air cargo data. Their findings help logistics
companies to differentiate the sources of CLR, which can be recurrent or just from disruptions.
Chung et al. (2017) conduct an analytical study on flight delays using a big data set from a major
airline in Hong Kong covering its flight information at 112 airports around the world. The
authors propose the use of cascading neural networks to improve flight schedule forecasting and
then apply it in aircrew pairing optimization problems. Their computational results show that the
new method increases forecasting accuracy on flight delays, which leads to a substantial
improvement in crew pairing performance. They further propose a dynamic reserve crew strategy,
which yields a significant reduction in operational costs. Xie et al. (2017) examine the deployment of
big data analytics to study accidents in logistics networks. They develop a new grid-based
cell-structured framework, which can make use of big datasets from transit counter turnstiles,
taxis, and even social media at the same time. The data are analyzed to produce
indicators for pedestrian crash models. The authors construct a model that associates the
grid-based cell-specific risk factors with traffic crash costs. The authors argue that big data analytics
can give a more precise estimation of the related risk factors, which helps to identify traffic crash
hotspots for the implementation of pro-active measures. Jamshidi et al. (2017) develop a new
method that assesses the likelihood of rail failures by exploring rail surface defects. The
authors employ video camera records to identify rail surface defects. They make use of an intelligent
image processing method to collect big data on rail surface defects, which include the measurable
lengths of these defects. The authors conduct a real case study on a Dutch railway and report very
satisfactory performance of their proposed rail failure assessment system.

Supply Chain Management: Big data has a huge influence on supply chain and logistics
management. It was predicted early that big data analytics would revolutionize supply chain design
(Waller and Fawcett 2013) and may change product lifecycle management in the supply chain (Li et
al. 2015). Big data analytics also affects the optimization of service parts in after-sales operations
management (Boone et al. 2016). In the literature, Wang, Gunasekaran and Ngai (2016) study a
distribution network design optimization problem with the use of big data. The authors consider the
situation in which the supply chain planner can use big data to determine the optimal number of
distribution centers and assign customers to them. They employ a mixed-integer programming
approach and conduct simulation studies to illustrate the performance of the optimization model.
Kaur and Singh (2017) propose a mixed-integer nonlinear programming model to address
environmentally sustainable procurement and logistics operations in supply chains. Owing to the
problem’s complexity and the need to deal with big data and real time analysis, the authors develop
a heuristic. Tests on randomly generated data instances show that the
heuristic performs well. Papadopoulos et al. (2017) make use of unstructured big data coming from
social media as well as structured data to develop a framework to explain supply chain resilience
towards disasters. In addition to the big data analysis, the authors also conduct a quantitative survey
and report their statistical findings. The authors find that information sharing and public-private
relationships are important enablers of supply chain resilience towards disasters. Li et al. (2016)
conduct an exploratory customer demand analysis in supply chains with e-commerce. The authors
reveal how demand chain management can enhance supply chain management by using website
data and applying data analytics. The authors argue that demand chain management matches well
with big data and e-commerce and they together can bring significant benefits to the supply chain.
Badiezadeh et al. (2017) employ data envelopment analysis (DEA) to assess the sustainability of supply
chain systems in the presence of big data. The authors propose a “double frontier network DEA” to
help calculate the efficiency of a multi-stage process. They present a case study to illustrate how
their proposed method can employ data to help rank supply chains by their
sustainability scores. We note that there are also some analytical economic studies on supply chain
systems that consider big data environments. They include a study on supply chain
coordination with investment decisions in the big data era (Liu and Yi 2017) and an investigation of
the impacts of social media on the performance of quick response systems in the big data environment
(Choi 2017).
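
To make the distribution network design problem mentioned above more concrete, the following Python sketch chooses which candidate distribution centers to open and assigns each customer to its cheapest open center. It solves a tiny uncapacitated instance by plain enumeration rather than by the mixed-integer programming approach used in the literature, and it is not the model of Wang, Gunasekaran and Ngai (2016); the function name and all cost figures are assumptions made for illustration.

```python
from itertools import combinations


def best_distribution_centers(fixed_cost, ship_cost):
    """Brute-force uncapacitated facility location on a tiny instance.

    fixed_cost[j]: cost of opening candidate center j.
    ship_cost[i][j]: cost of serving customer i from center j.
    Returns (total_cost, open_centers, assignment).
    """
    centers = range(len(fixed_cost))
    best = None
    # Enumerate every non-empty subset of candidate centers.
    for size in range(1, len(fixed_cost) + 1):
        for open_set in combinations(centers, size):
            # Each customer is served by its cheapest open center.
            assignment = [min(open_set, key=lambda j: row[j]) for row in ship_cost]
            total = sum(fixed_cost[j] for j in open_set)
            total += sum(row[j] for row, j in zip(ship_cost, assignment))
            if best is None or total < best[0]:
                best = (total, open_set, assignment)
    return best


# Invented costs: three candidate centers, four customers.
fixed_cost = [50, 120, 40]
ship_cost = [
    [10, 40, 60],
    [20, 30, 50],
    [60, 20, 15],
    [70, 25, 10],
]
total, open_centers, assignment = best_distribution_centers(fixed_cost, ship_cost)
print(total, open_centers, assignment)
```

Enumeration is only workable for a handful of candidate sites because the number of subsets doubles with every additional candidate, which is one reason why the reviewed studies rely on mathematical programming and heuristics.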

Risk Analysis: Risk analysis includes activities such as risk assessment, risk monitoring, and risk
control. Undoubtedly, risk analysis, for both business operations (Choi, Chan and Yue 2017) and
non-profit organizations (such as governments), would benefit from the proper use of big data. In the
literature, Allodi and Massacci (2017) study the cyber-crime problem by using big data. The authors
develop a quantitative scheme for assessing cyber security risk with data from security centers.
Their proposed scheme can give quantitative probability estimates to help fight untargeted
cyber-attacks on the organization. The authors conduct an analysis using real data from a
financial institution to show that their proposed big data risk assessment scheme is effective. Biffis
and Chavez (2017) use a data mining approach to show how to mine satellite big data to yield
weather indices. The indices are critical for weather risk management and have an impact on the
agricultural food industry. The authors develop a data-driven risk transfer scheme. They conduct a
real case study by exploring Mozambique’s maize production. The authors illustrate how weather
data on rainfall and temperature can be used to create risk profiles. They argue that their
proposed framework can lead to a cost saving (from insurance) of 30%. Lopez-Cuevas et al. (2017)
propose a new analytical framework to study “mood” as a proxy of behavior, and reveal how
disruptive events may affect different populations in the presence of risk. In the proposed
framework, the authors first illustrate the mechanism to employ big data from different social media
sources to learn about behavior. They then develop visual analytics to compare the internal and
external behavior of people in a community. Finally, the framework helps to determine the events
that are treated as disruption sources. The authors demonstrate the applicability of the proposed
framework by applying it to several groups with real social network situations. Lorca et al. (2017)
develop a decision support system to support post-disaster debris and waste management. The
authors focus on post-disaster operations such as collecting, delivering, reducing and recycling
the debris. Their proposed decision support system can optimize and balance multiple costs (e.g.,
the environmental and financial expenses). The authors show the use of the decision support system
through a case analysis of Hurricane Andrew.

Others: Big data analytics is also employed in various other domains such as healthcare and
retailing. Interested readers can refer to Guha and Kumar (2017) and Fisher and Raman (2017) for
more discussions.

5. Mapping Big Data Analytics Methods to OM

In the above sections, we have examined various big data techniques, strategies, and studies in
the literature. In this section, we combine these results to explore how different analytical techniques
and big data architectures map onto the examined OM topical areas.

In fact, it is known that good big data analytics and applications require more than just the proper
deployment of techniques and strategies. In particular, the design of the complete big data architecture is
critical (see Chen and Zhang 2014). From the papers reviewed above, we have identified four
generic big data architectures [9] (denoted by BDA 1, BDA 2, BDA 3, and BDA 4), which we present
in the Appendix. To be specific, BDA 1 represents the architecture for batch processing.
Under BDA 1, data sources are collected by the software agents in the workstation. Strategies Z with
batch processing are adopted and linked with the corporate database. Analytic techniques Y are
employed to generate the output and also update the corporate database. BDA 2 is rather similar to
BDA 1 except that the focus is on real time processing and Strategies Z have to support real-time
stream processing. This also calls for real time deployment of the analytic techniques Y to generate
the output and update the corporate database. BDA 3 is a hybrid architecture which combines BDA 1
and BDA 2. Finally, BDA 4 is the most complex architecture because it has to deal with multiple data
sources, including those generated from multiple architectures M and other sources X.

[9] Note that there are some subtle differences in terms of, e.g., the specific platforms adopted, and some
companies have multiple databases. In the four proposed generic big data architectures, we focus on the
operational perspective and highlight the specific data sources, techniques and strategies adopted in each
architecture.
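
The difference between the batch orientation of BDA 1 and the stream orientation of BDA 2 can be illustrated with a toy statistic. In the Python sketch below, the batch function recomputes an average from the full stored dataset, whereas the streaming class updates the same average one record at a time without storing past records. The names batch_mean and RunningMean and the sales figures are hypothetical, and the sketch is not part of the architectures presented in the Appendix.

```python
class RunningMean:
    """Stream-style estimate: updated as each record arrives, no history kept."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental update: shift the mean towards the new value by 1/count.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def batch_mean(records):
    """Batch-style estimate: recomputed from the complete stored dataset."""
    return sum(records) / len(records)


# Invented sales records arriving over time.
sales = [12.0, 15.0, 9.0, 14.0]

stream = RunningMean()
for record in sales:
    stream.update(record)  # an up-to-date estimate is available after each record

print(batch_mean(sales), stream.mean)
```

Both routes arrive at the same value once all records have been seen. The streaming version can report an estimate after every record, which is what real time processing requires, while the batch version needs the complete dataset before it can run.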
Table 5.1. A summary of big data methods (techniques, strategies, architectures) and data
sources being used in the reviewed papers

| Areas | Papers | Big Data Techniques Y | Big Data Strategies Z | Big Data Architectures (if specified) | Data Sources X | Real Cases Involved (if any) |
|---|---|---|---|---|---|---|
| Forecasting | Baughman et al. (2016) | Discrete event simulation, statistics, feature selection algorithms, optimization | Feature selection, ILNU, distributed and parallel processing, statistics | BDA 4 | Web analytics, social analytics (real data from social media and web pages) | IBM |
| | Liu et al. (2016) | Machine learning, data mining | Combining multiple techniques | BDA 4 | Social analytics, text analytics, web analytics (real data from social media and web pages) | |
| | See-To and Ngai (2016) | Statistics | Statistical inference | | Common data analytics [10] | Real data from fashion companies on a major Chinese e-commerce platform |
| | Ferreira et al. (2016) | Machine learning, optimization | Statistical inference | BDA 4 | Web analytics, ERP system | An online retailer Rue La La |
| | Chong et al. (2017) | Machine learning | Distributed and parallel processing | BDA 4 | Web analytics (using real data obtained from web crawling – Amazon.com) | |
| | Cui et al. (2017) | Machine learning | Combining multiple techniques | | Social analytics | An online apparel retailer |
| | Lau et al. (2017) | Machine learning (extreme learning machines) | Combining multiple techniques, parallel processing | BDA 4 | Social analytics | |
| | Sagaert et al. (2017) | Statistics | Statistical inference | BDA 1 | Common data analytics | A major supplier in the tire industry |
| Inventory Management | Huang and Van Mieghem (2014) | Optimization, statistics | Statistical inference | BDA 2 | Web analytics | |
| | Van Jaarsveld and Scheller-Wolf (2015) | Optimization | Heuristics | | Standard numerical data analysis | |
| | Bertsimas et al. (2016) | Optimization | Statistical inference | BDA 1 | Web analytics | Sales data from a retail company |
| Revenue Management and Marketing | Morales and Wang (2010) | Data mining | Statistical inference | | Common data analytics | Real reservation record dataset from a hotel chain in the UK |
| | Aral and Walker (2014) | Statistics | Statistical inference | | Social analytics (Facebook) | |
| | Lu et al. (2016) | Computer vision technologies, data mining | Scalability | BDA 3 | Video/graphic/image analytics | A garment retailer |
| | Culotta and Cutler (2016) | Data mining | Scalability | BDA 3 | Social analytics (Twitter) | |
| | Xue et al. (2016) | Optimization, statistics | Statistical inference | | Common data analytics | An information technology service provider |
| | Mukherjee and Sinha (2017) | Machine learning, optimization | Statistical inference | BDA 1 | Unstructured big data | The medical device industry |
| Transportation | Lv et al. (2015) | Machine learning | Heuristics | BDA 1 | Common data analytics | |
| | Shang et al. (2017) | Statistics | Statistical inference | | Common data analytics | Real air cargo data from a leading forwarder |
| | Chung et al. (2017) | Machine learning, optimization | Statistical inference | BDA 1 | Common data analytics | A leading Hong Kong airline |
| | Xie et al. (2017) | Statistics | Statistical inference | BDA 1 | Common data analytics (from multiple sources) | Manhattan city |
| | Jamshidi et al. (2017) | Machine learning | Heuristics | BDA 4 | Common data analytics | Dutch railway network |
| Supply Chain Management | Wang et al. (2016) | Optimization | Statistical inference | BDA 1 | Common data analytics (from multiple sources) | |
| | Li et al. (2016) | Statistics | Statistical inference | | Web analytics | |
| | Badiezadeh et al. (2017) | Data envelopment analysis | Heuristics | | Common data analytics | Nine Iranian tomato paste supply chains |
| | Kaur and Singh (2017) | Optimization | Heuristics, statistical inference | BDA 1 | Common data analytics | |
| | Papadopoulos et al. (2017) | Statistics | Statistical inference | | Web analytics, and survey data | Nepal disaster relief operations |
| Risk Analysis | Allodi and Massacci (2017) | Statistics | Statistical inference | BDA 2 | Common data analytics | A financial institution |
| | Biffis and Chavez (2017) | Data mining | Statistical inference | BDA 2 | Common data analytics | Maize production in Mozambique |
| | Lopez-Cuevas et al. (2017) | Statistics | Statistical inference | BDA 3 | Social analytics (Twitter data) | |
| | Lorca et al. (2017) | Statistics | Statistical inference | BDA 4 | Web analytics (including online map data) | Hurricane Andrew |

[10] The term “common data analytics” refers to the case when the data are structured and given in numerical
values.

Table 5.1 shows how these four big data architectures, with the respective big data techniques
Y, big data strategies Z and data sources X, would fit into different big data analytics models in the
examined papers. From Table 5.1, we have the following findings:

1. Big Data Techniques: Data mining and machine learning techniques are widely used in OM
studies in forecasting, revenue management and marketing, and transportation management.
They are also used in risk analysis. This observation reflects the fact that these OM topical areas
involve complex data patterns, which require more versatile techniques such as machine
learning and data mining to explore. Optimization is the standard technique for inventory
management, and is also commonly used in supply chain management. This is expected because
analytical optimization models are well established in inventory management (e.g., the base
stock policy) and supply chain management. Even in the presence of big data, researchers are very
likely to consider the application of optimization techniques to solve these problems. Statistics,
as the most fundamental technique for data analysis, is present in almost all
examined OM topical areas.

2. Big Data Strategies and Using the Multi-Methodological Approach: For the majority of studies,
statistical inference is the strategy adopted to deal with the big data problem. This suggests
that the respective studies are in fact exploring relatively simple big data
problems. For some other studies, heuristics and scalability are two important strategies to deal
with the big data problem. This point is also intuitive because heuristics help to identify a
near-optimal solution via numerical algorithms within the given time constraints. The heuristics
strategy is hence a measure for “feasibility” in many cases. The scalability strategy is
especially important for dealing with big data challenges. We expect that more OM studies will
consider this strategy and factor it into their analysis in the presence of big data. Finally, note that
most real world applications of big data analytics (such as the IBM project reported by
Baughman et al. (2016)) adopt a multi-methodological approach (Choi et al. 2016) in their
strategy to cope with the big data challenge. This provides another reason why
researchers should consider multiple methods in conducting OM research in the big data era. It
is critical to combine multiple methods to develop the “big data analytics framework”.

3. Big Data Architectures: In OM studies, the use of batch processing is still popular and common.
This is consistent with the observation that most studies in Table 5.1 are based on batch
processing. However, as real time processing is critical for risk analysis, we do see that more
applications in that area are associated with it. Moreover, we see that the use of BDA 4 (combining multiple
big data architectures) and BDA 3 (combining real time stream processing and batch
processing) is quite common.

4. Data Sources: Social media data and web data are very commonly used to conduct studies in
the big data era due to their “public-data” nature. As such, web analytics and social analytics
have been widely observed in the reviewed OM studies. In addition, most reviewed OM studies
still use the common data analytics method, which refers to analysis based on
structured datasets with numerical data points. This makes the analysis easier but does not
fully realize the true nature of big data, namely a large variety of data formats and data
sets.

5. Real Cases Based Studies: It is encouraging to see that many reviewed papers report real case
studies. This is an important feature of big data based studies because we have to use real
world relevant data to conduct experiments and analyses. We expect this trend to continue and,
hopefully, more real case based OM studies on big data analytics will appear in the future.
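
The observation on architectures can be checked with a simple tally. The Python sketch below transcribes the architecture column of Table 5.1 area by area, in the order in which the papers are listed, and counts how often each architecture appears. None marks a paper for which no architecture is specified; the variable and function names are illustrative.

```python
from collections import Counter

# Architecture entries transcribed from Table 5.1; None means "not specified".
table_5_1 = {
    "Forecasting": ["BDA 4", "BDA 4", None, "BDA 4", "BDA 4", None, "BDA 4", "BDA 1"],
    "Inventory Management": ["BDA 2", None, "BDA 1"],
    "Revenue Management and Marketing": [None, None, "BDA 3", "BDA 3", None, "BDA 1"],
    "Transportation": ["BDA 1", None, "BDA 1", "BDA 1", "BDA 4"],
    "Supply Chain Management": ["BDA 1", None, None, "BDA 1", None],
    "Risk Analysis": ["BDA 2", "BDA 2", "BDA 3", "BDA 4"],
}


def tally_architectures(table):
    """Count how often each architecture label appears across all areas."""
    counts = Counter()
    for architectures in table.values():
        for label in architectures:
            counts[label if label is not None else "not specified"] += 1
    return counts


counts = tally_architectures(table_5_1)
for label in sorted(counts):
    print(label, counts[label])
```

With the entries of Table 5.1, the tally gives 8 papers under BDA 1, 3 under BDA 2, 3 under BDA 3 and 7 under BDA 4, while 10 of the 31 papers do not specify an architecture. Two of the three BDA 2 entries belong to risk analysis.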

6. Real World Cases and Applications

In order to explore real world applications of big data analytics in operations, we conduct a case
study in this section. We focus on the world’s top enterprises in this case study because they
have the resources needed to develop and deploy big data analytics and because relatively
more public information about them is available for our research. Findings from the case study can provide
guidance to managers and organizations on the proper deployment of big data analytics.

To be specific, we follow the list of the world’s 100 most valuable brands from Forbes.com, and
identify the highest-ranking one in each category for further exploration [11]. The result is shown in
Table 6.1. After identifying these giant enterprises, we continue our search by focusing our attention
on each company’s website, annual reports and the news from Forbes.com. The results are
summarized in Table 6.2 and some supplementary details are available from the authors upon
request.

Table 6.1. Most valuable branded company in each industrial category (from Forbes.com 2017)

| Industrial Category | Company | Ranking |
|---|---|---|
| Technology | Apple | 1 |
| Beverages | Coca-Cola | 5 |
| Leisure | Disney | 7 |
| Automotive | Toyota | 8 |
| Restaurants | McDonald’s | 9 |
| Apparel | Nike | 16 |
| Luxury | Louis Vuitton | 20 |
| Alcohol | Budweiser | 22 |
| Financial Services | American Express | 23 |
| Retail | Walmart | 24 |
| Heavy equipment | Caterpillar | 82 |

[11] The categorization follows the one shown on Forbes’ webpage
[https://www.forbes.com/powerful-brands/list/ (accessed 18 September 2017)]. We do not include
brands that are categorized as “diversified” (e.g., GE, Siemens, BASF, and Philips) or that have very limited
information, as well as some big data related service providers.
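
The selection rule used for Table 6.1, namely keeping the highest-ranking brand in each category, can be sketched in a few lines of Python. The first three winning entries below are taken from Table 6.1, whereas the two entries labeled as hypothetical are placeholders added only to show the rule at work; the function name is also an assumption.

```python
def top_brand_per_category(brands):
    """Keep, for each category, the brand with the smallest (best) ranking."""
    best = {}
    for name, category, rank in brands:
        if category not in best or rank < best[category][1]:
            best[category] = (name, rank)
    return best


brands = [
    ("Apple", "Technology", 1),                      # from Table 6.1
    ("Hypothetical Tech Brand", "Technology", 2),    # placeholder, not real data
    ("Hypothetical Drinks Brand", "Beverages", 30),  # placeholder, not real data
    ("Coca-Cola", "Beverages", 5),                   # from Table 6.1
    ("Disney", "Leisure", 7),                        # from Table 6.1
]

selected = top_brand_per_category(brands)
for category, (name, rank) in selected.items():
    print(category, name, rank)
```

The exclusions described in the footnote (diversified brands, brands with very limited information, and some big data related service providers) would be applied as a filter on the input list before this step.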

Table 6.2. Big data applications in the case study companies

| Company | Big Data Technologies Employed | Major Areas |
|---|---|---|
| Apple | Mobile analytics; Hadoop | Revenue management and marketing: new products design; new service-bundle-products development |
| Coca-Cola | Mobile analytics; social analytics; AI; image recognition; augmented reality | Revenue management and marketing: new products design (e.g., tastes); new customization service; bottle packaging |
| Disney | Machine learning | Revenue management and marketing: customer experience, customization, park operations |
| Toyota | AI (robotics) | Revenue management and marketing: new product design, and pricing; new service development |
| McDonald’s | Mobile analytics; mobile computing (iBeacon) | Revenue management and marketing: marketing campaigns and promotion; membership scheme |
| Nike | Machine learning, mobile computing | Demand forecasting; Manufacturing; Revenue management and marketing: new product design, and pricing; new service development (e.g., speedy customization) |
| Louis Vuitton | Social media analytics | Revenue management and marketing: real time fashion show; product pricing |
| Budweiser | Virtual reality; AI | Revenue management and marketing: new product design, and customer experience |
| American Express | Machine learning | Revenue management and marketing: customer experience, new service development; Risk management |
| Walmart | Mobile computing, AI, facial recognition | Inventory management: auto-replenishment; Revenue management and marketing: pricing, customer services, visual merchandising; Store operations: auto-check out |
| Caterpillar | Machine learning, mobile computing | Operations: optimization of resource allocation; Revenue management and marketing: use of power and fuel |

From Table 6.2, it is clear that all of these large enterprises have used big data analytics for
revenue management and marketing activities. This is intuitive as big data from the market,
including consumers, provides a valuable source of information for these enterprises to
improve their business operations and marketing activities such as product offering, new product
development, market segmentation and pricing. In addition, big data analytics and applications are
also commonly seen in many contemporary business models such as customized service and individual
product offering. Other important activities in which big data analytics plays a critical role in practice
include demand forecasting and inventory replenishment and management.

7. Concluding Remarks and Future Research Opportunities

There is no doubt that we are now in the big data era. Big data analytics is critical in all kinds of
organizations and enterprises. OM, as a field which focuses on the optimal use of resources to
improve efficiency and effectiveness of operations, should also take the opportunity to develop itself
to work well with big data.

In this paper, we have reviewed various existing big data related analytics techniques. To be
specific, we have highlighted the importance of statistics, machine learning, data mining, and
optimization models for supporting big data analytics. Their strengths and weaknesses have
been examined and compared, and their major functionalities have been studied. Then, we have
introduced and discussed various big data analytics strategies such as divide and conquer,
distributed and parallel processing, incremental learning and statistical inference. Their core features
have been concisely investigated. After that, we have reviewed the related literature and
reported how big data analytics has been applied in topical areas such as forecasting, inventory
management, revenue management and marketing, transportation management, supply chain
management and risk analysis. We have proposed and developed different kinds of big data
architectures. We have further revealed how different types of big data techniques, strategies and
architectures can be applied to these OM topical areas. Finally, by exploring publicly available
information on how large scale enterprises use big data analytics, we have uncovered further
insights into the real world applications of big data analytics. We believe that these findings are
valuable to both practitioners and academics who are interested in how big data analytics can be
used in OM.

From our exploration, we have identified a few promising areas that can be studied in the
future:

1. Optimal choices of big data analytics techniques and strategies: From the above analysis, both
companies in the real world and academic studies have used many different kinds of big data
techniques and strategies in operations. However, are they using the best techniques and
strategies? How can the best techniques and strategies be determined? These are fundamental
questions which have not been well answered. They hence deserve deeper exploration and
further studies.

2. Big data architectures: In this paper, we have proposed four different categories of big data
architectures (see the Appendix). Although they try to capture the most essential real world
elements and simplify the picture, these architectures are far from being perfect and
comprehensive. For example, the architecture BDA 4 can be further classified into many different
forms. Future research can hence be conducted to further identify different kinds of big data
architectures, and a taxonomy can be established. In addition, how to determine the optimal
architectures for different real world scenarios is also an interesting question to explore.

3. Application areas: From the review and real practice examination, “revenue management and
marketing” is a popular area in which big data analytics and the related tools have been applied
extensively. However, from the analysis and review above, there are relatively few published
papers and real world enterprises focusing on supply chain management with big data
applications. Thus, supply chain management is definitely an under-explored area. A likely reason
is that big data analytics for supply chain management is
challenging because it requires multiple supply chain members to work closely together on the
use of big data. Thus, in future research, it will be promising and challenging to investigate how
big data analytics can be applied to critical issues such as strategic partnership and channel
coordination in supply chain systems.

4. Real world issues: In this paper, we have studied many real world applications of big data
analytics, especially in large-scale enterprises. On the one hand, these studies are introductory and
not deep enough. In the future, more in-depth case studies can be conducted to reveal more
insights regarding their applications of big data analytics. On the other hand, the use of big data
analytics is associated with many social issues such as data privacy and threats to human and social
welfare (e.g., the emergence of artificial intelligence). These should also be studied in the
future so that proper rules can be imposed to ensure that the use of big data analytics is ethically
sound and contributes positively to society.

Acknowledgements:

We are grateful to the Editor in Chief, Professor Kalyan Singhal, for his great support and important
advice on this paper. Tsan-Ming Choi’s research is partially supported by The Hong Kong Polytechnic
University (Grant Number: G-YBGR). Yulan Wang’s research is partially supported by The Hong Kong
Polytechnic University (Grant Number: G-YBQR).

References

Agarwal, R., V. Dhar. 2014. Editorial – big data, data science, and analytics: The opportunity and
challenge for IS research. Information Systems Research 25(3) 443-448.
Ahmed, E., I. Yaqoob, I. Hashem, I. Khan, A. Ahmed, M. Imran, A.V. Vasilakos. 2017. The role of big
data analytics in internet of things. Computer Networks 129(2) 459-471.
Ale, B. 2016. Risk analysis and big data. Safety and Reliability 36(3) 153-165.
Allodi, L., F. Massacci. 2017. Security events and vulnerability data for cyber security risk. Risk
Analysis 37(8) 1607-1627.
Aloysius, J.A., H. Hoehle, S. Goodarzi, V. Venkatesh. 2016. Big data initiatives in retail environments:
Linking service process perceptions to shopping outcomes. Annals of Operations Research.
Forthcoming.
Aral, S., D. Walker. 2011. Creating social contagion through viral product design: A randomized trial
of peer influence in networks. Management Science 57(9) 1623-1639.
Aral, S., D. Walker. 2012. Identifying influential and susceptible members of social networks. Science
337(6092) 337-341.
Aral, S., D. Walker. 2014. Tie strength, embeddedness, and social influence: A large-scale networked
experiment. Management Science 60(6) 1352-1370.
Arunachalam, D., N. Kumar, J.P. Kawalek. 2017. Understanding big data analytics capabilities in
supply chain management: Unravelling the issues, challenges and implications for practice.
Transportation Research – Part E. Forthcoming.
Badiezadeh, T., R.F. Saen, T. Samavati. 2017. Assessing sustainability of supply chains by double
frontier network DEA: A big data approach. Computers and Operations Research. Forthcoming.
Banerjee, S., S. Sanghavi, S. Shakkottai. 2016. Online collaborative filtering on graphs. Operations
Research 64(3) 756-769.
Baughman, A.K., R. Bogdany, B. Harrison, B. O'Connell, H. Pearthree, B. Frankel, C. McAvoy, S. Sun, C.
Upton. 2016. IBM predicts cloud computing demand for sports tournaments. Interfaces 46(1)
33-48.
Bertsimas, D., N. Kallus, A. Hussain. 2016. Inventory management in the era of big data. Production
and Operations Management 25(12) 2002-2013.
Biffis, E., E. Chavez. 2017. Satellite data and machine learning for weather risk management and
food security. Risk Analysis 37(8) 1508-1520.
Boone, C.A., B.T. Hazen, B. Skipper, R.E. Overstreet. 2016. A framework for investigating
optimization of service parts performance with big data. Annals of Operations Research.
Forthcoming.

Cerchiello, P., P. Giudici. 2016. Big data analysis for financial risk management. Journal of Big Data
3:18 (12 pages).
Chen, C.L.P., C.Y. Zhang. 2014. Data-intensive applications, challenges, techniques and technologies:
A survey on big data. Information Sciences 275 314-347.
Chen, H., R.H.L. Chiang, V.C. Storey. 2012. Business intelligence and analytics: From big data to big
impact. MIS Quarterly 36(4) 1165-1188.
Choi, T.M. 2017. Incorporating social media observations and bounded rationality into fashion quick
response supply chains in the big data era. Transportation Research - Part E. Forthcoming.
Choi, T.M., H.K. Chan, X. Yue. 2017. Recent development in big data analytics for business operations
and risk management. IEEE Transactions on Cybernetics 47(1) 81-92.
Choi, T.M., T.C.E. Cheng, X. Zhao. 2016. Multi-methodological research in operations management.
Production and Operations Management 25(3) 379-389.
Choi, T.M., J. Gao, J.H. Lambert, C.K. Ng, J. Wang. 2017. Optimization and Control for Systems in the
Big-Data Era: Theory and Applications, New York: Springer.
Choi, T.M., J.H. Lambert. 2017. Advances in risk analysis with big data. Risk Analysis 37(8) 1435-1442.
Chong, A.Y.L., E. Ch’ng, M. J. Liu, B. Li. 2017. Predicting consumer product demands via big data: The
roles of online promotional marketing and online reviews. International Journal of Production
Research 55(17) 5142-5156.
Chung, C.H., H.L. Ma, H.K. Chan. 2017. Cascading delay risk of airline workforce deployments with
crew pairing and schedule optimization. Risk Analysis 37(8) 1443-1458.
Cui, R., S. Gallino, A. Moreno, D. J. Zhang. 2017. The operational value of social media information.
Production and Operations Management. Forthcoming.
Culotta, A., J. Cutler. 2016. Mining brand perceptions from Twitter social networks. Marketing
Science 35(3) 343-362.
Daneshmand, A., F. Facchinei, V. Kungurtsev, G. Scutari. 2015. Hybrid random/deterministic parallel
algorithms for convex and nonconvex big data optimization. IEEE Transactions on Signal
Processing 63(15) 3914-3929.
Ding, X., Y. Tian, Y. Yu. 2016. A real-time big data gathering algorithm based on indoor wireless
sensor networks for risk analysis of industrial operations. IEEE Transactions on Industrial
Informatics 12(3) 1232-1242.
Drosou, M., H.V. Jagadish, E. Pitoura, J. Stoyanovich. 2017. Diversity in big data: A review. Big Data
5(2) 73-84.
Facchinei, F., G. Scutari. 2015. Parallel selective algorithms for nonconvex big data optimization. IEEE
Transactions on Signal Processing 63(7) 1874-1889.

Fahad, A., N. Alshatri, Z. Tari, A. Alamri, I. Khalil, A. Y. Zomaya, S. Foufou, A. Bouras. 2014. A survey
of clustering algorithms for big data: Taxonomy and empirical analysis. IEEE Transactions on
Emerging Topics in Computing 2(3) 267-279.
Fan, S., X. Li, J.L. Zhao. 2017. Collaboration process pattern approach to improving teamwork
performance: A data mining based methodology. INFORMS Journal on Computing 29(3) 438-456.
Feng, Q., G. Shanthikumar. How research in production and operations management may evolve in
the era of big data. Production and Operations Management. Forthcoming.
Ferreira, K.J., B.H.A. Lee, D. Simchi-Levi. 2016. Analytics for an online retailer: Demand forecasting
and price optimization. Manufacturing and Service Operations Management 18(1) 69-88.
Fisher, M., A. Raman. Using data and big data in retailing. Production and Operations Management.
Forthcoming.
Guha, S., S. Kumar. 2017. Emergence of big data research in operations management, information
systems, and healthcare: Past contributions and future roadmap. Production and Operations
Management. Forthcoming.
Hazen, B.T., J. B. Skipper, J. D. Ezell, C. A. Boone. 2016. Big data and predictive analytics for supply
chain sustainability: A theory-driven research agenda. Computers and Industrial Engineering 101
592-598.
Hashem, I. A.T., I. Yaqoob, N. B. Anuar, S. Mokhtar, A. Gani, S. U. Khan. 2015. The rise of “big data”
on cloud computing: Review and open research issues. Information Systems 47 98-115.
Ho, T. H., N. Lim, S. Reza, X. Xia. 2017. Causal inference models in operations management.
Manufacturing and Service Operations Management 19(4) 509-525.
Hu, H., Y. Wen, T. Chua, X. Li. 2014. Toward scalable systems for big data analytics: A technology
tutorial. IEEE Access 2 652-687.
Huang, S., W.A. Chaovalitwongse. 2015. Computational optimization and statistical methods for big
data analytics: Applications in neuroimaging. INFORMS Tutorials in Operations Research 2015
71-88.
Huang, T., J.A. Van Mieghem. 2014. Clickstream data and inventory management: Model and
empirical analysis. Production and Operations Management 23(3) 333-347.
Jamshidi, A., S. Faghih-Roohi, S. Hajizadeh, A. Núñez, R. Babuska, R. Dollevoet, Z. Li, B. De Schutter.
2017. A big data analysis approach for rail failure risk assessment. Risk Analysis 37(8) 1495-1507.
Kaur, H., S.P. Singh. 2017. Heuristic modeling for sustainable procurement and logistics in a supply
chain using big data. Computers and Operations Research. Forthcoming.
Kershenbaum, A. 1997. When genetic algorithms work best. INFORMS Journal on Computing 9(3)
254-255.

Lau, R. Y. K., J. L. Zhao, W. Zhang, Y. Cai, E. W. T. Ngai. 2015. Learning context-sensitive domain
ontologies from folksonomies: A cognitively motivated method. INFORMS Journal on Computing
27(3) 561-578.
Lau, R. Y. K., W. Zhang, W. Xu. 2017. Parallel aspect-oriented sentiment analysis for sales forecasting
with big data. Production and Operations Management. Forthcoming.
Li, J., F. Tao, Y. Cheng, L. Zhao. 2015. Big data in product lifecycle management. International Journal
of Advanced Manufacturing Technology 81 667-684.
Li, L., T. Chi, T. Hao, T. Yu. 2016. Customer demand analysis of the electronic commerce supply chain
using big data. Annals of Operations Research. Forthcoming.
Liu, P., S.P. Yi. 2017. A study on supply chain investment decision-making and coordination in the big
data environment. Annals of Operations Research. Forthcoming.
Liu, X., P.V. Singh, K. Srinivasan. 2016. A structured analysis of unstructured big data by leveraging
cloud computing. Marketing Science 35(3) 363-388.
López-Cuevas, A., J. Ramírez-Márquez, G. Sanchez-Ante, K. Barker. 2017. A community perspective
on resilience analytics: A visual analysis of community mood. Risk Analysis 37(8) 1566-1579.
Lorca, Á., M. Çelik, Ö. Ergun, P. Keskinocak. 2017. An optimization-based decision-support tool for
post-disaster debris operations. Production and Operations Management. Forthcoming.
Lu, S., L. Xiao, M. Ding. 2016. A video-based automated recommender (VAR) system for garments.
Marketing Science 35(3) 484-510.
Lv, Y., Y. Duan, W. Kang, Z. Li, F. Wang. 2015. Traffic flow prediction with big data: A deep learning
approach. IEEE Transactions on Intelligent Transportation Systems 16(2) 865-873.
Morales, D.R., J. Wang. 2010. Forecasting cancellation rates for services booking revenue
management using data mining. European Journal of Operational Research 202 554-562.
Mukherjee, U.K., K.K. Sinha. 2017. Product recall decisions in medical device supply chains: A big
data analytic approach to evaluating judgement bias. Production and Operations Management.
Forthcoming.
Nguyen, T., L. Zhou, V. Spiegler, P. Ieromonachou, Y. Lin. 2017. Big data analytics in supply chain
management: A state-of-the-art literature review. Computers and Operations Research.
Forthcoming.
Nikolopoulos, K., F. Petropoulos. 2017. Forecasting for big data: Does suboptimality matter?
Computers and Operations Research. Forthcoming.
Papadopoulos, T., A. Gunasekaran, R. Dubey, N. Altay, S. J. Childe, S. Fosso-Wamba. 2017. The role of
big data in explaining disaster resilience in supply chains for sustainability. Journal of Cleaner
Production 142 1108-1118.

Passacantando, M., D. Ardagna, A. Savi. 2016. Service provisioning problem in cloud and multi-cloud
systems. INFORMS Journal on Computing 28(2) 265-277.
Potvin, J.Y. 2009. Evolutionary algorithms for vehicle routing. INFORMS Journal on Computing 21(4)
518-548.
Richtarik, P., M. Takac. 2016. Parallel coordinate descent methods for big data optimization.
Mathematical Programming – Series A 156 433-484.
Rust, R.T., M.H. Huang. 2014. The service revolution and the transformation of marketing science.
Marketing Science 33(2) 206-221.
Sagaert, Y. R., E. Aghezzaf, N. Kourentzes, B. Desmet. 2017. Temporal big data for tactical sales
forecasting in the tire industry. Interfaces. Forthcoming.
See-To, E.W.K., E.W.T. Ngai. 2016. Customer reviews for demand distribution and sales nowcasting:
A big data approach. Annals of Operations Research. Forthcoming.
Shang, Y., D. Dunson, J. S. Song. 2017. Exploiting big data in logistics risk assessment via Bayesian
nonparametrics. Operations Research. Forthcoming.
Simchi-Levi, D. 2017. The new frontier of price optimization. MIT Sloan Management Review 59(1)
22-26.
Song, M. L., R. Fisher, J. Wang, L. Cui. 2016. Environmental performance evaluation with big data:
Theories and methods. Annals of Operations Research. Forthcoming.
Strehl, A., J. Ghosh. 2003. Relationship-based clustering and visualization for high-dimensional data
mining. INFORMS Journal on Computing 15(2) 208-230.
Sun, F., G. Huang, Q. M. J. Wu, S. Song, D. C. Wunsch II. 2017. Efficient and rapid machine learning
algorithms for big data and dynamic varying systems. IEEE Transactions on Systems, Man and
Cybernetics – Systems 47(10) 2625-2626.
Tarvin, D. A., L. Sipeki, A.M. Newman, A.S. Hering. 2017. Lessons learned from a company dealing
with big data. Interfaces. Forthcoming.
Takaishi, D., H. Nishiyama, N. Kato, R. Miura. 2014. Toward energy efficient big data gathering in
densely distributed sensor networks. IEEE Transactions on Emerging Topics in Computing 2(3)
388-397.
Van Jaarsveld, W., A. Scheller-Wolf. 2015. Optimization of industrial-scale assemble-to-order
systems. INFORMS Journal on Computing 27(3) 544-560.
Waller, M.A., S.E. Fawcett. 2013. Data science, predictive analytics and big data: A revolution that
will transform supply chain design and management. Journal of Business Logistics 34(2) 77-84.

Wamba, S. F., S. Akter, A. Edwards, G. Chopin, D. Gnanzou. 2015. How ‘big data’ can make big impact:
Findings from a systematic review and a longitudinal case study. International Journal of
Production Economics 165 234-246.
Wang, G., A. Gunasekaran, E.W.T. Ngai. 2016. Distribution network design with big data: Model and
analysis. Annals of Operations Research. Forthcoming.
Wang, X., Y. He. 2016. Learning from uncertainty for big data. IEEE Systems, Man, and Cybernetics
Magazine 2(2) 26-32.
Wu, X., X. Zhu, G. Wu, W. Ding. 2014. Data mining with big data. IEEE Transactions on Knowledge
and Data Engineering 26(1) 97-107.
Xie, K., K. Ozbay, A. Kurkcu, H. Yang. 2017. Analysis of traffic crashes involving pedestrians using big
data: Investigation of contributing factors and identification of hotspots. Risk Analysis 37(8)
1459-1476.
Xing, E. P., Q. Ho, W. Dai, J. K. Kim, J. Wei, S. Lee, X. Zheng, P. Xie, A. Kumar, Y. Yu. 2015. Petuum: A
new platform for distributed machine learning on big data. IEEE Transactions on Big Data 1(2)
49-67.
Xue, Z., Z. Wang, M. Ettl. 2016. Pricing personalized bundles: A new approach and an empirical study.
Manufacturing and Service Operations Management 18(1) 51-68.
Zhu, Q., J. Wu, M. Song. 2017. Efficiency evaluation based on data envelopment analysis in the big
data context. Computers and Operations Research. Forthcoming.

Appendix: Big Data Architectures for OM problems

Data Sources X: “X” can be web, social media, sensors, corporate database, etc.
Analytic Techniques Y: “Y” can be machine learning methods (e.g., neural networks), optimization
models, statistical models, data mining approaches, etc.

Strategies Z: “Z” can be distributed and parallel processing, feature selection, statistical inference,
etc.

Multiple Architectures M: “M” can include big data architectures 1, 2, 3 or a mix of them.

[Figure 5.1 (diagram). Components shown: Data Sources X; Software agents in the workstation;
Corporate database; Strategies Z with batch processing (e.g., scalable servers hosting virtual
machines; distributed and parallel batch processing); Analytic techniques Y; Output.]

Figure 5.1. Big data architecture 1 (BDA 1) with batch processing.
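
As an illustration only, the batch processing idea behind BDA 1 can be sketched in Python. The sketch below is not taken from any system described in this paper: the record format (product, units sold), the function names, and the use of a thread pool as a stand-in for distributed servers are all assumptions made for the example. It partitions the collected records, processes each partition in parallel, and merges the partial results before they would be passed on to the analytic techniques Y.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Partition the records into roughly equal chunks (hypothetical helper)."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division, at least 1
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Per-chunk work: total units sold per product within one chunk."""
    totals = Counter()
    for product, units in chunk:
        totals[product] += units
    return totals


def batch_demand_totals(records, n_workers=4):
    """Run the per-chunk work in parallel, then merge the partial totals."""
    chunks = split_into_chunks(records, n_workers)
    # A thread pool stands in for the scalable servers; this is an assumption.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    combined = Counter()
    for partial in partials:
        combined.update(partial)  # Counter.update adds counts together
    return dict(combined)


# Hypothetical sales records gathered from the data sources X.
sales = [("A", 3), ("B", 2), ("A", 5), ("C", 1), ("B", 4)]
totals = batch_demand_totals(sales, n_workers=2)
```

In a real deployment the chunks would reside on different machines and the merge step would run on a coordinating node; the structure of the computation would be similar.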

[Figure 5.2 (diagram). Components shown: Data Sources X; Software agents in the workstation;
Corporate database; Strategies Z with real time stream processing (e.g., ILNC, feature
selection); Analytic techniques Y; Output.]

Figure 5.2. Big data architecture 2 (BDA 2) with real time processing.

[Figure 5.3 (diagram). Components shown: Data Sources X; Software agents in the workstation;
Corporate database; Batch processing with Strategies Z1; Batch processing view; Real time
stream processing with Strategies Z2; Real time stream processing view; Analytic techniques
(e.g., neural networks, optimization); Output.]

Figure 5.3. Big data architecture 3 (BDA 3) with real time processing and batch processing together.
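
To make the combination of a batch processing view and a real time stream processing view in BDA 3 more concrete, a minimal Python sketch follows. It rests on stated assumptions rather than on any implementation from this paper: the class name HybridDemandView, its methods, and the (product, units) record format are hypothetical. New records update a small real time view immediately, a periodic batch run recomputes totals from the full log (a stand-in for the corporate database), and a query merges the two views.

```python
from collections import defaultdict


class HybridDemandView:
    """Toy combination of a batch view and a real time view (hypothetical)."""

    def __init__(self):
        self.master_log = []                   # every record received so far
        self.batch_view = {}                   # totals as of the last batch run
        self.realtime_view = defaultdict(int)  # totals since the last batch run

    def ingest(self, product, units):
        """Store the record and update the real time view immediately."""
        self.master_log.append((product, units))
        self.realtime_view[product] += units

    def run_batch(self):
        """Recompute the batch view from the full log, then reset the real time view."""
        totals = defaultdict(int)
        for product, units in self.master_log:
            totals[product] += units
        self.batch_view = dict(totals)
        self.realtime_view.clear()

    def query(self, product):
        """Merge both views so the answer reflects all records seen so far."""
        return self.batch_view.get(product, 0) + self.realtime_view.get(product, 0)


view = HybridDemandView()
view.ingest("A", 3)
view.ingest("B", 2)
before_batch = view.query("A")   # served from the real time view only
view.run_batch()
view.ingest("A", 4)
after_batch = view.query("A")    # batch view plus the newer real time record
```

The design choice illustrated here is that the batch run can be slow and thorough while queries stay current, because records that arrive between batch runs are covered by the real time view.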

[Figure 5.4 (diagram). Components shown: Output data generated by multiple architectures M;
Data Sources X; Corporate database; Analytic techniques Y; Output.]

Figure 5.4. Big data architecture 4 (BDA 4) with multiple data sources (from X and multiple
architectures M).
