

International Journal of Production Economics 193 (2017) 737–747


Toward understanding outcomes associated with data quality improvement


Benjamin T. Hazen a, Fred K. Weigel b,*, Jeremy D. Ezell c, Bradley C. Boehmke a, Randy V. Bradley d

a Department of Operational Sciences, Air Force Institute of Technology, Wright-Patterson Air Force Base, OH, USA
b Department of Management, Lipscomb University, Nashville, TN, USA
c Department of Computer Information Systems and Business Analytics, James Madison University, Harrisonburg, VA, USA
d Department of Marketing and Supply Chain Management, The University of Tennessee, Knoxville, TN, USA

ARTICLE INFO

Keywords:
Data quality
Quality control
Supply chain innovation
Aviation logistics
Case study

ABSTRACT

Business analytics is driving the way organizations compete. However, the decisions made as a result of any analytics process depend greatly on the quality of the data on which they are based. Scholars suggest several data quality improvement methods, and commercial software is now available that can help improve data quality. Although organizations are motivated to find ways to improve analytic capabilities, and although using these methods has been shown to improve some measures of data quality, there is little understanding of the tangible and intangible outcomes of employing these methods. In this study, we explore outcomes that arise from a data quality improvement process implementation in an operations management environment. Over a three-year period, we conducted a longitudinal single case study at an organization that maintains a large fleet of aircraft, collecting and analyzing qualitative interviews and observations. The findings suggest outcomes (both positive and cautionary) associated with the implementation of data quality improvement processes. These include increased stakeholder commitment to data quality and business analytics, as well as an over-emphasis on program metrics to the peril of operational outcomes, among others. From our findings, we construct a research agenda that includes testable propositions regarding outcomes of data quality improvement.

1. Introduction

1.1. Motivation for study

Decision-making professionals within modern organizations provide management with strategic analysis and operational recommendations based on data collected and stored by the firm. The volume of data flowing through the contemporary firm can be overwhelming, motivating new ways of thinking about how data are organized and analyzed (Hey, 2010). The importance of high-quality data for decision-making provides a strong impetus for organizations to actively improve all aspects of their business analytics activities. Scholarly research increasingly suggests that business analytics is quickly becoming a separate capability, which an organization can leverage to compete in the marketplace (Akter et al., 2016; Chen et al., 2012; Davenport, 2006). For example, LaValle et al. (2010) suggest that an organization's analytics capability is a significant marketplace differentiator; the top performing firms in their study were twice as likely as low performing firms to apply analytics to their business decision-making processes. Gathering, storing, and making sense of voluminous data through knowledge generation is a complex process (Wamba et al., 2015), however, and there are many opportunities throughout the process for data integrity to be compromised.

High-quality data are paramount to optimal analytics performance, particularly decision-making effectiveness. The ramifications of low-quality data have been noted in the literature for decades (e.g., Wang and Strong, 1996), and efforts continue today to find ways to provide higher quality data and more contextual information about the data to improve decisions (e.g., Price and Shanks, 2011). For instance, research has focused on a variety of means to monitor and enhance data quality (Lenz and Borowski, 2010; Pierchala et al., 2009; Redman, 1992, 2001, 2008; Sparks and OkuGami, 2010a,b). But even monitoring and awareness of the sub-par quality of data used for decision-making are not sufficient unless the organization takes active steps to correct data quality issues. It is this active effort to enhance data quality, that is, to improve its completeness, accuracy, consistency, and timeliness (Batini et al., 2009; Haug and Arlbjørn, 2011; Jones-Farmer et al., 2014), that should

* Corresponding author. One University Park Drive, Department of Management, Nashville, TN 37204-3951, USA.
E-mail addresses: Benjamin.hazen@live.com (B.T. Hazen), fred.weigel@lipscomb.edu (F.K. Weigel), ezelljd@jmu.edu (J.D. Ezell), bradleyboehmke@gmail.com (B.C. Boehmke),
rbradley@utk.edu (R.V. Bradley).

http://dx.doi.org/10.1016/j.ijpe.2017.08.027
Received 27 September 2016; Received in revised form 15 July 2017; Accepted 30 August 2017
Available online 4 September 2017
0925-5273/© 2017 Elsevier B.V. All rights reserved.

lead to more effective decision-making augmented by improved analytic performance. Given the importance of data for strategic decision-making by the modern firm, actions taken by the firm to improve data quality should be impactful in many areas of the firm, in both positive and negative ways. Unfortunately, extant literature provides little evidence regarding the consequences of implementing data quality improvement methods.

1.2. Research objective and question

Although it is intuitive that efforts to enhance the completeness, accuracy, consistency, and timeliness of data should lead to positive outcomes, current research insufficiently addresses this conjecture. Research must move beyond merely improving data monitoring and awareness activities. Rather, leadership and managers need to understand the impacts gained from data quality improvement projects and processes. Consequently, the purpose of this study is threefold: (1) to investigate outcomes realized from the adoption and employment of data quality improvement initiatives; (2) to illustrate the after-effects and consequences of such actions; and (3) to clarify how these quality improvement steps impact decision-making outcomes in the organization. Thus, our primary research question is: How does the employment of data quality improvement methods impact a firm's analytics and business processes?

To answer this question, we approached the study from a post-positivist worldview (Creswell, 2013). This worldview is appropriate because it was our intent to condense the ideas we identified into a small group of testable propositions; we also felt that using multiple methods of observing and analyzing the phenomenon was necessary for the clearest understanding (Benbasat et al., 1987; Creswell, 2013; Dube and Pare, 2003). Additionally, having access to view the creativity of personnel at all levels of the organization leads to a rich yield of data (Voss et al., 2002). Thus, the knowledge generation of the organization, coupled with the direct observations of an embedded researcher, provides for superior theory generation compared with analysis from an outside perspective, such as what might be found through survey data collection (Benbasat et al., 1987; Voss et al., 2002). Finally, we believed taking a more inclusive, holistic view of the organization would provide greater insights into the data quality improvement methods (Voss et al., 2002). In contrast, we concluded that an external view of the organization would not provide sufficient insight to properly understand the phenomenon or develop strong propositions.

Drawing from the post-positivist worldview, we address the research question through a longitudinal case study that highlights both positive and cautionary impacts associated with the implementation of a data quality improvement program. From our findings, we construct a research agenda that includes testable propositions regarding outcomes of data quality improvement.

The remainder of this article begins with a literature review of the concepts behind data quality, its measures, its impact on the business analytics cycle, and extant data quality improvement methods. Subsequently, we review the case study methodology that guided our investigation, data collection, and analysis. In our findings section, we present our results and results-based propositions that we hope will guide future research and scholarship in this area. After that, we discuss both the theoretical and managerial implications of the study results and conclude with remarks upon the study's limitations and potential future research activities.

2. Background and literature review

2.1. Background: data quality monitoring and improvement

As technology evolves at its characteristically rapid pace, information systems' capacity to store more data, from more sources, yielding ever-larger datasets requiring ever more sophisticated analyses (Wang et al., 2016), grows just as fast. The analysis of large data sets is a growing and competitively necessary trend in business and industry (Chen et al., 2012). While business analytics continues to proliferate, the nature and structure of the data are changing as a firm's information infrastructure evolves from one that primarily stores structured quantitative data to one that relies on a combination of structured quantitative and unstructured qualitative data (Tan et al., 2015). As the types of data and data sources increase, so does the potential for data quality problems (Hazen et al., 2014).

The quality of decisions within the firm depends on the quality of the data that are analyzed (Dyson and Foster, 1982; Warth et al., 2011). Data quality issues have long been present in organizations and negatively impact even the simplest analytics efforts (Porter and Rayner, 1992), such as using data warehousing and evidence-based, data-grounded analysis for organizational transformation (Cooper et al., 2000). These problems are increasing in proportion to the scale of information technology and data usage within the modern business enterprise. Data quality problems arise from the activities surrounding the collection, storing, retrieval, and processing of data within the firm, areas which are part of the overall business analytics cycle (Dutta and Bose, 2015). It is these activities, then, that should be the focus of any data quality improvement processes.

A review of the literature reveals numerous articles discussing the problem of data quality (Ballou and Pazer, 1985; Ballou et al., 1998; Huang et al., 1999; Pipino et al., 2002; Redman, 1996; Wang and Strong, 1996). Both the academic and practitioner literature have long stated the need for improved data quality; however, the literature regarding applied methods for measuring, monitoring, and controlling data quality is still in the early stages of development (Bose, 2009; Warth et al., 2011). Whereas the use of methods to monitor and improve the quality of manufacturing and service processes is well researched and implemented in practice, there has been only limited use of methods to monitor and improve the quality of the data used to manage these manufacturing and service processes. Even more limited is research on the implications of using such data quality improvement methods.

Although this literature stream is still emerging, there are some proposed means by which a firm can monitor and improve its overall data quality. For instance, statistical process control charts have evolved to consider more complex and sophisticated processes and data types (Ho and Quinino, 2013; Jones-Farmer et al., 2014; Ou et al., 2012; Wu et al., 2009). Control charts have been introduced to speed the signaling process, increase detection probabilities, minimize false alarms, control for multidimensional process measures, consider complex data structures, and work in high-dimensional, online applications. More comprehensive applications of data quality improvement are included in the work of Redman (1992, 1996, 2001), who recommends using methods to monitor accuracy over time. Pierchala et al. (2009) published a report on data quality improvement methods used by the National Highway Traffic Safety Administration to monitor data quality in the Fatality Analysis Reporting System and identified some measures of data quality improvement.

Sparks and OkuGami (2010a,b) considered a more sophisticated method for flagging biased measures in monitoring the consistency of large volumes of data. They made the distinction between spatial, temporal, and multivariate consistency within data and recommended three different types of monitoring methods for different scenarios. In another application, Lenz and Borowski (2010) developed an automatic error tracking and correction application based on a commercially available data warehouse integration, auditing, and reporting tool.

Although scholars have suggested the use of data quality control mechanisms, these applications have not been widely adopted in practice. In many cases, the implementations of such control processes are rudimentary. In other cases, these applications fail to account for developments regarding the theoretical and practical dimensions of data quality present in the literature (Ezell et al., 2016). This might be due, in part, to a lack of evidence to suggest the efficacy and tenability of using data quality improvement processes in practical business settings. This research examined the implementation and use of an application similar to that proposed by Lenz and Borowski (2010), which combines automated error-finding with manual data correction.

2.2. Data and decision quality improvement

Quality improvement methods have long been closely associated with process improvement initiatives (Mitra, 2008; Roberts, 1959), including applications in information systems (Curtis et al., 1992; Weller, 2000). These methods can be employed to identify areas or processes within the overall business analytics lifecycle (Dutta and Bose, 2015) that are in need of improvement, with a focus on controlling and improving data quality. Once problems are properly diagnosed, perhaps through root-cause analysis or other quality improvement methods, solutions can be implemented to improve data quality. Similar to what is suggested by Pierchala et al. (2009), once process issues have been ruled out as causes of poor data quality, other issues can be investigated for potential improvements, including technological and information system issues. Bartlett (2013) discusses the importance of data management in the analytically-enabled organization and notes how specific data quality improvement approaches can be paired with traditional summary statistics to monitor the quality of data as it enters the data warehouse (2013, p. 241).

With competitive business decisions increasingly made based on insights extracted from the analysis of massive data sets as opposed to traditional statistical sampling (Mayer-Schönberger and Cukier, 2013), the use of sophisticated structured data quality monitoring tools is essential for the modern enterprise. It intuitively follows that the quality of data used for decision-making is related to the quality of the decision made when using such data. This relationship has received some discussion in the literature (Ballou and Tayi, 1999; Cooper et al., 2000; Fisher and Kingma, 2001; Forgionne, 1999; Raghunathan, 1999; Shin, 2003; Wang and Strong, 1996; Wixom and Watson, 2001); yet, empirical evidence that delineates this relationship is deficient.

Firms experience organizational learning when data gathered or generated (from internal or external sources) are applied to organizational processes (Argyris and Schön, 1999; Cohen and Levinthal, 1990; Cook and Yanow, 1993; Crossan et al., 1999). When these data are increasingly used to guide future learning and organizational activities, firms improve their absorptive capacity (Pavlou et al., 2006; Roberts et al., 2012). The gathering and storing of data is a critical part of the business analytics cycle, and organizations continually need to gather and create new information to innovate, remain competitive, and stay ahead of market peers (Castiaux, 2007; Drucker, 1991; Grant, 1996a). The absorptive capacity literature highlights the positive impact that this continuous data gathering can have not only on data quality but also on the firm's ability to integrate those data into its routines and processes (Roberts et al., 2012). With higher volumes of data, whose quality the firm continuously acts to improve, decision makers throughout and beyond the organization can potentially make better decisions (Shin, 2003; Wixom and Watson, 2001).

2.3. Behavioral and analytics cycle improvements

Management support is consistently recognized as contributing to increased performance for firm processes and initiatives in which such support is garnered (e.g., Hwang and Schmidt, 2011; Wixom and Watson, 2001). This aligns well with expectation-value theories, which link individual choices and performance to the value individuals place on a task and their cognitive expectation of success (Wigfield and Eccles, 2000). The literature suggests that implementation of a data quality improvement program serves as both a proxy for and a signal of management support of enhanced data quality specifically, and of the business analytics process in general. If the top management team voices support for a corporate initiative and allocates resources to support that initiative, organizational members can reasonably be expected to value that initiative and see it as critical to their personal and professional success. Management scholars have also focused on the development of appropriate metrics for monitoring critical processes (Port and Bui, 2009; Singh et al., 2009), with the understanding that established metrics can motivate employee behavior (Rungtusanatham, 2001). In short, behavior within the organization is expected to be driven by what is measured (Fenton and Pfleeger, 1998; Powell and Woerndl, 2008) and by what is visibly supported by the top management team (Hwang and Schmidt, 2011; Wigfield and Eccles, 2000; Wixom and Watson, 2001).

When levels of data quality are unknown, those using the data to support decision-making must spend time and effort to verify their data, correct any issues, or find alternate data sources during the decision process (Forgionne, 1999). To this end, information signaling the quality of data has been shown to be useful (Fisher and Chengalur-Smith, 2003). However, gathering such information comes at a cost and can often slow the decision-making process (Even et al., 2010a, 2010b; Price and Shanks, 2011). A reasonable expectation, stemming from the literature, is that organizations actively engaged in processes targeting increases in data quality will off-load this labor from decision makers and speed up strategic and operational decision-making.

For firms to realize the benefits of using business analytics, the data systems themselves must be viewed as useful and relied upon (i.e., used) to make quality decisions. Delone and McLean's (1992, 2003) notion of information systems success provides a foundation for this concept. One of the tenets of the Information Systems Success Model lies in the interrelationship between information quality and system quality in leading to intention to use and user satisfaction, which together produce the overall net benefits (Weigel and Hazen, 2014).

Isson and Harriott (2013) note the importance of governance of data as they are produced and used throughout the organization, calling data quality and the understanding of the organization's foundational processes for ensuring high-quality data "the heart of business analytics" (p. 77). When poor data quality is rampant and individuals within the organization feel that they cannot count on quality data, use of business analytics tools to make decisions will likely decrease. Cooper et al. (2000), Shin (2003), and Wixom and Watson (2001) provide notable examples, among others (see Ballou and Tayi, 1999), of the negative effect poor data quality can have on users' perceptions and use of information technology systems and tools implemented for business analytics. This idea is based partly on tenets of the Technology Acceptance Model (Davis, 1989; Davis et al., 1989), which has its basis in the Theory of Reasoned Action (Fishbein and Ajzen, 1975). When an information technology is perceived as being useful, as may be the case when data quality is perceived to be high, individuals are more likely to use it (Venkatesh and Bala, 2008; Venkatesh et al., 2012; Weigel et al., 2009). Also, organizational focus on data quality, along with resultant process changes, improves the work environment and increases feelings of empowerment in employees who use the analytical tools, which subsequently increases tool usage (Kim and Gupta, 2014).

Redman (2008) makes clear that efforts to improve data quality require an organization-wide effort, and the benefits yielded from this effort can be plentiful. Data flow through the firm much as materials flow through a manufacturing process (Ballou et al., 1998; Wang and Kon, 1993), and their improvement is not limited to just one department or division (Kahn et al., 2002). The issue of data quality in the enterprise is one foundational part of an enterprise's business analytics plan (Bartlett, 2013), moving the organization further along its path through the analytics maturity model (Davenport and Harris, 2007; Isson and Harriott, 2013). Efforts to improve data quality enable those throughout the company to assess not just where data originate and how they are processed and stored, but what data should be collected. Improved data quality is not the only antecedent to better decision-making (McAfee and Brynjolfsson, 2012), but such improvements, driven by managerial focus and changes to organizational culture, do play an important role (Lin and McDonough, 2011).
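The control-chart approach to data quality monitoring discussed in Section 2.1 can be made concrete with a small sketch. This is illustrative code rather than anything from the study or the cited works: it applies a classic p-chart (proportion nonconforming) to hypothetical daily counts of records failing validation, signaling days whose error proportion falls outside three-sigma control limits.

```python
import math

def p_chart_signals(failed, totals):
    """Return indices of days whose error proportion breaches 3-sigma limits."""
    p_bar = sum(failed) / sum(totals)  # pooled proportion of nonconforming records
    signals = []
    for i, (x, n) in enumerate(zip(failed, totals)):
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)  # per-day standard error
        ucl = p_bar + 3 * sigma                     # upper control limit
        lcl = max(0.0, p_bar - 3 * sigma)           # lower limit, floored at zero
        if not lcl <= x / n <= ucl:
            signals.append(i)
    return signals

# Hypothetical daily record counts and validation failures; day 5 spikes.
totals = [800, 820, 790, 810, 805, 815, 795]
failed = [24, 26, 22, 25, 23, 70, 21]
print(p_chart_signals(failed, totals))  # → [5]
```

In practice, the control limits would be estimated from an in-control baseline period rather than from the pooled data being judged, and the more sophisticated charts the cited works develop (multivariate, high-dimensional, online) would replace this basic form.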


Finally, efforts to improve the quality of data used by the firm are not 25,000 records per month at this location.
entirely related to improved decision-making in a linear fashion. With its
roots in economic theory, the law of diminishing returns—originally
formulated by Ricardo, Malthus, West, and Torrens during correspon- 3.2. Data collection
dence over land rent related to corn-wage rates (Hollander, 1997)—
provides a foundation for the concept that the organization must find a Fig. 1 details the timeframe of data collection activities that occurred
balance between needed data quality and efforts to achieve further im- during this study. We placed one researcher in the organization during
provements. The information systems and management literature have execution of the DIP to observe the transformation. The rest of the
noted how limited resources can act as an organizational factor, limiting research team focused on research design and data analysis for the
the extent and success of a technology or process (Cohen and Levinthal, duration of the study. The embedded researcher directly observed the
1990; Fichman, 2004b). For instance, the effort to achieve an 80% DIP activities, strengthening the overall participatory aspects and rigor of
quality improvement might be worth the benefit, but increasing from the study. Embedding the researcher in the organization has its basis in
80% to 90% might come at a cost that is not, and could potentially impact general systems theory, with a view that phenomena can only be un-
the visibility of success of the data quality improvement project itself. derstood by viewing it as a whole rather than looking at individual parts
The literature suggests that organizations must determine how far they in isolation (Bertalanffy, 1968). The extension of the general systems
need to drive their data quality efforts, and at what point further efforts theory for this study flows from this concept that by placing the
will not yield entirely beneficial results. researcher in the system in a participatory role as well as a simultaneous
researcher role, the understanding would be greater than an external
view only (McIntosh, 2010; Patton, 2015).
3. Method
The embedded researcher made observations over a three-year
period. The first six months preceded the establishment of the DIP.
In consideration of the nature of the phenomenon of interest, we
During that period, the researcher gathered observations and informa-
chose an inductive single case study approach for our analysis (Patton,
tion regarding historical data quality to construct a baseline picture of the
2015; Yin, 2014). The focus of this longitudinal case study is on the data
before-state. Implementation began at the seven-month point, and
quality problem viewed in the operations management perspective. A
personnel immediately implemented all aspects of the DIP. However, it
U.S. Air Force (USAF) base in the continental United States that main-
took approximately six months of change management to affect changes
tains a large fleet of aircraft was the chosen site for this case study. The
at the organization. The final two years consisted of frequent review of
suitable unit of analysis is the organizational level and the appropriate
the process, and assessment of behaviors and outcomes. We objectively
commander granted permission for this study. We collected data
measured and compared data quality with before-state measures. The
regarding outcomes associated with the implementation of a data quality
researcher directly observed meetings and discussed changes and out-
improvement program. The base is representative of other USAF bases
comes with participants, to include management, DIP managers, and
throughout the world and is similar to civilian airfields from the
shop supervisors. Key participants also recorded and reported their ob-
perspective of airfield operations, aircraft maintenance, and operations
servations regarding behavioral changes and other outcomes.
management. Indeed, the general nature of the business practices and
In addition, using a purposive (nonprobability) sampling approach
operations supported by the data systems and data quality improvement
(Patton, 2015), the embedded researcher interviewed ten users repre-
program under investigation are similar to most production settings.
senting six different user bases over the course of nine weeks at the
2½-year-mark of the study. Of those interviewed, two informants are
3.1. Case study background
administrators who are responsible for AirData system management,
including the DIP. Also included in the sample of interviewees were DIP
With the assistance of a variety of interrelated information systems,
monitors, who are responsible for correcting errors in the database.
the USAF manages the world's largest fleet of aircraft. AirData (a pseu-
Maintenance technicians have, as their primary responsibility, the task of
donym) is the system dedicated to managing and documenting mainte-
inputting data after performing maintenance actions; therefore, the
nance activities for eight specific airframes and is used by over 30,000
researcher interviewed maintenance technicians as well. Interviews also
personnel worldwide. Data inputs are transmitted from each base into a
included maintenance managers who are responsible for overseeing the
centralized mainframe, where data are assimilated and interfaced with a
maintenance technicians and using reports to make operational-level
variety of other USAF and Department of Defense systems. Outputs of
AirData consist of informational reports that assist managers in making
fleet management and aircraft sustainability decisions, as well as aircraft
status determinations that directly impact operational capabilities. 6 Months Before Adoption:
Of note, AirData data are not used solely for routine reports; they are Pre-Improvement Observations
used for a myriad of purposes. In addition to scheduled reporting pur-
poses, for example, the data are used to document and predict fleet-wide Immediately Preceding
aircraft maintenance requirements. Additionally, operations managers at adoption:
higher echelons forecast aircraft availability using AirData data, which Baseline Assessment of Data
informs decisions regarding overseas operations, deployments, and other
long-term missions. Using AirData data, operations managers at the base
Post-Adoption:
level can determine the ideal aircraft employment schedule. Illustrating
Observations Regarding Data
another use of the data, logistics managers discern the best locations at Improvement Process
which to stage and house each aircraft for maximum effectiveness Implementation and Management
and efficiency.
Before being transmitted to the mainframe, data from the base are
subjected to a Data Integrity Program (DIP) process. This process consists 2 Years Post-Adoption:
of two steps. First, an automated program flags data records that are Improvement Process Review
suspected of containing errors regarding accuracy, consistency, and Outcome Assessment
completeness, and timeliness. Second, appointed DIP monitors manually
assess the flagged record to determine if there is indeed an error and
make any required corrections. The AirData system processes roughly Fig. 1. Major Milestones during Data Collection.

740
B.T. Hazen et al. International Journal of Production Economics 193 (2017) 737–747

decisions. To ensure a full spectrum of perspectives, the researcher interviewed senior managers who are responsible for tactical-level decision-making. To round out the complement of data consumers and decision-makers, the researcher included in the interviewee pool senior executives who use AirData to make strategic-level aircraft fleet decisions. The purpose of these interviews was to gather additional data to refine the coding scheme and emerging nomological network, as well as to ensure that no other significant phenomena were excluded. Therefore, the researcher employed both semi-structured and unstructured interview techniques. As part of the interview, there were five common questions that the researcher asked all participants (see Appendix); subsequently, the researcher asked participants to talk freely about anything else related to the DIP.

3.3. Validity and reliability

Validity and reliability were considered at all phases of the research. To establish construct validity, our assessment included multiple sources of evidence (Krippendorff, 2004). As previously mentioned, we captured archival data from AirData and related systems, interviews, participant observations, direct observations, and before/after data regarding data quality. Further, the researcher interviewed employees at all levels related to AirData and the DIP. To strengthen our analysis, key informants at the organization reviewed drafts of the case study for errors in content and chronology (Yin, 2014). The feedback of these informants was paramount in the refining process and considerably improved the accuracy of the findings.

Participants of the study also assessed the data; seeking and integrating their feedback in assessing the findings were essential to the rigor of the study. To ensure internal validity, we addressed rival explanations in the findings of this study (Shadish et al., 2002, p. 508). In addition to including the participants in assessing the data, we used power quotes from the interviews; our intent was to develop the Propositions as a means to demonstrate support for cause-effect relationships.

The approach to enhancing external validity of this single case study was to use theory and literature to aid in Proposition development. Comparing, contrasting, and integrating findings with extant theory and developing Propositions based on both extant theory and the emerging data allowed for more generalizable findings (Shadish et al., 2002). Focusing on enhancing reliability, we established and adhered to a detailed protocol for the case study (Yin, 2014, p. 240). Another view on reliability is to consider the consistency between the results and the data (Merriam and Tisdell, 2016, p. 251). To ensure this consistency, we triangulated results through multiple methods of data collection and analysis; by providing a thorough audit trail of the process, we strengthened the study further (Merriam and Tisdell, 2016). Combined with the interviews and participant-observations, these are ample methods with which we sufficiently triangulated multiple sources of data to enhance the validity and reliability of the findings (Merriam, 1998; Yin, 2014).

4. Results and discussion

The analysis and discussion take the form of Proposition development and presentation. We describe the findings in relation to extant research and theory to form each proposition. The findings are presented in this section to describe implications for practitioners and scholars.

4.1. Data and decision quality improvement

In the case study interviews, organizational members noted that poor data quality originated from initial data entry, not data systems or transfer processes. As part of the overall data improvement process in the organization, members first investigated root causes and identified base process issues (Pierchala et al., 2009). In this case study, solutions to address data quality problems included the addition of processes such as training for data entry and data management personnel. Whether found in root cause analysis or continuous monitoring and control, interview subjects reported that a focus on improving data quality through studying the overall process by which data are generated and flow through the organization (Curtis et al., 1992) led to better data quality. By understanding the overall process, organizations can focus on the aspects of data quality that are most important to the data consumers (Wang and Strong, 1996; Bitterer and Newman, 2007). Understanding these dimensions of data quality allows an organization to operationally define measures for these dimensions and apply tools to actively monitor for data quality problems (Hazen et al., 2014). Thus, supported by literature and as revealed through the case study interviews, we propose the following:

Proposition 1a. Organizations that implement data quality improvement methods will experience an increase in data quality.

In our study, we found that decision-makers realized that higher-quality decisions were an outcome of the DIP. Decisions regarding part-ordering, fault troubleshooting, aircraft mission planning, and similar decisions upon which data from AirData are based improved, as both decision-makers and those who manage decision-makers reported. As one manager mentioned, “I can actually count on the data now, instead of just guessing as to whether or not it can be trusted. The decisions I make are more accurate now, and that shows up in our metrics.” Another, more senior, manager relayed, “My parts managers are making better decisions, and I can see that across some of our metrics.” Indeed, managers in this case study overwhelmingly supported the notion that the enhanced levels of data quality improved their decision-making. These findings support extant literature, which states that the quality of data used for decision-making is related to the quality of the decision made when using such data (Ballou and Tayi, 1999; Cooper et al., 2000; Wang and Strong, 1996; Wixom and Watson, 2001). Thus, DIP efforts allow organizations to identify and remedy root causes of poor data quality, which can result in quality improvements for both the data and the decisions (Curtis et al., 1992; Weller, 2000; Pierchala et al., 2009).

Proposition 1b. Organizations that implement data quality improvement methods will experience increased decision quality.

Our analysis revealed that improvements to data quality throughout the organization could benefit not only data generated internally within the firm, but also the means by which the firm captures and interprets information from both inside and outside. Our interviews found that efforts to improve data quality improved the ability of the organization as a whole to learn, supporting the literature's suggested benefits of increased absorptive capacity of the firm (Roberts et al., 2012). At the organizational level, the findings of our study suggest that data quality improvement efforts grounded in changes to data generation and monitoring processes, and the information systems that enable them, yield benefits. Because data at the organizational level are transmitted and shared across other business units inside and outside of the USAF, the improved data at the focal organization were shown to have positive cascading effects, where the enhanced level of quality bolstered the view of data to that of a knowledge resource (Grant, 1996b). This is because the users of the data often saw data as having little value in its previous, poor-quality state; however, as quality improved, so did the usefulness of the data, both to the focal organization and its partners.

Although the unit of analysis is the organizational level, the research revealed both vertical and horizontal implications of instituting the DIP, which enabled other organizations to capitalize on the focal firm's knowledge resources. For instance, senior managers in a lateral organization noted how the higher quality data derived from the focal organization helped to forecast workload within their organization, and led to reduced inventory costs derived from more accurate service life information. Similarly, managers at a vertical organizational unit remarked that higher quality data achieved using the DIP allowed for faster, more


reliable decision-making. These findings support the literature that postulates that benefits from improved data quality are not limited to just one department or division, but can have cascading impacts across the entire value chain (Redman, 2008; Kahn et al., 2002).

Proposition 1c. Data quality improvement initiated by one organizational element will positively affect decision quality in intra- and inter-organizational elements with which the initiating element shares data.

4.2. Improvement in data quality: behavioral aspects

The research literature suggests that there are also indirect means by which data quality improvement methods can be used to enhance data quality and contribute to data-driven business analytics. The result of our analysis suggests that instituting data quality improvement initiatives is a means for top managers to demonstrate support for and emphasize the importance of business analytics in their organization, and to drive employee expectations (Hwang and Schmidt, 2011; Wigfield and Eccles, 2000; Wixom and Watson, 2001), which leads to greater commitment toward high-quality data. When management initiates a program for monitoring and improving data quality, this may indicate interest in the targeted data management processes, signaling to those who input and manage data throughout the organization that the process is a managerial priority. For instance, one technician stated, “now that we have to do [the DIP process], I assume leadership really cares about the data.” Another remarked, “I know that managers at headquarters are using a lot of data to make decisions. I suppose they need the data to be good to do that, so we're trying to do better.”

The DIP became a high-visibility item, and DIP compliance was emphasized through meetings and the establishment of DIP metrics. This emphasis provides a corollary to Wixom and Watson's (2001) finding of the association between high levels of data quality and high levels of perceived net benefits. One manager noted, “I think that the workforce is getting the point that the data doesn't just go into some ‘system’ and that people are actually relying upon it to support decisions and to make their lives easier, too.”

Proposition 2a. Organizations that implement data quality improvement methods will be viewed by their employees and others as being committed to high-quality data.

This level of management support also implies support of business analytics, which relies heavily on the quality of the data on which it runs. Thus,

Proposition 2b. The outcome of the aforementioned level of management support will be higher levels of data quality.

Proposition 2c. The outcome of the aforementioned level of management support will be higher levels of decision quality.

In a similar vein, use of data quality improvement methods implies specific metrics that are being monitored, as well as visible signals to employees that management is monitoring their ability to achieve those metrics (Rungtusanatham, 2001). We witnessed this phenomenon in our case study. Analysis of interview responses revealed that the metrics established with the use of the DIP had the effect of establishing a subjective norm among employees, affecting their focus because those who work with data felt that their actions were being closely monitored. Ajzen's theory of planned behavior (Ajzen, 1991, 2011) addresses this effect of subjective norm on one's intention toward a behavior, and ultimately the behavior itself. One supervisor noted, “if my section's DIP numbers are bad, then I have to explain it.” Another mentioned, “They are rating us on this now, so we have no choice but to do it.” As such, data workers provided greater attention to processes involved in data entry, compilation, and management. For instance, a technician mentioned, “it's easier to just do it right the first time. If the DIP says I have errors, then I just got to go in and do it over again.” The before-and-after data quality metrics support such improvement.

Additionally, the improvement may be impacted by the data workers' increased perceived behavioral control, or “perceived ease … of performing the behavior” (Ajzen, 1991, p. 188). At the commencement of the DIP tool, when initial data quality metrics were first tracked, error rates averaged above 30%. After implementation, error rates before correction average four percent and are corrected using the DIP to an average of 0.15%. This suggests that even the mere existence of the program itself continues to motivate data quality improvements and influence workers' attitudes toward enhancing data quality, indicating an effect of employees' attitudes toward behavior (Ajzen, 1991, 2011). Indeed, as one technician noted, “I never thought anyone looked at this stuff before. I guess they need [the data] to be right so that we may take our time now.”

Proposition 3a. Organizations that implement data quality improvement methods will experience positive behavior changes in their employees, particularly those involved in data handling processes.

Proposition 3b. This careful level of attention by employees will, in turn, lead to fewer errors overall.

Proposition 3c. This careful level of attention by employees will lead to a greater understanding of the need for high-quality data throughout the organization.

4.3. Analytics cycle improvements

In our case study, participant responses suggested support for Forgionne's (1999) note that decision makers must dedicate time and resources to clean data vital for analysis. For instance, one manager noted, “I used to spend a lot of time trying to figure out if the information was correct. This added several minutes—if not hours—when trying to determine the right maintenance actions to take.” Another manager said, “I used to have to go through the shop's logbooks to see what really happened, because you know, the [information system] was never right.” A supervisor mentioned previously having to validate certain tasks with the actual individuals who did the work to determine if the data were correct. Indeed, some research suggests that spending the time to find more accurate data is not always of benefit, due to the added time that it takes and the fact that one cannot always track down the correct data (Frank, 2008). In fact, current literature suggests that approximately 80% of the analytic cycle time is often consumed by cleaning the data and ensuring they are correct and properly prepared for analysis (Dasu and Johnson, 2003; Wickham, 2014). One finding of note is that some managers had to balance the anticipated benefit of verifying data with the need to make a timely decision. As such, tradeoffs had to be made. However, after the DIP was instituted, managers spend much less time verifying data and, if they still need to verify data (often due to having to make an important, high-dollar decision), they have better insight into these tradeoffs. Upon implementing data quality improvement methods, the focal organization realized higher quality data and information, which evoked further use and satisfaction and ultimately improved decision-making, again supporting the DeLone and McLean (2003) model of information systems success due to aspects of quality. This cycle continued over time, as witnessed by the embedded researcher, providing support for the feedback loops and interrelationships described in the Information Systems Success Model.

Proposition 4. Organizations that implement data quality improvement methods will experience reduced costs for verifying data quality. Organizations will also be better positioned to successfully balance costs of verification with costs of reduced quality to maximize resources, decision quality, and net benefits.

In our case study interviews, one manager noted, “… we are doing so much more with our data now. We've implemented new predictive analytics tools that we never would have been able to use before.” This finding supports recent literature where employees reported that the improved quality data makes the analytics methods and tools available to


them more useful (Venkatesh and Bala, 2008; Venkatesh et al., 2012) and increased feelings of empowerment on the part of the decision-making employees (Kim and Gupta, 2014). Through purposeful data improvement processes, managers may be able to identify and control problems with data quality by using data quality improvement methods before poor data quality becomes pandemic and discourages use. Our case study results and the literature lead us to propose the following:

Proposition 5a. Organizations that implement data quality improvement methods will experience an increase in the use of business analytics tools because the underlying data are perceived to be of high quality.

Proposition 5b. An increase in the use of business analytics tools due to the underlying perception that the data are of high quality will, in turn, lead to higher quality decisions.

If a situation exists in which data quality is already perceived as being poor, and many potential users are dissuaded from employing business analytics, demonstrating improvement in data quality may be one way in which management can energize their analytics efforts. Assuming that data quality improvement initiatives have commenced, quantifiable evidence of data quality improvement can be shared (as seen in the case study), which demonstrates improvements in data quality to users. This sense of increased reliability of data was shown to instill confidence in managers who wish to use the data for decision-making, but have been hesitant because of low confidence in the data and, subsequently, the decisions derived from such data. This finding complements Expectation-Value theories (Wigfield and Eccles, 2000) by suggesting that enhanced levels of confidence in data quality lead to increased confidence in the decisions made in consideration of such data.

Proposition 6a. Members of organizations that implement data quality improvement methods will experience an increase in confidence related to data quality.

Proposition 6b. An increase in confidence related to data quality will increase confidence in decision-making.

Proposition 6c. Confidence in decision-making can enhance decision processes and lead to more timely decisions.

Proposition 6d. Confidence in decision-making can enhance decision processes and lead to higher quality decisions.

Additionally, one manager remarked, “we're simply better at doing analytics now.” Another stated: “I wasn't a big proponent of using analytics before. This was due somewhat to not understanding the value. It was also due to the fact that I knew our data were [expletive].” Most in management agreed that the DIP was not a panacea, but they generally agreed that it was an antecedent to their employment of data analytics. This study found that higher confidence in the quality of data spurs company-wide discussions regarding the right questions that need to be asked, supporting Kahn et al. (2002) and Redman (2008). The information technology department can coordinate efforts to modify the infrastructure to alter data collection, storage, modeling, and analysis processes based on the refined set of questions being asked by the management team. Since organization-wide attention is already focused on issues with data and decision-making, communication of results from this refinement and reanalysis process is already occurring across divisional boundaries beyond the pre-DIP level in the case study organization. As found from the case study interviews, management saw little value in investing resources in a full analytics process until they perceived a level of quality in their data that justified the effort. This has been a common theme in industry research (Marsh, 2005). Because any DIP is a company-wide effort, aligning everyone's view of “reality,” all parts of the analytics cycle have the potential to be positively impacted.

Proposition 7. Increases in data quality driven by data quality improvement will yield better analytics cycle outcomes.

This study found that the DIP gave uniform attention to all data. Although some data were more critical to decision-making than other data, this distinction was never made clear. One maintenance technician noted, “why does anyone care about this [general maintenance task]?” Yet, a manager mentioned how he needs to make certain decisions with which most technicians are not familiar. Indeed, different data were used to support different decisions at different levels, and although most individuals understood the decisions that are important to subordinates, most were not familiar with the requirements of the levels above them. Thus, when instituting data quality improvement initiatives, it is important to communicate analytics needs across the organization. This helps organizations understand how information flows through, and is leveraged by, the organization. Extant literature suggests that by understanding the overall information flow, organizations can focus on the aspects of data quality that are most important to the various data consumers within the firm (Wang and Strong, 1996; Bitterer and Newman, 2007).

At the current time, such communication is largely absent at the case study organization. However, this communication need not be difficult. One suggestion is to develop documentation and guidance summarizing what data are required for which decisions at which levels of the organization; the generated documentation and guidance should then be disseminated across the organization. When DIPs are first implemented, organizational resources can be scarce. By delineating which data are most important for decision-making, the firm can limit resources to those data sources and systems that produce the most valuable data. Of course, the quality of all data in the organization can be improved given enough time and resources; but a focus on the most valuable and initially important data can provide a quick demonstration of the worth of improvement efforts. Although data-documenting materials would require periodic updates, such materials could go a long way toward routinizing and improving the business analytics cycle, because those at all levels would have a better understanding of how data are used. Additionally, data could be ranked not only by their use and value, but also by which data are of the poorest quality. Overall, firms should focus their initial data quality improvement efforts on the most compelling set of data used for decision-making within the firm. This supports the literature that suggests total data quality management should not be viewed as an end-state but, rather, as a process of continual improvement and treatment of data as an organizational resource (Eppler and Helfert, 2004; Haug et al., 2011; Marsh, 2005).

Proposition 8a. As the clarity and precision of the decision-making data requirement increases, data quality improves.

Proposition 8b. As communication of the pertinent data requirements increases vertically through the organization, data quality improves. Communication of data requirements can support data quality improvement efforts, as well as business analytics performance.

4.4. Potential drawbacks

Consistent with the Hawthorne Effect, we recognized that as personnel became more familiar with analytics—and with the accompanying extra attention from management—those using the DIP also gained proficiency in data analysis. As maintenance personnel recognized improvements in efficiency and effectiveness based on the value of data analytics, they became more focused on mastering their interaction with the DIP and the analysis process.

Conversely, we found that some managers focused too much on DIP results, to the point where DIP metrics were treated as equal to actual performance indicators. Although important, once any metric is on management's radar, there is a chance that it will be overemphasized. Managers seemed to address the “lack of” DIP compliance, even when it was in the 90+ percentile (vice the 100% completion required in the organization), in the same manner as low aircraft availability metrics (primary organizational performance measures). Subordinates were


sometimes more apt to focus on 100% completion of the DIP to the peril data are its ability to provide decision-makers with the effective insights
of other tasks, which may be have been more important. Data quality to choose from among options for potential organizational actions, a view
improvement initiatives represent another process and suite of metrics of data quality set firmly in the decision theory view (Berger, 1985; Karr
that must be managed. Managers responsible for such programs might et al., 2006). When taken together, the above Propositions suggest the
focus too much attention on the program vice the outcome by failing to alteration of data-generating organizational processes and inclusion of
consider cost, time, and quality levels appropriate for a project's objec- sophisticated information systems designed with a focus on improving
tives (Liberatore and Pollack-Johnson, 2013). Outcomes and achieve- data quality.
ment goals overall can be harmed when an over-emphasis on achieving To this end, Premkumar et al. (2005) specifically mention that one
metric or performance levels comes at the expense of helping organiza- strategy organizations facing competitive uncertainty and increased
tional members resolve underlying organizational issues (Choo, 2011). dependence on information for decision-making follow is to “redesign
This is often referred to as Goodhart's law, which suggests when a mea- business processes” and “[implement] integrated information systems
sure becomes a target, it ceases to be a good measure because “any that improve information flow and reduce uncertainty” (p. 260) (p. 260)
observed statistical regularity will tend to collapse once pressure is (p. 260) (p. 260) (p. 260) (p. 260). Further, Maier et al. (2012), note the
placed upon it for control purposes” (Goodhart, 1975a, 1975b). The more importance of a systematic assessment of organizational capabilities and
any quantitative indicator is used for decision-making, the more subject practices to provide a realistic view of the company's abilities and how
it will be to corruption pressures, and the more apt it will be to distort and they compare against major competitors before making process changes.
corrupt the processes it is intended to monitor (Campbell, 1979). Man- Attempts to deliberately improve data quality through redesign of data
agers are cautioned to prioritize the need for data quality improvement in generation and management processes such that the DIP examined in this
consideration of other operational processes. research, can help to reduce uncertainty surrounding data used to inform
decision-making (Jones-Farmer et al., 2014; Karr et al., 2006). Methods
Proposition 9. Over-emphasis on data quality metrics by firm leadership
that are integrated into a data-generation and monitoring process can
tend to accompany a related decrease in other organizational perfor-
equip the organization with the tools it needs to handle ever-increasing
mance metrics.
volumes of data. Organizations of all sizes, and particularly their deci-
This study found that initiating the DIP within the organization sion makers, can benefit from the reduced cognitive load realized by
required a substantial level of additional work. No additional manpower improving data generation and quality monitoring processes, and pre-
was provided, and every new function was added to existing workloads. venting information overload (Eppler and Mengis, 2004); thereby
Management was required to initialize and track new metrics related to improving data-grounded evidence-based decision-making.
DIP; maintenance supervisors were required to ensure completion; ana- The relationships uncovered in this research inform the development
lysts had to establish and then manage the program, and maintenance of Fig. 2, the primary model. Fig. 2 illustrates a conceptual model of
workers were required to spend additional time validating data. Many outcomes associated with data quality improvement initiatives as
individuals, especially maintenance workers and supervisors, were con- explained in Propositions 1 through 7 and Proposition 10 above. The
cerned about the additional workload and commented that if man-hours solid arrows indicate the primary effects. The single-dashed lines show
were used to improve data quality, then those were man-hours not being the secondary effects, and the double-dashed lines illustrate ter-
used to conduct other work and achieve existing operational re- tiary effects.
quirements. One maintenance supervisor remarked, “it's one more thing Additionally, Fig. 3 illustrates the proposed relationship between
to do. Our plates are full … if we add this to our plate, something else is management emphasis on data quality and the impact of this emphasis
going to need to come off.” Another mentioned, “the [DIP] is a good idea on the non-data quality, more general organizational outcome metrics as
… but who's going to do it? I don't have the people to do that.” described in Proposition 9. Organizational leaders must balance appro-
To date, the organization investigated in the study has yet to find the priate attention to organizational changes needed to improve the quality
optimal level of quality that should be achieved to properly balance of data used in analytics with a focus on preventing neglect with regards
program costs with benefits to prevent a diminishing returns effect to other organizational outcomes due to resource-drain. Finally, Fig. 4
(Cohen and Levinthal, 1990; Fichman, 2004a; Hollander, 1997). The illustrates the challenge for organizational leadership: as data quality
questions of “how many resources will be needed” and “what level of improves, the effort required to make successive improvements in-
quality is needed to achieve the decision-making benefits sought" are questions stakeholders within the firm will need to address as DIPs are planned and put into place. The improvement of data quality is an organization-wide event with end-to-end impacts for the firm, and with firm-wide increases in workload to handle the monitoring and improvement of data.

Proposition 10. Once major and impactful data quality issues in the firm have been resolved, successive improvements will require additional workload, effort, and resources, yielding diminishing returns for the costs involved. Managers will need to find ways to weigh these costs against anticipated benefits to achieve optimal ROI.

5. Summary

When considered through the information processing view of the firm (Galbraith, 1974; Premkumar et al., 2005), ever larger volumes of data that must be consumed by the individual decision-maker will inevitably lead to information overload (Eppler and Mengis, 2004). Eppler and Mengis (2004) state that information overload can be caused by, among other reasons, "the information itself (its quantity, frequency, intensity, and quality), … organizational design … and the information technology that is used" in the firm (p. 330). Ultimately, the true quality and value of […]creases, as explained in Proposition 10.

6. Implications of findings

6.1. Theoretical contributions

This study makes several important theoretical contributions. First, this is the first study to comprehensively assess direct outcomes of employing data quality improvement methods. Literature in this area is typically focused on data quality improvement and assumes that positive outcomes will naturally emerge, without any supporting evidence. Second, most of this literature is applied in nature and often contributes to mathematical theories related to quality improvement. By using an inductive, single case study approach and integrating the findings with extant management theory, this is one of the first studies in this area that contributes to the theoretical implications of data quality improvement. Third, the proposed framework and testable Propositions presented in this article lay the groundwork for a new area of research regarding the benefits and drawbacks of data quality improvement in support of business analytics.
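The data quality improvement methods discussed above are often operationalized with statistical process control; Jones-Farmer et al. (2014), cited in this article, apply control-chart methods to data quality monitoring. As a minimal sketch of that idea (not the case organization's implementation), a p-chart can track the proportion of erroneous records in a recurring data feed. The daily counts below are invented for illustration:

```python
# Hypothetical sketch: a p-chart on record-error rates flags days when
# data quality is "out of control" (cf. Jones-Farmer et al., 2014).
# The (records checked, records with errors) pairs below are invented.

import math

daily = [(500, 12), (480, 9), (520, 14), (510, 11), (495, 10),
         (505, 13), (490, 8), (515, 12), (500, 30), (485, 9)]

total_n = sum(n for n, _ in daily)
p_bar = sum(d for _, d in daily) / total_n  # overall error proportion

signals = []
for day, (n, d) in enumerate(daily, start=1):
    p = d / n
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    ucl = p_bar + 3 * sigma            # day-specific upper control limit
    if p > ucl:
        signals.append(day)            # data quality signal on this day

print("mean error rate:", round(p_bar, 4))
print("out-of-control days:", signals)  # day 9's spike should signal
```

Here only day 9's error rate (6%) exceeds the upper control limit, signaling a data quality problem worth investigating before those records feed the analytics cycle.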

B.T. Hazen et al. International Journal of Production Economics 193 (2017) 737–747

[Fig. 2. Conceptual Model of Data Quality Improvement Outcomes. The diagram links Implementing Data Quality Improvements, via Propositions P1–P8, to: Clarity/Precision of Data Requirements; Communication of Data Requirements; Stakeholder Perspective of Quality; Commitment to Quality; Positive Employee Behavior Changes; Understanding of Need for High Quality Data; Confidence in Data Quality; Confidence in Decision-Making; Business Analytics Tool Use; Data Errors (−); Data Quality Verification Costs (−); Decision Timeliness; Decision Quality; Data Quality of intra- and inter-organizational elements; and Data Analytics Cycle Quality Outcomes.]

[Fig. 3. Relationship between Leadership Emphasis on Data Quality (DQ) and the Impact on non-DQ Performance Metrics (Proposition 9). Axes: Leadership Emphasis on Data Quality metrics (low to high) vs. Impact on non-DQ Performance Metrics (low to high); X marks the point at which Leadership Emphasis maximizes non-Data Quality Performance Metrics.]

[Fig. 4. Relationship between Effort Required and Data Quality Improvement (Proposition 10). Axes: Data Quality Improvements (low to high) vs. Effort Required (low to high).]

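Fig. 4 and Proposition 10 describe diminishing returns: each successive data quality improvement costs more effort than the last while adding less benefit. A small, hypothetical sketch (all curves and numbers invented for illustration, not drawn from the case data) shows how a manager might locate the improvement level that maximizes net return:

```python
# Hypothetical illustration of Proposition 10: with a concave benefit curve
# and a convex cost curve, net return peaks well short of perfect quality.
# All functional forms and constants below are invented for illustration.

def benefit(q):
    """Concave benefit: early fixes to major issues are worth the most."""
    return 100 * (1 - (1 - q) ** 2)   # q = fraction of quality issues resolved

def cost(q):
    """Convex cost: each additional fix demands more workload and effort."""
    return 60 * q / (1 - q + 1e-9)    # cost grows steeply as q approaches 1

def net_roi(q):
    return benefit(q) - cost(q)

# Scan candidate improvement levels and pick the one with the best net return.
levels = [i / 100 for i in range(0, 96)]
best = max(levels, key=net_roi)
print(f"optimal improvement level ~ {best:.2f}, net return ~ {net_roi(best):.1f}")
```

Under these assumed curves the optimum sits near one-third of the feasible improvement range: beyond it, the marginal cost of further data quality work exceeds the marginal benefit, which is exactly the cost/benefit weighing Proposition 10 asks managers to perform.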
6.2. Managerial contributions

This research also provides relevant implications for managers, who can glean actionable information as to how they might improve their business analytics programs. First, managers can rest assured that active data quality improvement efforts can result in enhanced data quality and decision-making outcomes within their organization. Data improvement efforts also have the potential to positively affect decisions made by firms horizontally and vertically aligned with the focal organization, while improving trust in and commitment toward analytically focused behavior. Thus, managers may use data improvement efforts not only to change behaviors and improve decisions within their organization but also to drive these same attributes across their partner firms.

Second, data improvement efforts have shown a long-term reduction in data quality verification costs. Moreover, these efforts promote greater confidence in data quality and, therefore, in the analytic and decision outcomes from these data. Consequently, managers can use data improvement efforts to promote greater confidence in their efforts to increase analytic applications and evidence-based decision-making. In other words, data improvement efforts are a fundamental requirement for organizations looking to increase their use of analytics for better decision making.

Third, this research highlights potential drawbacks that managers should acknowledge regarding data improvement efforts. These drawbacks include the potential for biased emphasis regarding performance metrics and the uncertainty of optimally allocating resources to these efforts. Thus, organizational leaders must weigh the attention that data quality improvement deserves against preventing negative organizational outcomes that may accompany the effort.

7. Conclusions and future research

There are some limitations to this study that may inhibit the generalizability of the results. First, the case study was focused on one organization type and size, and one data quality improvement program. Future research can expand its focus across organizational types and DIP


sizes/implementations. This will ensure generalizability of results and further examination of contextual factors that impact the relationship between data quality improvement and outcomes from the decisions made upon analysis of high-quality data in the organization. A second limitation is that the study used an exploratory, inductive case study approach. Future studies can use these results as the basis to engage in both qualitative and quantitative confirmatory techniques to present a more well-rounded view of the relationship between data quality and organizational outcomes. Finally, we did not examine the analytic techniques that would use the improved-quality data. The quality and sophistication of analytic techniques used by the organization would both be impacted by the perception of the quality of the firm's data, and would also serve as a mediating factor between the data and the impact of the decisions made upon them. Future studies should take the analytic technique/decision-making process itself into account when investigating the nature of the data quality/analytics outcome relationship.

Given that organizations are beginning to work more deeply with their large sources of data, the quality of that data is becoming increasingly important. The business analytics cycle—collecting, storing, and retrieving the data, and then summarizing, analyzing, and communicating the results—critically relies on data that is of high quality. Lacking high-quality data, the business decisions made are detrimentally affected and increasingly ineffectual. Decisions based on outputs from analytic techniques cannot be of high quality unless both the data analyzed and the methods themselves are also of high quality. In this research, we have directly observed and assessed the outcomes generated for both data and decision quality after the implementation of data quality improvement efforts. Additionally, we synthesized the study findings with extant management literature and theory, showing the organizational outcomes that can be expected from data quality program implementations. The findings suggest that subjecting data to contemporary quality control measures will help focus organizational attention on data quality enhancement and on improving analytics processes, which invariably leads to better decision-making.

Also, the resultant proposed framework lays useful groundwork for the research community to further explore the relationships between data quality improvement and organizational outcomes. Such relationships have been assumed in the literature to be of a positive nature, and this study provides evidence to confirm these assertions. However, our analysis posits that these relationships are much more complex and contextualized in nature. The improvement program analyzed in this study can suggest a starting point for practitioners seeking a way to improve their organization's data quality. Additionally, the Propositions can help the practitioner gain top management and stakeholder support for data quality improvement initiatives, as outcomes and the organizational costs associated with them are composed.

References

Ajzen, I., 1991. The theory of planned behavior. Organ. Behav. Hum. Decis. Process. 50 (2), 179–211.
Ajzen, I., 2011. The theory of planned behaviour: reactions and reflections. Psychol. Health 26 (9), 1113–1127.
Akter, S., Wamba, S.F., Gunasekaran, A., Dubey, R., Childe, S.J., 2016. How to improve firm performance using big data analytics capability and business strategy alignment? Int. J. Prod. Econ. 182, 113–131.
Argyris, C., Schön, D.A., 1999. Organizational Learning II: Theory, Method, and Practice. Addison-Wesley Publishing Company, Reading, MA.
Ballou, D.P., Pazer, H.L., 1985. Modeling data and process quality in multi-input, multi-output information systems. Manag. Sci. 31 (2), 150–162.
Ballou, D.P., Tayi, G.K., 1999. Enhancing data quality in data warehouse environments. Commun. ACM 42 (1), 73–78.
Ballou, D.P., Wang, R., Pazer, H., Tayi, G.K., 1998. Modeling information manufacturing systems to determine information product quality. Manag. Sci. 44 (4), 462–484.
Bartlett, R., 2013. A Practitioner's Guide to Business Analytics. McGraw Hill, New York, NY.
Batini, C., Cappiello, C., Francalanci, C., Maurino, A., 2009. Methodologies for data quality assessment and improvement. Assoc. Comput. Mach. Comput. Surv. 41 (3), 1–52.
Benbasat, I., Goldstein, D.K., Mead, M., 1987. The case research strategy in studies of information systems. MIS Q. 369–386.
Berger, J.O., 1985. Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, New York, NY.
Bertalanffy, L.V., 1968. General Systems Theory: Foundations, Development, Applications. George Braziller, New York.
Bitterer, A., Newman, D., 2007. Organizing for Data Quality. Gartner Research, Stamford, CT.
Bose, R., 2009. Advanced analytics: opportunities and challenges. Ind. Manag. Data Syst. 109 (2), 155.
Campbell, D.T., 1979. Assessing the impact of planned social change. Eval. Program Plan. 2 (1), 67–90.
Castiaux, A., 2007. Radical innovation in established organizations: being a knowledge predator. J. Eng. Technol. Manag. 24 (1–2), 36–52.
Chen, H., Chiang, R.H., Storey, V.C., 2012. Business intelligence and analytics: from big data to big impact. MIS Q. 36 (4), 1165–1188.
Choo, A.S., 2011. Impact of a stretch strategy on knowledge creation in quality improvement projects. Eng. Manag. IEEE Trans. 58 (1), 87–96.
Cohen, W.M., Levinthal, D.A., 1990. Absorptive capacity: a new perspective on learning and innovation. Adm. Sci. Q. 35 (1), 128–152.
Cook, S.D.N., Yanow, D., 1993. Culture and organizational learning. J. Manag. Inq. 2 (4), 373–390.
Cooper, B.L., Watson, H.J., Wixom, B.H., Goodhue, D.L., 2000. Data warehousing supports corporate strategy at First American Corporation. MIS Q. 547–567.
Creswell, J., 2013. Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, fourth ed. Sage Publications, Inc, Thousand Oaks, California.
Crossan, M.M., Lane, H.W., White, R.E., 1999. An organizational learning framework: from intuition to institution. Acad. Manag. Rev. 24 (3), 522–537.
Curtis, B., Kellner, M.I., Over, J., 1992. Process modeling. Commun. ACM 35 (9), 75–90.
Dasu, T., Johnson, T., 2003. Exploratory Data Mining and Data Cleaning, vol. 479. John Wiley & Sons.
Davenport, T.H., 2006, Jan. Competing on analytics. Harv. Bus. Rev. 84, 98–107.
Davenport, T.H., Harris, J.G., 2007. Competing on Analytics: the New Science of Winning. Harvard Business School Publishing Corporation, Boston, MA.
Davis, F., 1989. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 13 (3), 319–340.
Davis, F.D., Bagozzi, R.P., Warshaw, P.R., 1989. User acceptance of computer technology: a comparison of two theoretical models. Manag. Sci. 35 (8), 982–1002.
DeLone, W., McLean, E., 1992. Information systems success: the quest for the dependent variable. Inf. Syst. Res. 3 (1), 60–95.
DeLone, W., McLean, E., 2003. The DeLone and McLean model of information systems success: a ten-year update. J. Manag. Inf. Syst. 19 (4), 9–30.
Drucker, P.F., 1991. The new productivity challenge (cover story). Harv. Bus. Rev. 69 (6), 69–79.
Dube, L., Pare, G., 2003. Rigor in information systems positivist case research: current practices, trends, and recommendations. MIS Q. 597–636.
Dutta, D., Bose, I., 2015. Managing a big data project: the case of Ramco Cements Limited. Int. J. Prod. Econ. 165, 293–306.
Dyson, R.G., Foster, M.J., 1982. The relationship of participation and effectiveness in strategic planning. Strat. Manag. J. 3 (1), 77.
Eppler, J.M., Helfert, M., 2004. A classification and analysis of data quality costs. In: Paper Presented at the International Conference on Information Quality, Cambridge, MA.
Eppler, J.M., Mengis, J., 2004. The concept of information overload: a review of literature from organization science, accounting, marketing, MIS, and related disciplines. Inf. Soc. 20 (5), 325–344.
Even, A., Shankaranarayanan, G., Berger, P.D., 2010a. Evaluating a model for cost-effective data quality management in a real-world CRM setting. Decis. Support Syst. 50 (1), 152–163.
Even, A., Shankaranarayanan, G., Berger, P.D., 2010b. Managing the quality of marketing data: cost/benefit tradeoffs and optimal configuration. J. Interact. Mark. 24 (3), 209–221.
Ezell, J.D., Hazen, B.T., Hall, D.J., Jones-Farmer, L.A., 2016. Enhancing data and decision quality with statistical process control. In: Warkentin, M. (Ed.), The Best Thinking in Business Analytics from the Decision Sciences Institute. Pearson FT Press, Upper Saddle River, NJ, pp. 17–33.
Fenton, N.E., Pfleeger, S.L., 1998. Software Metrics: a Rigorous and Practical Approach, second ed. PWS Publishing Co, Boston, MA.
Fichman, R.G., 2004a. Going beyond the dominant paradigm for information technology innovation research: emerging concepts and methods. J. Assoc. Inf. Syst. 5 (8), 314–355.
Fichman, R.G., 2004b. Real options and IT platform adoption: implications for theory and practice. Inf. Syst. Res. 15 (2), 132–154. http://dx.doi.org/10.1287/isre.1040.0021.
Fishbein, M., Ajzen, I., 1975. Belief, Attitude, Intention and Behavior: an Introduction to Theory and Research. Addison-Wesley, Reading, MA.
Fisher, C.W., Chengalur-Smith, I.S., 2003. Impact of experience and time on the use of data quality information in decision making. Inf. Syst. Res. 14 (2), 170–188.
Fisher, C.W., Kingma, B.R., 2001. Criticality of data quality as exemplified in two disasters. Inf. Manag. 39 (2), 109–116.
Forgionne, G.A., 1999. An AHP model of DSS effectiveness. Eur. J. Inf. Syst. 8 (2), 95–106.
Frank, A.U., 2008. Analysis of dependence of decision quality on data quality. J. Geogr. Syst. 10 (1), 71–88.
Galbraith, J.R., 1974. Organization design: an information processing view. Interfaces 4 (3), 28–36.
Goodhart, C., 1975a. Monetary Relationships: a View from Threadneedle Street. Papers in Monetary Economics, vol. I. Reserve Bank of Australia.


Goodhart, C., 1975b. Problems of Monetary Management: the UK Experience. Papers in Monetary Economics, vol. I. Reserve Bank of Australia.
Grant, R.M., 1996a. Prospering in dynamically-competitive environments: organizational capability as knowledge integration. Organ. Sci. 7 (4), 375–387.
Grant, R.M., 1996b. Toward a knowledge-based theory of the firm. Strat. Manag. J. 17 (S2), 109–122.
Haug, A., Arlbjørn, J.S., 2011. Barriers to master data quality. J. Enterp. Inf. Manag. 24 (3), 288–303.
Haug, A., Zachariassen, F., Van Liempd, D., 2011. The costs of poor data quality. J. Ind. Eng. Manag. 4 (2), 168–193.
Hazen, B.T., Boone, C.A., Ezell, J.D., Jones-Farmer, L.A., 2014. Data quality for data science, predictive analytics, and big data in supply chain management: an introduction to the problem and suggestions for research and applications. Int. J. Prod. Econ. 154, 72–80.
Hey, T., 2010, Nov. The big idea: the next scientific revolution. Harv. Bus. Rev. 88, 56–63.
Ho, L.L., Quinino, R.C., 2013. An attribute control chart for monitoring the variability of a process. Int. J. Prod. Econ. 145 (1), 263–267.
Hollander, S., 1997. The Economics of Thomas Robert Malthus, vol. 4. University of Toronto Press.
Huang, K., Lee, Y., Wang, R.Y., 1999. Quality Information and Knowledge. Prentice Hall, Saddle River, NJ.
Hwang, M.I., Schmidt, F.L., 2011. Assessing moderating effect in meta-analysis: a re-analysis of top management support studies and suggestions for researchers. Eur. J. Inf. Syst. 20 (6), 693–702.
Isson, J.P., Harriott, J., 2013. Win with Advanced Business Analytics: Creating Business Value from Your Data. John Wiley & Sons, Hoboken, New Jersey.
Jones-Farmer, L.A., Ezell, J.D., Hazen, B.T., 2014. Applying control chart methods to enhance data quality. Technometrics 56 (1), 29–41.
Kahn, B.K., Strong, D.M., Wang, R.Y., 2002. Information quality benchmarks: product and service performance. Commun. Assoc. Comput. Mach. 45 (4), 184–192.
Karr, A.F., Sanil, A.P., Banks, D.L., 2006. Data quality: a statistical perspective. Stat. Methodol. 3 (2), 137–173. http://dx.doi.org/10.1016/j.stamet.2005.08.005.
Kim, H.-W., Gupta, S., 2014. A user empowerment approach to information systems infusion. Eng. Manag. IEEE Trans. 61 (4), 656–668.
Krippendorff, K., 2004. Content Analysis: an Introduction to its Methodology, second ed. Sage Publications, London.
LaValle, S., Lesser, E., Shockley, R., Hopkins, M.S., Kruschwitz, N., 2010, Winter. Big data, analytics, and the path from insights to value. MIT Sloan Manag. Rev. 52, 21–31.
Lenz, H.J., Borowski, E., 2010, Aug. Business data quality control - a step by step procedure. In: Paper Presented at the 10th International Workshop on Intelligent Statistical Process Control, Seattle, WA.
Liberatore, M.J., Pollack-Johnson, B., 2013. Improving project management decision making by modeling quality, time, and cost continuously. Eng. Manag. IEEE Trans. 60 (3), 518–528.
Lin, H.-E., McDonough III, E.F., 2011. Investigating the role of leadership and organizational culture in fostering innovation ambidexterity. Eng. Manag. IEEE Trans. 58 (3), 497–509.
Maier, A.M., Moultrie, J., Clarkson, P.J., 2012. Assessing organizational capabilities: reviewing and guiding the development of maturity grids. Eng. Manag. IEEE Trans. 59 (1), 138–159.
Marsh, R., 2005. Drowning in dirty data? It's time to sink or swim: a four-stage methodology for total data quality management. J. Database Mark. Cust. Strategy Manag. 12 (2), 105–112.
Mayer-Schönberger, V., Cukier, K., 2013. Big Data: a Revolution that Will Transform How We Live, Work, and Think. Houghton Mifflin Harcourt, Boston.
McAfee, A., Brynjolfsson, E., 2012. Big data: the management revolution. Harv. Bus. Rev. 90 (10), 60–69.
McIntosh, P., 2010. Action Research and Reflective Practice: Creative and Visual Methods to Facilitate Reflection and Learning. Routledge, Oxon.
Merriam, S.B., 1998. Qualitative Research and Case Study Applications in Education: Revised and Expanded from "Case Study Research in Education". ERIC.
Merriam, S.B., Tisdell, E.J., 2016. Qualitative Research: a Guide to Design and Implementation, fourth ed. Jossey-Bass.
Mitra, A., 2008. Fundamentals of Quality Control and Improvement, third ed. John Wiley & Sons, Hoboken, NJ.
Ou, Y., Wu, Z., Tsung, F., 2012. A comparison study of effectiveness and robustness of control charts for monitoring process mean. Int. J. Prod. Econ. 135 (1), 479–490.
Patton, M.Q., 2015. Qualitative Evaluation and Research Methods, fourth ed. SAGE Publications, Inc, Thousand Oaks, California, USA.
Pavlou, P.A., El Sawy, O.A., 2006. From IT leveraging competence to competitive advantage in turbulent environments: the case of new product development. Inf. Syst. Res. 17 (3), 198–227.
Pierchala, C.E., Surti, J., Peytcheva, E., Groves, R.M., Kreuter, F., Kohler, U., Young, J., 2009. Control charts as a tool for data quality control. J. Off. Stat. 25 (2), 167–191.
Pipino, L.L., Lee, Y.W., Wang, R.Y., 2002. Data quality assessment. Commun. Assoc. Comput. Mach. 45 (4), 211–218.
Port, D., Bui, T., 2009. Simulating mixed agile and plan-based requirements prioritization strategies: proof of concept and practical implications. Eur. J. Inf. Syst. 18 (4), 317–331.
Porter, L.J., Rayner, P., 1992. Quality costing for total quality management. Int. J. Prod. Econ. 27 (1), 69–81.
Powell, P., Woerndl, M., 2008. Time to stop researching the important things. Eur. J. Inf. Syst. 17 (2), 174–178.
Premkumar, G., Ramamurthy, K., Saunders, C.S., 2005. Information processing view of organizations: an exploratory examination of fit in the context of interorganizational relationships. J. Manag. Inf. Syst. 22 (1), 257–294.
Price, R., Shanks, G., 2011. The impact of data quality tags on decision-making outcomes and process. J. Assoc. Inf. Syst. 12 (4).
Raghunathan, S., 1999. Impact of information quality and decision-maker quality on decision quality: a theoretical model and simulation analysis. Decis. Support Syst. 26 (4), 275–286.
Redman, T.C., 1992. Data Quality: Management and Technology. Bantam Books, New York, NY.
Redman, T.C., 1996. Data Quality for the Information Age. Artech House Publishers, Norwood, MA.
Redman, T.C., 2001. Data Quality: the Field Guide. Digital Press, Boston, MA.
Redman, T.C., 2008. To solve this data-driven crises, we need better data. Retrieved from http://blogs.hbr.org/cs/2008/09/we_need_better_data_to_solve_t.html.
Roberts, N., Galluch, P.S., Dinger, M., Grover, V., 2012. Absorptive capacity and information systems research: review, synthesis, and directions for future research. MIS Q. 36 (2), 625–648.
Roberts, S.W., 1959. Control chart tests based on geometric moving averages. Technometrics 1 (3), 239–250.
Rungtusanatham, M., 2001. Beyond improved quality: the motivational effects of statistical process control. J. Oper. Manag. 19, 653–673.
Shadish, W., Cook, T., Campbell, D., 2002. Experimental and Quasi-experimental Designs for Generalized Causal Inference. Houghton Mifflin Co., Boston, MA.
Shin, B., 2003. An exploratory investigation of system success factors in data warehousing. J. Assoc. Inf. Syst. 4 (1), 141–170.
Singh, R., Keil, M., Kasi, V., 2009. Identifying and overcoming the challenges of implementing a project management office. Eur. J. Inf. Syst. 18 (5), 409–427.
Sparks, R., OkuGami, C., 2010a. Data quality: algorithms for automatic detection of unusual measurements. In: Lenz, H.-J., Schmid, W., Wilrich, P.-T. (Eds.), Frontiers in Statistical Quality Control, vol. 10. Springer, Heidelberg.
Sparks, R., OkuGami, C., 2010b, Aug. Data quality: algorithms for automatic detection of unusual measurements. In: Paper Presented at the 10th International Workshop on Intelligent Statistical Process Control, Seattle, WA.
Tan, K.H., Zhan, Y., Ji, G., Ye, F., Chang, C., 2015. Harvesting big data to enhance supply chain innovation capabilities: an analytic infrastructure based on deduction graph. Int. J. Prod. Econ. 165, 223–233.
Venkatesh, V., Bala, H., 2008. Technology acceptance model 3 and a research agenda on interventions. Decis. Sci. 39 (2), 273–315.
Venkatesh, V., Thong, J.Y.L., Xu, X., 2012. Consumer acceptance and use of information technology: extending the unified theory of acceptance and use of technology. MIS Q. 36 (1), 157–178.
Voss, C., Tsikriktsis, N., Frohlich, M., 2002. Case research in operations management. Int. J. Oper. Prod. Manag. 22 (2), 195–219.
Wamba, S.F., Akter, S., Edwards, A., Chopin, G., Gnanzou, D., 2015. How 'big data' can make big impact: findings from a systematic review and a longitudinal case study. Int. J. Prod. Econ. 165, 234–246.
Wang, G., Gunasekaran, A., Ngai, E.W., Papadopoulos, T., 2016. Big data analytics in logistics and supply chain management: certain investigations for research and applications. Int. J. Prod. Econ. 176, 98–110.
Wang, R.Y., Kon, H.B., 1993. Towards total data quality management (TDQM). In: Wang, R.Y. (Ed.), Information Technology in Action: Trends and Perspectives. Prentice-Hall, Englewood Cliffs, NJ.
Wang, R.Y., Strong, D.M., 1996. Beyond accuracy: what data quality means to data consumers. J. Manag. Inf. Syst. 12 (4), 5–33.
Warth, J., Kaiser, G., Kugler, M., 2011. The impact of data quality and analytical capabilities on planning performance: insights from the automotive industry. In: Paper Presented at the 10th International Conference on Wirtschaftsinformatik, Zurich, Switzerland.
Weigel, F.K., Hazen, B.T., 2014. Technical proficiency for IS success. Comput. Hum. Behav. 31, 27–36.
Weigel, F.K., Landrum, W.H., Hall, D.J., 2009. Human-technology adaptation fit theory for healthcare. In: Paper Presented at the Twelfth Annual Conference of the Southern Association for Information Systems (SAIS), Charleston, SC.
Weller, E.F., 2000. Practical applications of statistical process control [in software development projects]. IEEE Softw. 17 (3), 48–55.
Wickham, H., 2014. Tidy data. J. Stat. Softw. 59 (10), 1–23.
Wigfield, A., Eccles, J.S., 2000. Expectancy-value theory of achievement motivation. Contemp. Educ. Psychol. 25, 68–81.
Wixom, B.H., Watson, H.J., 2001. An empirical investigation of the factors affecting data warehousing success. MIS Q. 25 (1), 17–41.
Wu, Z., Jiao, J., He, Z., 2009. A single control chart for monitoring the frequency and magnitude of an event. Int. J. Prod. Econ. 119 (1), 24–33.
Yin, R.K., 2014. Case Study Research: Design and Methods. Sage Publications.
