Practical Formulas For Estimating The Value of Quality Data
Our clients report that up to 50% of a team's time is spent reconciling poor quality data. The need for
reconciliation is generally driven by two factors:
1. The quality of data is not sufficient for the purpose intended. For example, the fields are not complete;
the company names are not standardized, leading to duplicate records; the information is not timely; etc. The
process to "fix" the data is generally simple, albeit time-consuming, when we deal with a single system: determine
what high quality data means and change the data in the source system. Reactively "fixing" data over and over
is a huge drain on productivity.
2. The same data is pulled from two or more systems and the values don't match. In many companies, the same
data is replicated in literally dozens of applications. The complexity increases tenfold when systems are not
consolidated (or are poorly consolidated, often rendering the data unusable for executive-level decisions). In
this scenario, the time spent reconciling data is similar to the above example, but the effort to "fix" the data is
much more daunting since it involves multiple systems. This process is always painful. Reconciliation
costs, development time, maintenance costs, and technical resources could be dramatically reduced if all
systems leveraged a single source of truth. Although getting to just one data source may not be realistic,
organizations can achieve value by consolidating shared business data (e.g. customer, product, account).
Additionally, subsequent hidden costs can be eliminated by virtue of eliminating the manual data aggregation
that occurs for point-in-time data needs.
http://iaidq.org/publications/doc2/oneal-2012-03.shtml 28/03/2012
The financial impact of manual data reconciliation is also simple, yet time-consuming to calculate. The
productivity cost is merely the number of hours spent on the reconciliation and “fixing” the data, multiplied by
the fully loaded cost of the employee. Generally this adds up to a significant cost over time, which means
investments in technology solutions yield a positive return on investment.
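As a quick sketch, the calculation described above can be expressed in a few lines of Python; the hour counts and fully loaded rate below are illustrative assumptions, not figures from the article:

```python
def reconciliation_cost(hours_per_month: float, fte_hourly_cost: float,
                        months: int = 12) -> float:
    """Productivity cost of manual reconciliation: hours spent
    reconciling and "fixing" data, multiplied by the fully loaded
    cost of the employee, annualized by default."""
    return hours_per_month * fte_hourly_cost * months

# Example: one analyst spending 40 hours/month at a fully loaded $90/hour.
print(f"${reconciliation_cost(40, 90):,.0f}")  # -> $43,200
```

Summing this figure over every employee involved gives the team-level cost the article describes.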
The process to access and retrieve data becomes time-consuming when the best data source is unknown, when
it's unclear who can authorize the use of data from a specific source, and when the complete content (definitions
and derivations) of the data is not well understood. Up to 20% of the time spent to create a new report can easily
be spent locating and retrieving the best data.
Let's think about the above scenario from a data access point of view. If the time and cost of
locating and accessing data from a single source increase exponentially when multiple sources are involved, we
can expect a significant drain on productivity.
Although time-consuming to calculate, the same formula applies.
The time spent locating and accessing data can be drastically reduced, and resources can be better
spent on more value-added activities such as research and analysis leading to new market development.
The above formulas should therefore include the opportunity cost of an employee spending time on low value
activities:
$Impact = [E1(Hours x FTE cost) + E2(Hours x FTE cost) + ... + En(Hours x FTE cost)] + opportunity cost
of E1...En
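The formula maps directly onto a short Python sketch; every number below (hours, rates, opportunity cost) is a made-up illustration, not a figure from the article:

```python
def impact(employees, opportunity_cost: float) -> float:
    """$Impact = sum of (Hours x FTE cost) over employees E1..En,
    plus the opportunity cost of their time."""
    direct = sum(hours * fte_cost for hours, fte_cost in employees)
    return direct + opportunity_cost

# Three employees as (hours, fully loaded hourly cost) pairs, plus an
# assumed value for the analysis work they could have done instead.
e = [(120, 85), (80, 95), (40, 110)]
print(f"${impact(e, opportunity_cost=25_000):,.0f}")  # -> $47,200
```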
One could also argue that there is an increase in software and hardware costs due to poor quality. If these costs
were included, rather than calculating monies already spent (sunk cost), the formula would need to include the
incremental increase in tools, technology and support to cleanse, standardize, validate, master and manage the
data.
3. Project Delays
Aside from impeding day-to-day operations, poor data quality, inconsistent data, and inaccessible data can also
cause project delivery delays. What are the costs for project delays arising from bad data?
Projects can be impacted significantly by data issues – cost and duration can exceed three times the initial
estimates. At one client, a project that was scoped at one year had, two and a half years later, taken only
phase 1 (of 3) live, with more than half of the budget spent ($3M) and only 50-60% of the
planned functionality delivered. More than half of this overage was directly attributed to data quality issues.
Project delays can also increase the amount of money spent on consulting and implementation services and
software/hardware support, especially if these services began prior to the production date. A project that
exceeds the original budget can also limit a company’s ability to execute on other projects by tying up funding.
To calculate the financial impact of poor quality data on a project, the formula becomes more complex:
$Impact = [E1(Hours x FTE cost) + E2(Hours x FTE cost) + ... + En(Hours x FTE cost)] + D(consulting cost)
+ D(support cost) + opportunity cost of delay
E = Employees involved in reconciliation, access and remediation; D = delay-driven increase in the corresponding cost
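The project-delay variant can be sketched the same way, under the same caveat that all inputs below are assumed for illustration (the D terms are read here as the delay-driven increases in consulting and support spend):

```python
def project_impact(employees, d_consulting: float, d_support: float,
                   opportunity_cost: float) -> float:
    """Labor spent on reconciliation/remediation, plus delay-driven
    increases in consulting and support costs, plus the opportunity
    cost of the delay itself."""
    labor = sum(hours * fte_cost for hours, fte_cost in employees)
    return labor + d_consulting + d_support + opportunity_cost

# One employee at 100 hours x $80/hour, with assumed delay costs.
print(f"${project_impact([(100, 80)], 50_000, 10_000, 20_000):,.0f}")  # -> $88,000
```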
Step 2: Determine how this value would be impacted based on having good (or bad) access to information.
For example, how would an independent third party (such as a potential buyer) value the organization if the
company were offered without historical information about its products, customers, staff, and risk?
MIKE2.0 assumes a starting point of 20%, but indicates that information could account for as much as 50% of the
value of a company. For a publicly traded company with a total market value of $50 billion, the
value of information and knowledge can be nominally set at $10 billion (20% of the market cap).
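The MIKE2.0 arithmetic is a simple share of market capitalization; as a minimal sketch:

```python
def information_value(market_cap: float, share: float = 0.20) -> float:
    """Nominal value of information as a share of company value;
    MIKE2.0 suggests 20% as a floor and up to 50%."""
    return market_cap * share

print(information_value(50e9) / 1e9)        # -> 10.0 (in $ billions)
print(information_value(50e9, 0.50) / 1e9)  # -> 25.0
```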
To date, the authors of this formula have yet to find any credible opposition to such a low estimate, yet the
majority of participants who attend their workshops are surprised by the implications.
Conclusion
Data has an intrinsic value to the organization that is seldom fully recognized. Business research and analysis
require accurate, clean, well-understood data. Without this foundation, analytics efforts become bogged down in
data reconciliation rather than generating insight, and improved, data-driven decision making is impeded.
Although more difficult to quantify, high quality data enables innovation and therefore fosters competitive
advantage. Product launches and evolution proceed faster and can be integrated more easily when there are
effective processes and tools to manage the data. Ultimately the largest indirect benefit to the organization is the
reduction in reputational risk.
Kelle O’Neal founded First San Francisco Partners in 2007 following her early work
with software and systems integration providers who were key to the development and
provisioning of customer data integration and master data management (MDM)
solutions. Her prior experience includes senior management roles with Siperian (as
General Manager, EMEA), GoldenGate Software, Oracle, and Siebel Systems in the U.S.,
Europe and Asia.
Kelle has an unprecedented ability to work through organizational complexity, build
consensus, and drive results at senior levels. Her background in CRM, enterprise
software, and systems integration enables her to provide expert counsel in the
Enterprise Information Management market to the most complex, global organizations.
Kelle holds an MBA from the University of Chicago Booth School of Business and
earned a B.A. from Duke University. She can be reached at kelle [at] firstsanfranciscopartners [dot] com
and on Twitter at @FirstSanFranMDM
© International Association for Information and Data Quality | Page updated: Tuesday, 13 March 2012