ARC VIEW

AUGUST 23, 2018


Operational Historians and Time-Series Data
Platforms for Digital Transformation

By Janice Abel

Keywords

Operational Historian, Real-time Database, Time Series Database, Open Source Database, Open Source Applications

Summary

The fourth industrial revolution, or Industrie 4.0, has led to exponential changes in industrial operations and manufacturing. Digital technologies and sensor-based data are fueling everything from advanced analytics and machine learning to augmented and virtual reality models.

ARC is seeing new time-series databases and new deployment models. How will these databases affect the operational historian markets? How are traditional historian suppliers responding to changing market conditions?

Sensor-based data is not easily handled by traditional relational databases. As a result, time series databases are on the rise and, according to ARC Advisory Group research, this market is expected to grow at over 6 percent per year. These databases specialize in collecting, contextualizing, and making sensor-based data available.

In general, two classes of time-series databases have emerged: well-established operational data infrastructures (or operational historians) such as OSIsoft’s PI System; and newer open source time-series databases, such as InfluxDB or Graphite.

What’s the Difference?


Functionally, at a high level, both classes perform the same task of capturing and serving up machine and operational data. The differences revolve around the types of data handled, features, capabilities, and relative ease of use.


Established Data Infrastructure


As the industrial world’s version of commercial off-the-shelf (COTS) software, most established data infrastructure solutions can be integrated into operations relatively quickly. OSIsoft, for example, says its PI System can sometimes be deployed in a week or less and that it can take advantage of a broad ecosystem of more than 450 data connectors, third-party analytics, visualization tools, and other technologies. (According to OSIsoft, Aurelia Metals was able to obtain a complete return on investment from its PI System in just 12 days.) While most customers have installed the PI System as on-premise software acquired under a standard perpetual license, ARC anticipates seeing more customers subscribe to OSIsoft technology through cloud services. In some cases, that “cloud” could be a bank of servers sitting in a location owned by a third-party service provider.

In general, established historian platforms such as the PI System are designed to make it easier to access, store, and share real-time operational data securely within a company or across an ecosystem. While, in the past, industrial data was primarily consumed by engineers and maintenance crews, increasingly, that data will be used by financial departments, insurance companies, downstream and upstream suppliers, equipment providers selling add-on monitoring services, and others. While the associated security mechanisms were already relatively sophisticated, they are evolving to become even more secure.

Another major strength of established operational data infrastructures, such as the OSIsoft PI System, is that they were purpose-built and have evolved to efficiently store and manage time-series data from industrial operations. As a result, they are better equipped to optimize production, reduce energy consumption, implement predictive maintenance strategies to prevent unscheduled downtime, and enhance safety. The shift from using the term “data historian” to “data infrastructure” is intended to convey the value of compatibility and ease of use.

New Open Source Products


In contrast, flexibility and a lower upfront purchase cost are the strong suits of the newer open source products. Not surprisingly, these newer tools are initially being adopted by financial companies (which often have sophisticated in-house development teams) or for specific projects where scalability, ease of use, and the ability to handle real-time data are not as critical. Since these new systems are somewhat less proven in terms of performance, security, and applications, users are likely to experiment with them for tasks in which safety, lost production, or quality are less critical.

While some of the newer open source time series databases are starting to build the kind of data management capabilities already typically available in a mature operational historian, they are not likely to replace operational data infrastructures in the foreseeable future. Industrial organizations should use caution before leaping into newer open source technologies. They should carefully evaluate the potential consequences in terms of development time for applications, security, costs to maintain and update, and their ability to align, integrate, or co-exist with other technologies. It is also important to understand the operational processes, domain expertise, and applications that are already built into an established operational data infrastructure.

Convergence and Harmony


Rather than compete head on, it’s likely that the established historian/data infrastructures and open source time-series databases will co-exist in the coming years. OSIsoft, for instance, is collaborating with open source companies to develop edge technologies to make it easier to link more devices directly to the PI System as well as have greater local compute and analytic power.

As the open source time series database companies progressively add distinguishing features to their products over time, it will be interesting to observe whether they lose some of their open source characteristics. To a certain extent, we saw this dynamic play out in the Linux world.

The Real Database Battle

The most important difference between relational databases, time-series databases, data lakes, and other data sources is the ability to handle time-stamped process data and ensure data integrity.

While relational databases are designed to structure data in rows and columns, a time-series database or infrastructure aligns sensor data with time as the primary index. This is relevant because the primary job of the data management technology is to:

• Accurately capture a broad array of data streams


• Deal with very fast process data
• Align time stamps
• Ensure the quality and integrity of the data
• Ensure cybersecurity
• Serve up these data streams in a coherent, contextualized way for operational personnel
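
To make the contrast between row-and-column storage and a time-indexed store concrete, the minimal Python sketch below keeps readings sorted by time stamp as the primary index and answers a time-range query with a binary search. The TimeSeriesStore class, the tag, and the sample values are illustrative assumptions and are not modeled on any particular product's internals.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

class TimeSeriesStore:
    """Toy store that uses the time stamp as the primary index."""

    def __init__(self):
        self._times = []   # sorted time stamps (the index)
        self._values = []  # sensor values, parallel to _times

    def append(self, ts: datetime, value: float) -> None:
        # Historian-style ingest: data usually arrives in time order,
        # so appending keeps the index sorted.
        if self._times and ts < self._times[-1]:
            raise ValueError("out-of-order sample; late data needs special handling")
        self._times.append(ts)
        self._values.append(value)

    def query(self, start: datetime, end: datetime):
        # Range scan over the time index: a binary search locates the window.
        lo = bisect_left(self._times, start)
        hi = bisect_right(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))

# Usage: one hypothetical temperature tag sampled every second for an hour.
store = TimeSeriesStore()
t0 = datetime(2018, 8, 23, 8, 0, 0)
for i in range(3600):
    store.append(t0 + timedelta(seconds=i), 72.0 + 0.01 * i)

window = store.query(t0 + timedelta(minutes=10), t0 + timedelta(minutes=11))
print(len(window), window[0])
```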

To gain maximum value from sensor data generated by operational machines, the data must be handled relative to its chronology or time stamp. Because the time stamp may reflect either the time when the sensor made the measurement or the time when the measurement was stored in the historian (depending upon the data source), it is important to distinguish between the two.
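
One simple way to preserve that distinction is to carry both time stamps on every sample, as in the hedged sketch below. The field names and the tag are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Sample:
    tag: str               # hypothetical tag name, e.g. "FIC-101.PV"
    value: float
    measured_at: datetime  # when the sensor made the measurement
    stored_at: datetime    # when the historian/database persisted it

def ingest(tag: str, value: float, measured_at: datetime) -> Sample:
    # The storage time is assigned on arrival; the measurement time travels with
    # the data from its source. Any analysis should state which of the two it uses.
    return Sample(tag, value, measured_at, stored_at=datetime.now(timezone.utc))

reading = ingest("FIC-101.PV", 42.7,
                 measured_at=datetime.now(timezone.utc) - timedelta(seconds=2))
lag = (reading.stored_at - reading.measured_at).total_seconds()
print(f"{reading.tag}: value={reading.value}, measurement-to-storage lag={lag:.1f}s")
```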

Time series data technologies - whether open source databases or established historians - are built for real-time data. Relational databases, in contrast, are built to highlight relationships, including the metadata attached to the measurement (alarm limits, control limits, customer spend, bounce rate, geographic distribution between different data points, etc.). Relational technologies can be applied to time series data, but this requires substantial amounts of data preparation and cleaning and can make data quality, governance, and contextualization difficult at scale.

Data lakes, meanwhile, score well on scalability and cost per GB, but poorly on data access and usability. Not surprisingly, while data lakes hold the largest volumes of data, they have the fewest users. As with time series technologies, the market will decide which of these different technologies get used, and how. But this will take time.

Why Use an Operational Data Infrastructure?

ARC believes that modern operational historians and data infrastructures, such as the OSIsoft PI System, will be key enablers for the digital transformation of industry. Industrial organizations should give serious consideration to investing in modern operational historians and data platforms designed for industrial processes. Ten things to consider when selecting a data infrastructure for operations:


• Data quality - The ability to ingest, cleanse, and validate data. For example, are you really obtaining a representative average? If someone calibrates a sensor, will the average include the calibration data? If an operator or maintenance worker puts a controller in manual, has a failed instrument, or is overriding alarms, does the historian or database still record the data? Will the average include the manual calibration setpoint? (A minimal filtering sketch follows this list.)

• Contextualized data - Asset and process models are built on years of experience integrating, storing, and accessing industrial process data and its metadata, so it is important to be able to contextualize data easily. A key attribute is the ability to combine different data types and different data sources. Can the historian combine data from spreadsheets and different databases or data sources, precisely synchronize time stamps, and make sense of it all? (See the time-alignment sketch after this list.)

• High-frequency/high-volume data - It’s also important to be able to manage high-frequency, high-volume data based on the process requirements, and expand and scale as needed. Increasingly, this includes edge and cloud capabilities.

• Real-time accessibility - Data must be accessible in real time so the information can be used immediately to run the process better or to prevent abnormal behavior. This alone can bring enormous insights and value to organizations.

• Data compression - Deep compression based on specialized algorithms that compress the data but still enable users to reproduce a trend if needed. (A simple compression sketch follows this list.)

• Sequence of events - SOE capability enables users to reproduce precisely what happened in operations or in a production process.

• Statistical analytics - Built-in analytics capabilities, from statistical spreadsheet-like calculations to more complex regression analysis. Additionally, time series systems should be able to stream data to third-party applications for advanced analytics, machine learning (ML), or artificial intelligence (AI). (See the rolling-statistics sketch after this list.)

• Visualization - The ability to easily design and customize digital dashboards with situational awareness that enable workers to easily visualize and understand what is going on.


• Connectability - Ability to connect to data sources from operational and plant equipment, instruments, etc. While often time-consuming to build, special connectors can help. OPC is a good standard, but may not work for all applications.

• Time stamp synchronization - Ability to synchronize time stamps based on the time the instrument is read, wherever the data is stored - on-premise, in the cloud, etc. These time stamps align with the data and metadata associated with the application.

• Partner ecosphere – A strong partner ecosystem can make it easy to layer purpose-built vertical applications onto the infrastructure for added value.
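
A few of the considerations above lend themselves to short illustrations. For data quality, the sketch below (plain Python, with hypothetical quality flags) computes an average only over samples marked as normal process data, so calibration runs, manual overrides, and failed instruments do not distort the result; real historians use their own quality and status codes.

```python
from statistics import mean

# Hypothetical quality flags; real systems carry their own status codes.
GOOD = "good"
CALIBRATION = "calibration"
MANUAL = "manual"
BAD = "bad_instrument"

samples = [
    (1001.0, GOOD), (1002.3, GOOD), (4.0, CALIBRATION),  # calibration injects a bogus value
    (998.7, GOOD), (1200.0, MANUAL), (0.0, BAD), (1003.1, GOOD),
]

naive_avg = mean(v for v, _ in samples)
clean_avg = mean(v for v, q in samples if q == GOOD)

print(f"naive average:    {naive_avg:.1f}")  # distorted by calibration and bad data
print(f"filtered average: {clean_avg:.1f}")  # only normal process data
```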
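
For contextualization and time-stamp synchronization, the following sketch aligns lab results recorded in local plant time with process samples stored in UTC by normalizing the time zones and pairing each lab entry with the nearest process sample. The sources, tags, and nearest-match rule are illustrative assumptions, not any vendor's merge logic.

```python
from bisect import bisect_left
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(ts: datetime) -> datetime:
    # Normalize zone-aware local time stamps to UTC before aligning sources.
    return ts.astimezone(timezone.utc)

# Hypothetical process data (already UTC) and lab results (local plant time).
process = [(datetime(2018, 8, 23, 12, 0, s, tzinfo=timezone.utc), 50.0 + s)
           for s in range(0, 60, 5)]
lab = [
    (datetime(2018, 8, 23, 8, 0, 12, tzinfo=ZoneInfo("America/New_York")), 0.93),
    (datetime(2018, 8, 23, 8, 0, 41, tzinfo=ZoneInfo("America/New_York")), 0.95),
]

proc_times = [t for t, _ in process]

def nearest_process_sample(ts_utc: datetime) -> int:
    # Find the index of the process sample closest in time to ts_utc.
    i = bisect_left(proc_times, ts_utc)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(process)]
    return min(candidates, key=lambda j: abs(proc_times[j] - ts_utc))

for ts, purity in lab:
    j = nearest_process_sample(to_utc(ts))
    print(f"lab {purity:.2f} at {to_utc(ts):%H:%M:%S}Z -> "
          f"process {process[j][1]:.1f} at {proc_times[j]:%H:%M:%S}Z")
```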
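
For data compression, the simplest possible scheme is a deadband filter that stores a sample only when it moves more than a tolerance away from the last stored value, with linear interpolation used to reproduce the trend. Production historians use more sophisticated algorithms (swinging-door variants, for example); the sketch below only illustrates the idea.

```python
def deadband_compress(samples, tolerance):
    """Store a sample only when it moves more than `tolerance` from the last stored value."""
    stored = [samples[0]]              # always keep the first point
    for t, v in samples[1:]:
        if abs(v - stored[-1][1]) > tolerance:
            stored.append((t, v))
    if stored[-1] != samples[-1]:
        stored.append(samples[-1])     # always keep the last point
    return stored

def interpolate(stored, t):
    """Reproduce the trend at time t by linear interpolation between stored points."""
    for (t0, v0), (t1, v1) in zip(stored, stored[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the stored range")

# A slowly ramping signal: most raw samples can be dropped and the trend
# reconstructed from a handful of stored points.
raw = [(t, 100.0 + 0.01 * t) for t in range(1000)]
compressed = deadband_compress(raw, tolerance=0.5)
print(f"{len(raw)} raw points -> {len(compressed)} stored points")
print(f"value at t=250: stored trend {interpolate(compressed, 250):.2f} "
      f"vs raw {raw[250][1]:.2f}")
```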
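
For statistical analytics, the sketch below computes a rolling mean and standard deviation over a streaming window and flags values that deviate from the recent trend - the kind of spreadsheet-style calculation that can run close to the data or be streamed on to third-party ML and AI tools. The window size and readings are illustrative assumptions.

```python
from collections import deque
from statistics import mean, pstdev

def rolling_stats(stream, window=60):
    """Compare each new value against the mean/stdev of the preceding window."""
    buf = deque(maxlen=window)
    for value in stream:
        if len(buf) >= 2:
            m, s = mean(buf), pstdev(buf)
            outlier = s > 0 and abs(value - m) > 3 * s
        else:
            m, s, outlier = value, 0.0, False
        yield value, m, s, outlier
        buf.append(value)

# Hypothetical flow readings; in practice these would stream from the historian.
readings = [50.0, 50.2, 49.9, 50.1, 55.0, 50.0, 50.3]
for value, m, s, outlier in rolling_stats(readings, window=4):
    flag = "  <- deviates from recent trend" if outlier else ""
    print(f"value={value:5.1f} window_mean={m:6.2f} window_stdev={s:5.2f}{flag}")
```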

Recommendations
When choosing operational historians, data infrastructures, and time-series databases, many issues need to be considered and carefully evaluated within a company’s overall digital transformation process. These include type of data, speed of data, industry- and application-specific requirements, legacy systems, and potential compatibility with newly emerging technologies. Both established operational data infrastructures and the newer open source platforms continue to evolve and add new value to the business, but the significant domain expertise now embedded within the former should not be overlooked.

For further information or to provide feedback on this article, please contact your
account manager or the author at jabel@arcweb.com. ARC Views are published and
copyrighted by ARC Advisory Group. The information is proprietary to ARC and
no part of it may be reproduced without prior permission from ARC.
