A Foundation For Data Analytics in Manufacturing Using Ibapda and Open-Source Machine Learning Tools
Marcelo Murta Cardoso
INTRODUCTION
A great number of digital signals of process variables are already available in most steel plants as part of the PLC or
computer-based automation of their operations. While a data logging or historian system is also already implemented in most
plants, it is a common perception that the value of the process data is not fully realized. The availability of open-source
software with state-of-the-art capabilities for statistical analysis and machine learning reinforces the notion that more
value could be extracted from this data.
To further advance digitalization, Gerdau Fort Smith has expanded the use of its existing ibaPDA data historian system by
upgrading to a data server version and integrating it with open-source statistical and machine learning tools. Upgraded with a
historical data server, the ibaPDA system provides the data lake foundation as well as the first level of analytics for data
cleaning, filtering, and process variable aggregation. Process events triggered upon data ingestion are used to automatically
execute statistical analyses or machine learning models implemented using open-source libraries for automatic notifications,
anomaly detection and process diagnostics.
There are dozens of different vendors of systems for the storage of signals with time series of values of process variables.
Known as data historians, process historians or enterprise historians, the historian software packages were first sold in the
mid-1980s for the storage of data for regulatory, reporting, asset availability and diagnostic purposes (1). Gerdau Fort Smith
has chosen ibaPDA as the plant’s historian given several reference installations at other Gerdau plants. Originally known for
fast data acquisition, the ibaPDA system has been the preferred solution for several OEMs of process lines that require fast
data acquisition, such as Hot Strip Mills. The first ibaPDA workstations were installed at Fort Smith in 2012.
Troubleshooting
Troubleshooting of equipment faults is the most traditional application and it is the main use of the ibaPDA system at
Gerdau. Video camera storage has added troubleshooting capability with visual inspection of synchronized process signals
and video. Figure 2 shows the main components of the ibaPDA system with the ibaAnalyzer as the main application for
visual analysis and preliminary statistical analysis of equipment or process issues.
ANALYTICS PLATFORM
Once raw data is collected, a typical data science project encompasses the steps shown in the horizontal bar of Figure 1. After
data ingestion, Exploratory Data Analyses can be carried out to detect the need for further processing, Data Cleaning and
filtering. If a set of valid data, pertinent to an analysis question, is retrieved from the historical database of the ibaHD
server, a statistical or machine learning Model can be trained for insights or prediction in response to that question. Once an
acceptable model performance is achieved, Deployment of the model into a production environment can be implemented (3).
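The Data Cleaning and aggregation steps can be illustrated with a minimal Python sketch. In the platform described here these steps are carried out by the ibaPDA system itself; the code below only shows the kind of operation involved. The function names, the valid range, the window length and the sample values are all assumptions made for illustration.

```python
from statistics import mean


def clean(samples, low, high):
    """Drop missing readings and readings outside the plausible range [low, high]."""
    return [(t, v) for t, v in samples if v is not None and low <= v <= high]


def aggregate(samples, window_s):
    """Average the values that fall in each consecutive window of window_s seconds."""
    buckets = {}
    for t, v in samples:
        buckets.setdefault(int(t // window_s), []).append(v)
    # Key each result by the start time of its window.
    return {k * window_s: mean(vs) for k, vs in sorted(buckets.items())}


# Made-up (time in seconds, value) pairs: one missing reading and one spike.
raw = [(0, 10.0), (1, 12.0), (2, None), (3, 9999.0), (60, 20.0), (61, 22.0)]
cleaned = clean(raw, low=0.0, high=100.0)
print(aggregate(cleaned, window_s=60))  # prints {0: 11.0, 60: 21.0}
```

The missing reading and the out-of-range spike are removed before averaging, so each one-minute aggregate reflects only valid samples.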
Data Export
A paid add-on feature enables exporting of raw or aggregated data of ibaAnalyzer reports for use in Exploratory Data
Analysis or model training. Automatic report generation or data export can also be triggered by a real time event using the
ibaDatCoordinator service. The ibaDatCoordinator is a powerful component that enables the ibaPDA system to be the
central element of the analytics platform. This is due to the ability of the ibaDatCoordinator to also run “*.bat” batch files (4)
with commands of the command-line interface. The batch files are executed by the ibaDatCoordinator upon a time-based or
event-based trigger. The ibaDatCoordinator and the execution of a “*.bat” script file are shown in the diagram of
Figure 1. A direct Application Programming Interface (API) for queries of the historical database can also be developed for
data export from the ibaHD server.
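As a sketch of what a batch file could hand off to, the Python script below reads an exported file and reports the mean of every numeric column. It assumes the export is a CSV file with a header row; the actual export format, the script name, the file path and the column names are not given in this paper and are hypothetical.

```python
import csv
import sys
from statistics import mean

# A batch file run by the trigger might contain a single line such as
# (hypothetical script name and path):
#   python summarize_export.py C:\exports\latest.csv


def summarize_export(lines):
    """Return {column: mean} for every numeric column of an exported CSV.

    `lines` is any iterable of text lines, for example an open file.
    """
    columns = {}
    for row in csv.DictReader(lines):
        for name, text in row.items():
            try:
                value = float(text)
            except (TypeError, ValueError):
                continue  # skip timestamps, blanks and other non-numeric cells
            columns.setdefault(name, []).append(value)
    return {name: mean(values) for name, values in columns.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], newline="") as f:
        for name, avg in summarize_export(f).items():
            print(f"{name}: {avg:.2f}")
```

Keeping the analysis in a function that accepts any iterable of lines lets the same code be exercised on a small in-memory sample during development and on the exported file once it is called from the batch file.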
Model Execution
Real time events can be set up in the ibaDatCoordinator service to trigger the execution of jobs that carry out different
tasks. A job task can generate a report, export data, or run a batch file, among other tasks. A previously trained model or
statistical analysis can therefore be executed automatically as a script or “*.exe” executable file called from a batch file
of a job task triggered by an ibaDatCoordinator event. ibaDatCoordinator jobs can also be scheduled to run on a recurring
time basis.
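The split between offline training and triggered execution can be sketched in Python as follows. A simple z-score threshold stands in for the "previously trained model"; it is not the model used at the plant, and the file name, the training values and the limit of three standard deviations are assumptions for illustration. A real deployment would load whatever model the open-source library of choice has saved.

```python
import json
import os
import tempfile
from statistics import mean, stdev


def train(values):
    """Fit the stand-in model: the mean and sample standard deviation."""
    return {"mean": mean(values), "std": stdev(values)}


def save_model(model, path):
    with open(path, "w") as f:
        json.dump(model, f)


def load_model(path):
    with open(path) as f:
        return json.load(f)


def is_anomaly(model, value, z_limit=3.0):
    """Flag a value more than z_limit standard deviations from the mean."""
    return abs(value - model["mean"]) > z_limit * model["std"]


# Offline step: train once on historical data (made-up values) and persist.
model_path = os.path.join(tempfile.mkdtemp(), "zscore_model.json")
save_model(train([10.0, 12.0, 11.0, 9.0, 13.0]), model_path)

# Triggered step: what a script called from the batch file might do.
model = load_model(model_path)
for reading in (11.5, 30.0):
    print(reading, "anomaly" if is_anomaly(model, reading) else "normal")
```

Because the triggered step only loads parameters and scores new readings, it stays short and fast; retraining remains a separate, offline activity.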
APPLICATION EXAMPLE
The bag houses of the melt-shop in Fort Smith have a Continuous Emission Monitoring System (CEMS) to monitor NOx generated by
several possible sources including 4 ladle preheaters, burners, and oxygen lances of 2 EAFs. The CEMS equipment alarms when
the trend indicates that the NOx emission limit may be exceeded, but it is often difficult to find the source of the increased
NOx generation. Signals of process variables that could potentially impact NOx generation are available in the iba system but
are difficult to interpret when investigating correlations.
Analysis Question: Which equipment should be checked for increased NOx generation?
Visual inspection of the trends of process variables and CEMS readings does not provide quantitative results. This is a rather
common situation when data is available but visual inspection alone is not enough to answer the analysis question. In many
cases, we have the data, but the underlying information or the value in the data is not directly determined.
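One simple way to put a number on such a question is to rank the candidate signals by the strength of their linear correlation with the NOx reading. The Python sketch below does this with the Pearson coefficient. It is only an illustration of a quantitative alternative to visual inspection, not the method applied at the plant; the signal names and all values are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def rank_sources(signals, nox):
    """Sort signals by absolute correlation with NOx, strongest first."""
    scores = {name: pearson(values, nox) for name, values in signals.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up aggregated readings over five time windows.
nox = [10.0, 12.0, 15.0, 20.0, 26.0]
signals = {
    "preheater_1_gas_flow": [1.0, 2.0, 3.0, 4.0, 5.0],
    "burner_2_gas_flow": [3.0, 1.0, 4.0, 1.0, 5.0],
    "eaf_1_oxygen_flow": [5.0, 5.0, 5.0, 5.0, 5.0],
}
for name, r in rank_sources(signals, nox):
    print(f"{name}: r = {r:+.2f}")
```

A high correlation only indicates which equipment to check first. It does not prove that the equipment is the cause, and nonlinear or delayed effects would call for a richer model.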
SUMMARY
Pros
1. Simplifies development of Machine Learning projects as the ibaPDA system covers the first phases of the analytics
process while also providing automatic data extraction for model development as well as model deployment
2. Streamlines development of Proof-of-Concept projects as it minimizes modification of existing process control
applications
3. Low cost as the ibaPDA system is usually already justified by its troubleshooting functionality, while license-free
open-source software provides state-of-the-art machine learning capability
4. Several possible applications for process diagnostics and optimization
5. Narrows the gap between process engineers and automation engineers
Cons
1. Processing time doesn’t allow for applications that require fast or ‘real-time’ response
2. Contingent on IT infrastructure reliability for alarming using emails, smartphone notifications, SharePoint, or webpage
reports
3. Still requires development of interfaces for alarming on existing SCADA or Level 1 Human Machine Interfaces (HMIs)
4. Data acquisition needs to be stopped for modifications in the I/O configuration
CONCLUSION
The ability to trigger the execution of “*.bat” batch files is a powerful feature of the ibaPDA system that enables a
joint platform with open-source machine learning tools. The analytics platform provides the foundation for all steps in the
development of machine learning projects. It also streamlines and simplifies prototyping Proof of Concept projects given that
statistical analysis and machine learning models can be created and deployed into an operational environment as originally
developed by the process engineer or data scientist. Although not recommended for ‘real-time’ control applications, the
presented platform is well suited to optimization and diagnostic analytics of either industrial processes or equipment
reliability without requiring significant software development. By contrast, deployment of ‘real-time’ statistical analyses
or machine learning models would require development of APIs and programming of new or modified run-time
applications.
REFERENCES
1. Michael Risse, “The Data Historian’s History Told”, Control Engineering, https://www.controleng.com/articles/the-data-historians-history-told/
2. ibaPDA website, https://www.iba-ag.com/en/process-connectivity
3. Microsoft Documentation, “The Team Data Science Process lifecycle”, https://docs.microsoft.com/en-us/azure/architecture/data-science-process/lifecycle
4. Batch file entry at Wikipedia, https://en.wikipedia.org/wiki/Batch_file
5. H2O.ai Documentation, “H2O AutoML Tutorial”, https://docs.h2o.ai/h2o-tutorials/latest-stable/h2o-world-2017/automl/index.html