
Factor Analysis in Fault Diagnostics using Random Forest

Nagdev Amruthnath

Abstract—Factor analysis has been widely used in classification problems for identifying specific factors that are significant to particular classes. This type of analysis is widely used in applications such as customer segmentation. Today, factor analysis is prominently being used in fault diagnosis to identify the significant factors for studying the root cause of a specific fault. In this paper, a real case of an industrial rotating machine is considered where vibration and ambient temperature data are collected for monitoring the health of the machine. Gaussian mixture model-based clustering is used to cluster the data into significant groups, and spectrum analysis is used to diagnose each cluster to a specific state of the machine. The important features that attribute to a particular mode of the machine are identified by using a random forest classification model. The important features for specific modes of the machine are used to conclude that the clusters generated are distinct and have a unique set of important features.

Index Terms—fault diagnosis, predictive maintenance, fault detection, machine learning, A.I, random forest

(Nagdev Amruthnath is with the Department of IEE and EDMM, Western Michigan University, Kalamazoo, MI, 49009, USA; e-mail: nagdev.amruthnath@wmich.edu.)

I. INTRODUCTION

Fault diagnosis has been one of the critical components in predictive maintenance. In manufacturing, when a machine fails, most of the maintenance time is spent investigating the failure through trial and error or experience. Even in cases where the fault is detected early using detection techniques, the time to investigate the failure is in many cases much higher than the actual maintenance time. With early failure detection, maintenance can be performed during non-production time, which may reduce the cost of lost production time, but it still increases the maintenance cost. High maintenance costs have been driving manufacturing industries from total preventive maintenance (TPM) to predictive maintenance (also called condition-based monitoring), where maintenance on a machine is performed when it is needed. Various advantages of this technique have been cited in the past, such as driving the maintenance cost down, utilizing the complete life of a part, and increasing production time. Hence, a significant transition has been observed from "just-in-case" to "just-in-time" maintenance, where critical machines are monitored continuously to observe their health, and any deviation from their normal condition indicates the early stages of degradation. The operational and maintenance metrics for U.S. industry, based on a survey by NASA, can be seen in Table 1.

There are various advantages to predictive maintenance, such as an increase in productivity and quality and a decrease in product cost. Lost productivity has been one of the undesirable forces in failing to achieve JIT in manufacturing, which involves producing the highest quality products at the least cost in the lowest possible lead time [1]. In the age of connected devices, IoT sensors have played a vital role in monitoring critical assets in real time across various manufacturing environments. The cost and security of these sensors might also be considered one of the driving forces for condition-based monitoring [2]. Physics-based data, process data, or a hybrid of both is among the most sought-after data in condition-based monitoring. Today, different techniques such as machine learning, A.I, data mining, and statistics are used to monitor the machine continuously. Some of the commonly used models for condition-based monitoring are binary classification, PCA-T2 and SPE statistics [3], k-means clustering, one-class support vector machines (SVM), and logistic regression.

Table 1: Operation and maintenance metrics for U.S. industries [4]

Metric | Benchmark
Equipment availability | > 95%
Schedule compliance | > 90%
Emergency maintenance percentage | < 10%
Maintenance overtime percentage | < 5%
Preventive maintenance completion percentage | > 90%
Preventive maintenance budget/cost | 15% - 18%
Predictive maintenance budget/cost | 10% - 12%

Fault diagnosis is the process of diagnosing the failure mode of the machine. For example, in a motor, the failure mode can be an imbalance, shaft displacement, or bearing issues. One of the best-known and most commonly used diagnostic techniques, in the past and even today, is vibration spectrum analysis. The vibration data in the time domain is converted to a frequency spectrum, and this spectrum is diagnosed to identify the failure mode of the machine. Today, this is still one of the most popular techniques and is widely used in manufacturing. Spectrum analysis has also evolved from vibration to sound and other signal data.

Although this is still one of the most popular techniques, one of the main challenges of this method is the need for deep domain knowledge and experience. In most cases, with hundreds of machines in a manufacturing facility, this method becomes almost impractical and expensive. Due to this issue, supervised machine learning techniques are mostly used to diagnose faults in the machines. Some of the most commonly used classification models for fault diagnosis are multi-class SVM, k-nearest neighbor, neural networks, and decision trees. In a changing environment such as manufacturing, if the classification models are not trained for all states of the machine, then a new state of the machine (not part of the trained model) will be misclassified to a known state. Hence, unsupervised learning techniques have become more popular for fault state detection using clustering. Some of the commonly used techniques are the Gaussian finite mixture model [5], self-organizing maps, hierarchical clustering, and density-based clustering.

Factor analysis is a technique that involves identifying significant factors for a particular group (or cluster). It has been widely used in classification problems such as customer segmentation, where the critical factors that affect each customer's decision are identified to achieve specific goals. The concept has also been widely used in other problems such as regression and dimensionality reduction. Similar to customer segmentation, in maintenance it is important to identify the key features that attribute to a specific fault or failure mode of the machine. These specific features are used to study the root cause of a particular problem in the machine, and necessary design changes can be made to eliminate the problem completely. In other instances, when a fault is detected, these specific features can be used to verify the state of the machine.

In this paper, a vibration monitor is used on rotating machinery to observe the health of the machine. Gaussian mixture model-based clustering is used to cluster the observations into specific groups. Each group is diagnosed with a failure mode using spectrum analysis. Finally, a random forest model is used to identify the significant variables that affect each group or cluster.

II. GAUSSIAN MIXTURE MODEL

Clustering is a process of modeling similar observations into specific groups. In machine learning, this is an unsupervised learning technique where only predictor variables are known for grouping. There are various methods to group the data, such as by distance, density, shape, and type of data. In this paper, the Gaussian mixture modeling technique is used. A Gaussian mixture model (GMM) is a parametric probability density function represented as a weighted sum of Gaussian component densities [6]. In clustering using finite mixture modeling, each component probability refers to a cluster, and models that differ in the number of components or component distributions can be compared using statistical tests [7].
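For reference, the mixture density described above can be written out explicitly in its standard form, following [6]; the symbols below are generic and are not values estimated in this study:

p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \sum_{k=1}^{K} \pi_k = 1, \ \pi_k \ge 0,

where K is the number of mixture components (clusters), \pi_k are the mixture weights, and \mathcal{N}(x \mid \mu_k, \Sigma_k) is a multivariate Gaussian density with mean \mu_k and covariance \Sigma_k. The parameters are typically estimated with the EM algorithm, and each observation is assigned to the component with the highest posterior probability.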
In this research, rotating machinery that operates in a high-temperature environment is considered. Vibration sensors are mounted on the X-axis and Y-axis. The vibration data is collected as a time series at a sample rate of 2048 Hz, and this data is collected every 5 minutes for 5 months continuously. The time signals are converted to frequency signals, and features are extracted in both domains. Since the operating frequency of the machinery was known to be 26.1 Hz, the features in the frequency domain were collected around this band. A total of 36 vibration features and the ambient temperature around the machinery were obtained.
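The individual features are not listed in the paper; a minimal sketch of the kind of time- and frequency-domain extraction described above (simple time-domain statistics plus amplitude and energy around the 26.1 Hz operating band) is shown below. The feature names and the band half-width are assumptions for illustration only.

import numpy as np

FS = 2048          # sample rate (Hz), as stated in the paper
F_OPERATING = 26.1 # known operating frequency (Hz)

def extract_features(signal, band_halfwidth=5.0):
    """Extract simple time- and frequency-domain features from one vibration record.

    band_halfwidth (Hz) is an illustrative assumption for the band around the
    operating frequency; the paper only states that frequency-domain features
    were collected around this band.
    """
    # time-domain features
    mean_t = np.mean(signal)
    sd_t = np.std(signal)
    rms_t = np.sqrt(np.mean(signal ** 2))

    # frequency-domain features via the FFT
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / FS)

    # amplitude at the operating frequency and energy in a band around it
    band = (freqs >= F_OPERATING - band_halfwidth) & (freqs <= F_OPERATING + band_halfwidth)
    peak_amp = spectrum[band].max()
    band_energy = np.sum(spectrum[band] ** 2)

    return {"mean_t": mean_t, "sd_t": sd_t, "rms_t": rms_t,
            "peak_amp_26hz": peak_amp, "band_energy_26hz": band_energy}

# Example: one second of synthetic vibration data at the operating frequency
example = np.sin(2 * np.pi * F_OPERATING * np.arange(FS) / FS) + 0.1 * np.random.randn(FS)
print(extract_features(example))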

In most clustering models, the number of clusters to be formed is user-defined. The most commonly used techniques for finding the optimal number of clusters are AIC, BIC, the within sum of squares (WSS), the gap statistic, and the silhouette width method. As the size of the data increases, the AIC and BIC methods fail to provide the optimal number of clusters [8]. In such cases, WSS and silhouette width are calculated for k clusters. Using the elbow method on WSS, the optimal number of clusters can be identified. For silhouette width, the k that provides the maximum separation is considered the optimal number of clusters. In this research, both the WSS and silhouette techniques are performed to identify the optimal number of clusters, as shown in Figure 1 and Figure 2.
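A sketch of this selection step is shown below. K-means is used here only as a convenient way to obtain the within-cluster sum of squares and the silhouette width for each candidate k; the data and the candidate range of k are placeholders, not the values used in the study.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 37))  # placeholder for the 36 vibration features + temperature

wss, sil = {}, {}
for k in range(2, 11):                        # candidate numbers of clusters (assumed range)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wss[k] = km.inertia_                      # within-cluster sum of squares
    sil[k] = silhouette_score(X, km.labels_)  # average silhouette width

# Elbow: look for the k after which WSS stops decreasing significantly.
# Silhouette: pick the k with the maximum average silhouette width.
best_k = max(sil, key=sil.get)
print("WSS by k:", {k: round(v, 1) for k, v in wss.items()})
print("k with maximum silhouette width:", best_k)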

Figure 1: WSS plot for optimal number of clusters

Figure 2: Silhouette width for identifying the optimal number of clusters

From the WSS technique, we can observe that after the sixth cluster there is no significant change in WSS; hence, by the WSS method the optimal number of clusters can be identified as six. In the silhouette technique, the sixth cluster provided the maximum separation compared to the other clusters; hence, the optimal number of clusters was determined to be six. From both techniques, we can identify that the optimal number of clusters is six.

The data is clustered using GMM with k = 6 as the optimal number of clusters. The results are shown in Figure 3.

Figure 3: GMM cluster results for six clusters
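A minimal sketch of this clustering step follows, using scikit-learn's GaussianMixture as a stand-in for the model-based clustering approach of [5] and [7]; the data and the covariance structure are illustrative assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 37))  # placeholder for the vibration + temperature features

# Fit a Gaussian mixture with k = 6 components, as selected above.
gmm = GaussianMixture(n_components=6, covariance_type="full", random_state=0).fit(X)

clusters = gmm.predict(X)          # hard cluster assignment per observation
posteriors = gmm.predict_proba(X)  # soft (posterior) membership probabilities
print(np.bincount(clusters))       # number of observations per cluster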

From the results, we can infer some primary observations, such as that machine shutdowns formed cluster 5. But to diagnose each state of the machine accurately, spectrum analysis is performed for each cluster of the machine using time as the reference, as shown in Figures 4 to 8.

Five spectrum plots were analyzed in time series to diagnose the state of the machine and correlate it with the clusters. A frequency plot was generated for the data on 12-Aug, as shown in Figure 4. Based on the known maintenance history of the machine, this state was considered a normal state and was used as a baseline for the rest of the plots. The dominant cluster during this period was cluster 1.

Figure 4: Machine operating in normal condition on 12-Aug

In the frequency plot generated for 27-Aug, we can observe nonsynchronous peaks through the mid band, and the amplitude of the operating frequency has also significantly reduced.

Figure 5: Shaft displacement issue on 27-Aug

These characteristics can be attributed to mechanical looseness in the machine. Upon maintenance, it was discovered that the shaft had displaced by 10 mm, creating mechanical looseness. The dominant cluster during this period was cluster 6. In the frequency plot from 10-Sep, we can observe the machine in normal operating condition compared to the baseline, as shown in Figure 6. This was the period after the maintenance, and the dominant cluster during this period was cluster 2. In the frequency plot from 22-Sep, we can observe an increase in magnitude at the operating frequency, as shown in Figure 7. This characteristic is an indication of mechanical imbalance; during this period, the dominant cluster was cluster 4. After maintenance, we can observe the machine operating in normal condition, as shown in Figure 8. The dominant cluster during this period was cluster 3.

Figure 6: Machine normal condition or repair state on 10-Sep

Figure 7: Fan imbalance created on 22-Sep

Figure 8: Fan imbalance created on 04-Jan
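As a rough illustration of the kind of comparison an analyst makes against the 12-Aug baseline, one can compare the amplitude at the operating frequency and the nonsynchronous energy of a new spectrum to the baseline spectrum. The rule of thumb and the thresholds below are assumptions for illustration and are not taken from the paper.

import numpy as np

FS, F_OP = 2048, 26.1  # sample rate (Hz) and operating frequency (Hz) from the paper

def amplitude_spectrum(signal):
    amp = np.abs(np.fft.rfft(signal)) / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / FS)
    return freqs, amp

def diagnose(signal, baseline, factor=2.0):
    """Compare a record against the baseline record (both assumed to be the
    same length and sampled at FS); `factor` is an arbitrary threshold."""
    freqs, amp = amplitude_spectrum(signal)
    _, amp_base = amplitude_spectrum(baseline)
    near_1x = np.abs(freqs - F_OP) <= 2.0   # narrow band around the operating frequency
    peak, peak_base = amp[near_1x].max(), amp_base[near_1x].max()
    rest, rest_base = amp[~near_1x].sum(), amp_base[~near_1x].sum()

    if peak > factor * peak_base:
        return "raised 1x amplitude: possible imbalance"
    if rest > factor * rest_base and peak < peak_base:
        return "nonsynchronous energy with reduced 1x amplitude: possible looseness"
    return "close to baseline"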

From the above spectrum analysis, the modes of each of the clusters were diagnosed. Based on this information, some inferences can be drawn using the cluster plot generated with GMM. The conclusions are as follows:
• The GMM model was capable of diagnosing the machine repair states. This is a clear indication of its robustness in identifying changes in the process or environment.
• An imbalance state of the fan is observed since the beginning of the data collection, as seen in Figure 3. Although an assumption was made during spectrum analysis when creating the baseline, the clustering technique was capable of identifying the imbalance state.
• The clustering technique was capable of detecting the machine powered-off state as well.

From the above results, we can conclude that by using clustering and spectrum analysis we can overcome some of the many challenges of supervised classification methods. Some of the advantages of the above technique are as follows:
• There is no requirement to train the model with all the states of the machine.
• The above procedure can be implemented in a shorter period; hence, the benefit of CBM can be realized faster.
• There is no need to retrain the model when a new state of the machine is identified.

III. FACTOR ANALYSIS FOR CLUSTERED DATA

In maintenance, upon detecting and diagnosing the faults, identifying the important features that are specific to a particular cluster is important. The factors contributing to a specific state of the machine are used in studying the root cause of the problem and potentially eliminating it. They are also used in validating the cluster results. In this paper, we discuss a supervised learning technique called random forest, which is used to identify the important features that are specific to a particular fault of the machine (or cluster).
Random forest is an ensemble learning technique that is used in both regression and classification problems. In a regular decision tree model, a single decision tree is built; in a random forest, a number of decision trees are built. The number of trees is usually user-defined. In the ensemble process, a vote from each decision tree is used to decide the final class. In this technique, a sample of the data drawn with replacement is used for building each decision tree, along with a subset of the variables. This sampling and subsetting are performed at random; hence, the technique is called a random forest. The algorithm for random forest is given as follows [9] (a short sketch follows the list):
1. Draw ntree bootstrap samples from the original data.
2. For each of the bootstrap samples, grow an unpruned classification or regression tree, with the following modification: at each node, rather than choosing the best split among all predictors, randomly sample mtry of the predictors and choose the best split from among those variables. (Bagging can be thought of as the special case of random forests obtained when mtry = p, the number of predictors.)
3. Predict new data by aggregating the predictions of the ntree trees (i.e., majority vote for classification, average for regression).
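The three steps above can be written out directly. The sketch below uses scikit-learn decision trees as the base learners and is meant only to make the procedure concrete; in practice the randomForest package cited in [9] or an equivalent library implementation is used. Labels are assumed to be coded as non-negative integers, and mtry must not exceed the number of predictors.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_random_forest(X, y, ntree=500, mtry=19, seed=0):
    """Steps 1-2: grow ntree unpruned trees, each on a bootstrap sample,
    considering only mtry randomly chosen predictors at each split."""
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(ntree):
        idx = rng.integers(0, len(X), size=len(X))       # step 1: bootstrap sample
        tree = DecisionTreeClassifier(                   # step 2: unpruned tree,
            max_features=mtry,                           # mtry candidates per node
            random_state=int(rng.integers(2**31 - 1)))
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def predict_forest(trees, X):
    """Step 3: aggregate the ntree predictions by majority vote."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)
    return np.array([np.bincount(col).argmax() for col in votes.T])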

In boosting, successive trees give extra weight to points incorrectly predicted by earlier predictors, and finally a weighted vote is taken for the prediction.

In bagging, successive trees do not depend on earlier trees; each is independently constructed using a bootstrap sample of the data, and finally a majority vote is taken for the prediction.

An estimate of the error rate can be obtained, based on the training data, by the following [9] (a minimal example is given after these steps):
1. At each bootstrap iteration, predict the data not in the bootstrap sample using the tree grown with that bootstrap sample.
2. Aggregate the OOB (out-of-bag) predictions, calculate the error rate, and call it the OOB estimate of the error rate.
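scikit-learn exposes this OOB estimate directly when bootstrap sampling is enabled; a minimal sketch on synthetic stand-in data (37 predictors and six integer-coded cluster labels, chosen only for illustration) follows.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 37))
y = rng.integers(0, 6, size=1000)  # stand-in for the GMM cluster labels

rf = RandomForestClassifier(n_estimators=500, max_features=19,
                            oob_score=True, random_state=0).fit(X, y)

# oob_score_ is the accuracy on out-of-bag samples, so 1 - oob_score_
# is the OOB estimate of the error rate described above.
print("OOB error estimate:", 1.0 - rf.oob_score_)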
Variable importance in the random forest is defined based on the interaction with the other variables. Random forest estimates the significance of a variable based on how much the prediction error increases when the data for that variable is permuted while the other variables are left unchanged. The calculations for variable importance are carried out one tree at a time as the random forest is constructed.
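A sketch of this permutation idea on synthetic data follows. It permutes one variable at a time on a held-out split and records the drop in accuracy; this differs in detail from the per-tree, out-of-bag computation used by the randomForest package, but it illustrates the same principle.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 37))
y = rng.integers(0, 6, size=1000)  # stand-in for cluster labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=500, max_features=19,
                            random_state=0).fit(X_tr, y_tr)
baseline = rf.score(X_te, y_te)

importances = []
for j in range(X_te.shape[1]):
    X_perm = X_te.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])            # shuffle one variable
    importances.append(baseline - rf.score(X_perm, y_te))   # drop in accuracy

print("Most important variable index:", int(np.argmax(importances)))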
Today, random forest is used in various applications such as banking, retail, the stock market, medicine, and image analysis. Some of the main advantages of this technique are as follows:
1. The same algorithm can be used for both classification and regression problems.
2. There is little issue of overfitting when this algorithm is used either for classification or regression.
3. The random forest can also be used for identifying important variables in the data while building the models.
4. It can handle large datasets efficiently without variable deletion.

In this research, the clusters are used as the response variable, and the feature data is used as the predictor variables. A total of 500 trees are generated using the random forest technique. The accuracy of different models was compared to choose the best model, and the optimal model was chosen with mtry = 19. The summary of resampling results across the tuning parameters is shown in Table 2; a comparable tuning loop is sketched after the table.

Table 2: Resampling results across tuning parameters

mtry | Accuracy | Kappa
2 | 0.8709 | 0.8451
19 | 0.8853 | 0.8623
37 | 0.8806 | 0.8567
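The paper does not state which resampling scheme was used. A comparable tuning loop in Python, sweeping max_features (the scikit-learn analogue of mtry) over the same grid as Table 2 with cross-validation, might look like the following; the data here is synthetic, so the accuracies will not reproduce Table 2.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 37))
y = rng.integers(0, 6, size=1000)  # stand-in for the six cluster labels

# Grid analogous to mtry = 2, 19, 37 in Table 2
for mtry in (2, 19, 37):
    rf = RandomForestClassifier(n_estimators=500, max_features=mtry, random_state=0)
    acc = cross_val_score(rf, X, y, cv=5, scoring="accuracy").mean()
    print(f"max_features={mtry:2d}  mean CV accuracy={acc:.4f}")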

Figure 9: Optimal model selection using Random Forest

After identifying the best model, the important variables for every cluster group were identified. The results are shown in Figure 10.

Figure 10: Feature importance plot for different clusters
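One way to approximate a per-cluster importance plot like Figure 10 is to compute a one-vs-rest permutation importance for each cluster label; this is an illustrative approach and not necessarily the exact computation behind Figure 10 (the randomForest package reports class-specific mean decrease in accuracy). The names MeanXAxisT and SdYAxisF are taken from the text; the remaining feature names and all data here are placeholders.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n_features = 37
# Two feature names are mentioned in the text; the rest are placeholders.
feature_names = ["MeanXAxisT", "SdYAxisF"] + [f"f{i}" for i in range(n_features - 2)]

X = rng.normal(size=(1000, n_features))
clusters = rng.integers(1, 7, size=1000)  # stand-in for clusters 1..6

for c in range(1, 7):
    y_c = (clusters == c).astype(int)     # one-vs-rest target for cluster c
    rf = RandomForestClassifier(n_estimators=500, max_features=19,
                                random_state=0).fit(X, y_c)
    imp = permutation_importance(rf, X, y_c, n_repeats=5, random_state=0)
    top = np.argsort(imp.importances_mean)[::-1][:3]
    print(f"cluster {c}: top features ->", [feature_names[i] for i in top])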



In Figure 10, we can observe the importance of all the features for all six clusters. From these results, we can identify that all the features have some amount of significance for clusters 1, 2, 3, 5, and 6. For cluster 4, the SdYAxisF feature had no significance. We can also observe that clusters 2, 3, 4, 5, and 6 have ambient temperature as the most significant variable, while in cluster 1, MeanXAxisT was the most significant variable. From the above analysis, we can also observe that the features have different levels of significance for different clusters. This observation provides a strong conclusion that the clusters are unique, with different characteristics.

IV. CONCLUSION

In rotating machinery, vibration analysis is one of the most sought-after techniques for condition-based monitoring. In a highly dynamic environment such as manufacturing, unsupervised machine learning techniques such as clustering are used to group the data into clusters. These individual clusters represent states of the machine. The mode of each state can be diagnosed using frequency spectrum analysis. The important factors that are specific to a cluster can be identified by using the random forest technique. The significant factors can be used to study the causality of a particular failure mode. With different significant features for each mode, we can provide substantial reasoning that the identified clusters are significantly different and that their behavior is caused by a set of unique attributes. In this research, with the proposed methodology, we were able to build a machine state detection model for a machine working in changing environments using clustering, eliminate the need for retraining the model, diagnose machine faults using spectrum analysis, and finally identify the important factors that contribute to each state of the machine.

V. REFERENCES

[1] N. Amruthnath and T. Gupta, "Modified Rank Order Clustering Algorithm Approach by Including Manufacturing Data," in 4th IFAC International Conference on Intelligent Control and Automation Sciences, Reims, 2016.
[2] N. Amruthnath and P. M. Prathibhavani, "Data Security in
wireless Sensor Network using Multipath Randomized
Dispersive Routes.," International Journal of Scientific &
Engineering Research, vol. 5, no. 1, 2014.
[3] N. Amruthnath and T. Gupta, "A Research Study on
Unsupervised Machine Learning Algorithms for Early
Fault Detection in Predictive Maintenance," in 5th
International Conference on Industrial Engineering and
Applications, Singapore, 2018.
[4] NASA, "Reliability Centered Maintenance Guide for
Facilities and Collateral Equipment," National Aeronautics
and Space Administration, Washington, D.C., 2000.
[5] N. Amruthnath and T. Gupta, "Class Prediction in
Unsupervised Learning using Model-Based Clustering
Approach," International Journal of Machine Learning
and Computing, vol. IN PRESS, 2018.
[6] D. Reynolds, "Gaussian Mixture Models," In Encyclopedia
of Biometrics, pp. 659-663, 2009.
[7] C. Fraley and A. E. Raftery, "Model-based Clustering,
Discriminant Analysis and Density Estimation," Journal of
the American Statistical Association, vol. 97, pp. 611-631,
2002.
[8] K. W. Broman and T. P. Speed, "A model selection
approach for the identification of quantitative trait loci in
experimental crosses," Journal of the Royal Statistical
Society: Series B (Statistical Methodology), vol. 64, no. 4,
pp. 641-656, 2002.
[9] A. Liaw and M. Wiener, "Classification and regression by
randomForest," R news, vol. 2, no. 3, pp. 18-22, 2002.
