Am Ruth Nath 2018

International Conference on Information and Computer Technologies
Fault Class Prediction in Unsupervised Learning Using Model-Based Clustering Approach
Nagdev Amruthnath, Tarun Gupta

Industrial and Entrepreneurial Engineering, Western Michigan University
Kalamazoo, MI, 49008, USA
e-mail: nagdev.amruthnath@wmich.edu, nagdev.amruthnath@wmich.edu
978-1-5386-5384-5/18/$31.00 ©2018 IEEE

Abstract-Manufacturing industries have been steadily seeking new methods to achieve near-zero downtime, flexibility in the manufacturing process, and economical operation. In the last decade, the availability of industrial internet of things (IIoT) devices has made it possible to monitor machines continuously using wireless sensors, assess degradation, and predict failures ahead of time. Condition-based predictive maintenance has had a significant influence on monitoring assets and predicting failures ahead of time, which has minimized the impact on production, quality, and maintenance cost. Numerous approaches have been proposed over the years and implemented with supervised learning. In this paper, the challenges of supervised learning, such as the need for historical data and the inability to classify new faults accurately, are addressed with a new methodology that uses unsupervised learning for rapid implementation of predictive maintenance. The methodology covers fault prediction and fault class detection for known and unknown faults using density estimation via Gaussian Mixture Model clustering and the K-means algorithm, and the results of the two are compared on real-case vibration data.

Keywords-unsupervised learning; fault class detection; predictive maintenance; gaussian mixture model; clustering; just-in-time; TPM

I. INTRODUCTION
Over the years, numerous machine maintenance methodologies have been proposed, but the goal of maintenance has not changed: to increase productivity and quality while reducing unplanned downtime. Machine maintenance can be broadly classified as unplanned (unscheduled) maintenance and planned (scheduled) maintenance. Unplanned maintenance, or run-to-failure maintenance, is performed only when the machine breaks down. The major disadvantage of this type of maintenance is high downtime, which includes investigation, component replacement, and verification of the repaired condition. This leads to high maintenance cost. It is estimated that U.S. industry spends $200 billion every year on maintenance of plant equipment and facilities, and that ineffective maintenance leads to a loss of more than $60 billion [1].
During the evolution of the Toyota Production System, the concept of Total Preventive Maintenance (TPM) was introduced. TPM is also known as planned maintenance or time-based maintenance (TBM). In this type of maintenance, the machine failure rate and component failure rates are identified [2]. This data is used to inspect the machine and its components periodically and to replace them if necessary. Implementing TPM has shown a significant improvement in productivity and quality and a decrease in maintenance cost, and this type of maintenance is widely used in industry today. Despite its many advantages, it is still difficult with this approach to monitor the degradation of critical components such as ball bearings, motor shafts, and pneumatic components. When these critical components are replaced based on reliability data, their remaining useful life is lost, which increases the cost of maintenance [2].
Predictive maintenance is the second type of planned maintenance. This type of maintenance is also called the complement or opposite of planned maintenance. In this process, machine data is collected for a certain period and then modeled with learning algorithms such as regression, classification, neural networks, and fuzzy logic. A number of architectures have been proposed over the years, but the main components of predictive maintenance have remained the same: data collection, data cleaning (removing incomplete, duplicate, and missing values), feature extraction and selection (a feature is synonymous with an attribute or input variable) [3], fault prediction, fault class prediction, and time to failure (TTF) prediction. The data collected for predictive maintenance can be physics-based data such as vibration, temperature, pressure, voltage, light dispersion, and humidity [4], or process data such as process deviations, raw material quality, control settings, and machine specifications. Predictive maintenance can be implemented using different learning methods such as supervised learning, unsupervised learning, deep learning, and reinforcement learning. Learning methods can be broadly classified into four types: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Supervised and unsupervised learning are the most commonly used in predictive maintenance. In supervised learning, the target (response) variable is known; in unsupervised learning, it is not.

II. LITERATURE REVIEW
In today's competitive market, attaining Just-in-Time (JIT) has become one of the biggest goals, especially in manufacturing. There are many advantages to achieving JIT, such as reduction in cost, increase in quality and shorter lead
time. Over the years, it has become challenging to achieve this due to performance issues and equipment failures [5], control on inventory, line stoppages, poor process synchronization, and low volume operations [6]. To overcome poor process synchronization, an enhanced method was proposed using modified rank order clustering with manufacturing data [7].
Condition-based maintenance (CBM) is the process of continuously monitoring a system or machine based on physical parameters, process parameters, or both, in order to predict the health of the machine, its faults, the type of fault, and the time to failure. Initially termed predictive maintenance, the concept of CBM was first introduced by the Rio Grande Railway Company in the late 1940s [8]. CBM involves domains such as data mining, artificial intelligence, and statistics, and it has been used in automotive, manufacturing, aviation, defense, and other industries [8]. CBM is mostly implemented using supervised learning algorithms such as regression and classification. Using machine learning, CBM systems can be applied both when historical data is available and when it is not. The unsupervised learning approach requires building a reference model that identifies normal and abnormal situations. Supervised learning and reinforcement learning approaches could be applied to make the CBM algorithms more accurate [21]. One of the key advantages of CBM is the ability to observe the degradation of a machine component before it fails.
One of the most common learning approaches used today for fault diagnosis is supervised learning, which is based exclusively on the predictor variables and the response variable. One of the challenges of this method is the time required to collect the data and train the model. To overcome this problem, unsupervised learning can be used, where the class structure can be detected without any previous knowledge of the data. These methods can be classified into two techniques: (i) the subspace structure of the data and (ii) its clustering characteristics [27]. To summarize, in the first approach objects use a smaller number of features than the total population, and in the clustering approach the data set uses a lower number of objects than the original number [27].
With the need for flexibility and cost-effectiveness, fault classification has an important role in a CBM system. Fault classification, or fault diagnosis, is the process of identifying the current state of the system or machine. It can be achieved by a data-driven approach, a qualitative model approach, or a quantitative model approach [18]. Because the model-based approaches need precise mathematical models and a significant amount of expert knowledge, the data-driven method has been widely used in the last two decades [15]. Some of the common algorithms used in fault classification are support vector machine (SVM) [15], artificial neural network [15], Kohonen Feature Mapping [10], fuzzy logic [9], self-organizing map [17], k-means [24], fuzzy c-means, and hierarchical clustering [27].
Among the different learning methods, supervised learning is widely used mainly because of its high prediction accuracy [21]. This comes with the disadvantage of a longer implementation time, which involves capturing all the machine states. In this paper, we propose an unsupervised learning methodology for rapid implementation of predictive maintenance. The implementation involves the major components of predictive maintenance, such as collecting physics-based data, predicting faults, and predicting the type of fault, using different machine learning algorithms: principal component analysis (PCA) for dimensionality reduction, the Hotelling T2 statistic for fault detection, and density estimation via Gaussian Mixture Model clustering [14] and K-means clustering for fault class prediction [11]. All of the above algorithms are implemented in R, an open-source statistical programming language, and are visualized in real time using a BI tool called SISENSE. Density estimation via Gaussian Mixture Model clustering is implemented using the mclust package [29], [30], and silhouette-based optimal cluster identification is implemented using the NbClust package [31].

III. UNSUPERVISED FAULT CLASS DETECTION

A. Model Building
Next, a model using the Gaussian Mixture Model and K-means approaches was developed to detect faults using minimal or no historical data, and no class labels, that could serve as a reference for separating normal operating conditions from abnormal ones. The following important assumptions were applied to the model development process.
Assumption 1: It is assumed that the model is working in ideal conditions, where voltage, current, temperature of the machine, and ambient temperature are not significant.
Assumption 2: It is assumed that vibration of the machine is the only significant attribute of the machine.
Assumption 3: The probabilistic model used here assumes that all vibration data obtained is generated from a mixture of a finite number of Gaussian distributions with unknown parameters.
An unsupervised fault prediction model is conceptualized and developed. Later on, we demonstrate how this model, using minimal historical data, accurately predicts known and unknown faults.

Figure 1. Model for fault detection and fault type prediction.

Limitations: this model is developed, evaluated and validated only for rotating machinery, such as a cooling fan, where vibration is the significant attribute. Hence, at this point in the research, this model cannot be validated for
machines where other physics-based data such as pressure and temperature are significant.

B. Understanding the Current System
A furnace fan was used as the test subject for this predictive maintenance activity, as shown in Figure 2. This fan operates in a non-fluctuating environment, and because it has moving parts and no process data was available, vibration data was chosen for monitoring. Also, because the vibration spectrum is sensitive to equipment failures, the vibration signal is commonly used as the data source [15]. It can also be noted that about 99% of mechanical faults have some noticeable vibration and acoustic signals.

Figure 2. Furnace fan and sensors mounted on x-axis and y-axis.

C. Data Collection
Vibration monitors were mounted on both the x-axis and the y-axis. Vibration data was collected in the time domain at 2048 Hz for 0.8 seconds every 5 minutes. This raw data is used for feature extraction, feature selection, fault prediction, and fault class prediction. The methodology for this research study is shown in Table 1.

D. Feature Extraction
Every instance of raw data collected consists of approximately 1600 data points in the time domain. It is important to capture different features in both the time domain and the frequency domain [11]. Here, time domain features such as min, max, median, mean, standard deviation, kurtosis, skewness, range, and RMS [25] are collected for both the x-axis and the y-axis. The raw time domain data was then transformed to the frequency domain using Fourier transforms [16] to capture the same features for the x-axis and y-axis at 25 Hz, to minimize the noise at higher frequencies.

TABLE I. UNSUPERVISED PREDICTIVE MAINTENANCE METHODOLOGY FOR FAULT CLASS PREDICTION
Physical parameter   Data type          Features             Fault prediction         Fault class prediction
Vibration            Time domain        Min                  Hotelling T2 statistic   GMM
                     Frequency domain   Max                                           K-Means
                                        Mean
                                        Standard deviation
                                        Kurtosis
                                        RMS
                                        Skewness
                                        Range
                                        Median

E. Feature Selection
In the feature extraction process, a total of 36 features were extracted across the time domain and the frequency domain. To avoid the curse of dimensionality, a dimensionality reduction algorithm called PCA [28] is used. In this algorithm, the data is linearly mapped into a lower-dimensional space so as to maximize the variance of the data. Here, the covariance matrix of the data is constructed and its Eigenvalues are computed. The optimal components are chosen by using a scree plot of the Eigenvalues. The Eigenvalues and components are plotted in a scree plot as shown in Figure 3.

F. Fault Prediction
Fault prediction is the process of detecting abnormal behavior of the system. Some of the most commonly used algorithms in unsupervised learning are the Hotelling T2 statistic [28], [20], the Q statistic [28], [20], the K-means algorithm, and hierarchical clustering analysis [32]. In this research study, the Hotelling T2 statistic, a multivariate statistic, is used for the analysis. Test data or calibration data is captured during a healthy state of the machine, and the T2 statistic and prediction limit are obtained at the 99.99% confidence level using formulas 1 and 2. Here, the significant components obtained using PCA are used for calculating the statistic [19].

T2 statistic
$$T^2 = \sum_{i=1}^{a} \frac{t_i^2}{\lambda_i} \tag{1}$$

Upper confidence limit
$$T^2_{a,n,\alpha} = \frac{a(n-1)}{n-a}\, F_{a,n-a,\alpha} \tag{2}$$

where t_i is the score on the i-th retained principal component, \lambda_i is the corresponding Eigenvalue, a is the number of retained components, n is the number of calibration observations, and F_{a,n-a,\alpha} is the critical value of the F distribution with a and n - a degrees of freedom at significance level \alpha.

Figure 4 shows the fault detection system for monitoring the state of the fan. The system was calibrated on data from July 31 and August 1, and the resulting statistics were used for monitoring. The upper limit obtained was 15.83. On August 26 there was a sharp rise in the graph, which indicated that a fault had been detected. Upon investigation, it was found that the fan housing had been displaced by 2 inches, and the fan was replaced. After the replacement, we were able to confirm that the model had to be recalibrated for the new fan. However, our research goal was to detect these changes with our new fault class detection model. Hence, the model was not recalibrated, and the new fan continued to be monitored with the same statistics as the previous fan. This behavior can be observed in Figure 4.

The primary objective of this research was to predict fault class in unsupervised learning, enabling the architecture for rapid implementation. To achieve this, two unsupervised clustering algorithms, density estimation via Gaussian
Mixture model clustering and K-means algorithm were used. The main objective of these is to detect the following:
1. Healthy state of the machine
2. Faulty state of the machine
3. New fan replacement

G. Density Estimation via Gaussian Mixture Model
The GMM algorithm is used in various applications such as image classification [12], speaker verification [26], speech recognition, and medicine. Density estimation is the process of fitting a distribution. Data distributions are usually identified by plotting density plots. If a set of random numbers is generated based on a known distribution such as normal, uniform or binomial, then it is known as a parametric way of density estimation. If a set of random numbers is fitted to a known distribution such as a Gaussian Mixture model, then it is known as a non-parametric method of density estimation. In our research, when the feature data was plotted onto a density plot to identify the distribution, multimodal distributions were identified. Hence, it was concluded to use non-parametric density estimation.

Figure 3. PCA scree plot to identify the optimal number of components.
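
For reference, the time-domain features named in the Feature Extraction subsection (min, max, median, mean, standard deviation, kurtosis, skewness, range, and RMS) could be computed per axis along the following lines. This is a minimal sketch in Python rather than the R code used in the study; the function name `time_domain_features`, the population (divide-by-n) definitions of skewness and kurtosis, and the synthetic sine signal are all assumptions made for illustration.

```python
import math
import statistics


def time_domain_features(samples):
    """Return the nine time-domain features for one axis of one capture."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Population (divide-by-n) central moments; an assumption of this sketch.
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return {
        "min": min(samples),
        "max": max(samples),
        "median": statistics.median(samples),
        "mean": mean,
        "std": math.sqrt(m2),
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "range": max(samples) - min(samples),
        "rms": math.sqrt(sum(x * x for x in samples) / n),
    }


# Synthetic stand-in for one 0.8 s capture at 2048 Hz (about 1638 samples):
# a 25 Hz sine of amplitude 2, not real fan data.
signal = [2.0 * math.sin(2 * math.pi * 25 * k / 2048) for k in range(1638)]
features = time_domain_features(signal)
```

Applying the same function to the x-axis and y-axis signals, in both the time and frequency domains, would give 4 x 9 = 36 values, which matches the feature count reported in the text.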
Figure 4. Hotelling T2 statistic for fault detection. (Plot title: Fan 1: PCA-T2-Health; x-axis: Date, Aug 01 to Sep 15; y-axis: T2-Health; series: T2-Health and T2-Limit.)
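
The monitoring logic behind Figure 4 can be sketched as follows. This is an illustrative Python sketch, not the authors' R implementation: it assumes the common form of the statistic, in which T2 is the sum of squared principal component scores each divided by its Eigenvalue, and it takes the upper limit as a given number (15.83, as reported in the text) instead of deriving it from an F distribution. The function names, scores, and Eigenvalues are hypothetical.

```python
def hotelling_t2(scores, eigenvalues):
    """T2 for one observation: squared PCA scores scaled by their eigenvalues."""
    return sum(t * t / lam for t, lam in zip(scores, eigenvalues))


def flag_faults(score_rows, eigenvalues, upper_limit):
    """Return (t2, is_fault) per observation; is_fault means t2 > upper_limit."""
    results = []
    for scores in score_rows:
        t2 = hotelling_t2(scores, eigenvalues)
        results.append((t2, t2 > upper_limit))
    return results


# Hypothetical values for illustration only: two retained components with
# eigenvalues 4.0 and 1.0, and PCA scores for three observations.
eigenvalues = [4.0, 1.0]
score_rows = [[2.0, 1.0], [0.5, -0.5], [8.0, 3.0]]
UPPER_LIMIT = 15.83  # upper limit reported in the text

results = flag_faults(score_rows, eigenvalues, UPPER_LIMIT)
```

Only the third observation (T2 = 25.0) exceeds the limit and would be flagged, which mirrors how the points above the T2-Limit line are read in the figure.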
H. Fault Class Prediction
In this section, we recall the density functions, the log-likelihood function, and the E and M steps for Gaussian Mixture Modeling [13]. Here, we consider a D-dimensional continuous random vector $X \in \mathbb{R}^d$. Following the reference (equations 3 to 9), the probability density function for a mixture model, which is a linear combination of M Gaussian component densities, is defined as

$$p(x) = \sum_{j=1}^{M} P(j)\, p(x \mid j) \tag{3}$$

where P(j) is the mixture proportion, which is non-negative and must sum to one over the components. The Gaussian components are defined by their centers c_j and their covariance matrices \Sigma_j [13]:

$$p(x \mid j) = (2\pi)^{-d/2}\, |\Sigma_j|^{-1/2} \exp\left(-\tfrac{1}{2}(x - c_j)^{T}\, \Sigma_j^{-1}\, (x - c_j)\right) \tag{4}$$
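
To make the mixture density in equations (3) and (4) concrete, the sketch below evaluates it for the one-dimensional case (d = 1), where the covariance matrix reduces to a scalar variance. It is a hedged Python illustration, not part of the study's R code; the two-component parameters are hypothetical.

```python
import math


def gaussian_pdf(x, center, variance):
    """Equation (4) for d = 1: univariate Gaussian component density."""
    return math.exp(-0.5 * (x - center) ** 2 / variance) / math.sqrt(2 * math.pi * variance)


def mixture_pdf(x, proportions, centers, variances):
    """Equation (3): proportion-weighted sum of the component densities."""
    return sum(
        p * gaussian_pdf(x, c, v)
        for p, c, v in zip(proportions, centers, variances)
    )


# Hypothetical two-component mixture; the proportions must sum to one.
proportions = [0.7, 0.3]
centers = [0.0, 5.0]
variances = [1.0, 0.25]

# Riemann-sum check over [-10, 10] that the mixture integrates to about 1.
step = 0.01
area = sum(
    mixture_pdf(-10 + k * step, proportions, centers, variances) * step
    for k in range(2001)
)
```

Because the proportions sum to one and each component is a valid density, the numerical area comes out very close to 1, and the density has two peaks, which is the multimodal shape the text describes.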
Based on the mixture model above (equations 3 and 4), we can define the log-likelihood function for N observations x_1, ..., x_N as [13]

$$\mathcal{L}(\Theta) = \log \prod_{n=1}^{N} p(x_n) \tag{5}$$

where \Theta denotes the model parameters P(j), c_j and \Sigma_j. Using the Expectation-Maximization approach, the maximum likelihood estimate of \Theta can be obtained iteratively. The Expectation or E-step computes the expected value of some unobserved data using the current parameter estimates and the observed data. The Maximization or M-step uses the expected values from the E-step to compute the maximum likelihood estimates, after which the model parameters are updated. At a given iteration step t,

E-step: [13]
$$P^{(t)}(j \mid x_n) = \frac{p^{(t)}(x_n \mid j)\, P^{(t)}(j)}{p^{(t)}(x_n)} \tag{6}$$

M-step: [13]
$$c_j^{(t+1)} = \frac{\sum_{n=1}^{N} P^{(t)}(j \mid x_n)\, x_n}{\sum_{n=1}^{N} P^{(t)}(j \mid x_n)} \tag{7}$$

$$\Sigma_j^{(t+1)} = \frac{\sum_{n=1}^{N} P^{(t)}(j \mid x_n)\, (x_n - c_j^{(t+1)})(x_n - c_j^{(t+1)})^{T}}{\sum_{n=1}^{N} P^{(t)}(j \mid x_n)} \tag{8}$$

$$P^{(t+1)}(j) = \frac{1}{N} \sum_{n=1}^{N} P^{(t)}(j \mid x_n) \tag{9}$$

Figure 5. Approach used in fault diagnosis using unsupervised learning. (Blocks: Raw Features, PCA Analysis, GMM, Classification Data.)

The process of implementation is shown in Figure 5. The raw data features are passed to a PCA function to obtain the optimal number of principal components. These components are passed to the densityMclust function [29], [30] from the mclust package to obtain the classification data.
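
The E-step and M-step updates can be illustrated with a small, self-contained Python sketch for one-dimensional data. It is not the mclust implementation used in the study: the starting centers, the fixed iteration count, the unit starting variances, and the variance floor are all assumptions of this example, and the toy data is hypothetical.

```python
import math


def gaussian_pdf(x, center, variance):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (x - center) ** 2 / variance) / math.sqrt(2 * math.pi * variance)


def em_step(data, proportions, centers, variances):
    """One EM iteration for a univariate Gaussian mixture."""
    m = len(proportions)
    # E-step: posterior probability of each component for every point.
    posteriors = []
    for x in data:
        joint = [proportions[j] * gaussian_pdf(x, centers[j], variances[j]) for j in range(m)]
        total = sum(joint)
        posteriors.append([w / total for w in joint])
    # M-step: re-estimate centers, variances, and mixture proportions.
    new_p, new_c, new_v = [], [], []
    for j in range(m):
        weight = sum(post[j] for post in posteriors)
        center = sum(post[j] * x for post, x in zip(posteriors, data)) / weight
        variance = sum(post[j] * (x - center) ** 2 for post, x in zip(posteriors, data)) / weight
        new_c.append(center)
        new_v.append(max(variance, 1e-6))  # floor guards against a collapsing component
        new_p.append(weight / len(data))
    return new_p, new_c, new_v


def fit_gmm(data, centers, iterations=50):
    """Run a fixed number of EM steps from user-supplied starting centers."""
    m = len(centers)
    proportions = [1.0 / m] * m
    variances = [1.0] * m
    for _ in range(iterations):
        proportions, centers, variances = em_step(data, proportions, centers, variances)
    return proportions, centers, variances


# Hypothetical, well-separated toy data: 6 points near 0 and 4 points near 10.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 0.0, 9.8, 9.9, 10.1, 10.2]
proportions, centers, variances = fit_gmm(data, centers=[1.0, 9.0])
```

On this toy data the fitted centers settle near 0 and 10 with proportions near 0.6 and 0.4. Real vibration features would be multivariate and would need covariance matrices and a model-selection step for the number of components, which this sketch leaves out.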
Figure 6. Fault type detection using density estimation via GMM.
Figure 6 shows the results of the Gaussian Mixture Model clustering fitted by Expectation-Maximization (GMM-EM model), grouped by date on the x-axis and cluster number on the y-axis. The clusters are color-coded, and the size of each marker represents the number of data points assigned to that cluster on that day: the larger the marker, the more often that state was observed, and the smaller the marker, the fewer times it was observed. In our model, we found a total of 6 states. Without any previous knowledge of the data, it is difficult to interpret the results. But when we compare them with the T2 statistic, we observe the following:
1. From July 7 to August 25, clusters 4, 3 and 2 had the major state occurrence. From the T2 statistic we can hypothesize that this could be the normal state.
2. From August 26 to August 30, cluster 1 had the major state occurrence. We can observe the deviation from the normal state. Since in unsupervised learning we do not have class labels from historical data, we can only hypothesize this as a failed state. Upon investigation by the maintenance crew, it was found that the shaft housing had been displaced by two inches.
3. From August 31 to September 19, cluster 5 had significant occurrence. As mentioned in the above section, this was the period when the old fan was replaced with a new fan. We can confirm this change here.
4. Quite interestingly, there was yet another cluster formed. Cluster 6 occurred mainly on September 3 and September 10. The machine was completely shut down on September 3 and for a shift on September 10. These states could represent the machine shutdown state.
I. K-means Algorithm
The K-means algorithm is one of the most commonly used algorithms for fault detection and fault class prediction in unsupervised learning. It uses Euclidean distance to form the clusters, and the number of clusters is determined using the silhouette method. K-means clustering is an unsupervised learning procedure; the method can be applied directly to the measured vibration data, and thus the need for a training process to identify the faulty condition is eliminated [11].
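
The combination described here, Euclidean K-means with the number of clusters chosen by silhouette, can be sketched in plain Python as below. This is an illustration under stated assumptions, not the NbClust-based R workflow used in the study: the deterministic farthest-point seeding, the function names, and the six toy points are all hypothetical choices for the example.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm with Euclidean distance and farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Deterministic seeding: take the point farthest from the chosen centers.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def mean_silhouette(points, labels):
    """Average silhouette width; higher means better-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) == 1:
            continue  # silhouette is taken as 0 for singleton clusters
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical 2-D feature points forming two obvious groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
scores = {k: mean_silhouette(points, kmeans(points, k)[0]) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
```

For these toy points the silhouette score is highest at k = 2, so two clusters would be selected. Production code would normally use random restarts or k-means++ seeding in place of the deterministic seeding used here.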
Figure 7. K-Means clustering results grouped based on date.
Figure 7 is the final result obtained from the raw extracted features using the K-means algorithm. There were two ideal clusters based on the silhouette method. In the results, we can positively identify the faulty state as cluster 1, shown in purple. However, this algorithm fails to detect the replaced fan.

IV. ANALYSIS AND RESULTS
The Hotelling T2 statistic was used for detecting the faults. In this phase, we were able to detect the normal or healthy condition as well as the abnormal condition based on the test statistic and the statistical limit at the 99.9% confidence level. Based on the results, the test statistic was found to be 82.96% accurate in predicting the faults. The accuracy was calculated from the information in Table 2 using Equation 10.

TABLE II. CONFUSION MATRIX FOR FAULT DETECTION USING HOTELLING T2 STATISTIC
            Predicted
Actual      Positive    Negative
Positive    A = 1215    B = 993
Negative    C = 1113    D = 9041

$$\text{Accuracy of } T^2 = \frac{A + D}{A + B + C + D} = \frac{1215 + 9041}{1215 + 993 + 1113 + 9041} = 82.96\% \tag{10}$$

Based on the results of the fault detection phase, we were able to detect three states in the data: a healthy state, a faulty state, and a reset state (new fan state). To predict the classes accurately, two unsupervised algorithms were used: K-means clustering and Gaussian Mixture Model clustering. The results of both algorithms are shown in Table 3. Based on the results, the GMM method was able to predict all the states within the data, along with three redundant healthy states. The K-means algorithm, on the other hand, was able to detect the healthy and faulty states, but it failed to identify the new fan state and the machine shutdown state.

TABLE III. MACHINE STATE DETECTION COMPARISON BETWEEN ALGORITHMS
                       Number of states
State                  GMM-EM    K-Means
Healthy or Normal      3         1
Shaft Housing Fault    1         1
New fan replacement    1         0
Machine shutdown       1         0

We were able to use our expert judgment to identify and name each cluster and to create a confusion matrix, shown in Table 4, to estimate the accuracy of density estimation via GMM. From the results, we estimated an accuracy of about 90.01% in predicting the states, using the confusion matrix from Table 4 and Equation 11.

TABLE IV. CONFUSION MATRIX FOR FAULT TYPE PREDICTION USING DENSITY ESTIMATION VIA GMM
          Predicted
Actual    Positive    Negative
True      934         142
False     1051        10340
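
As a check on the arithmetic in equation (10), the accuracy of the Hotelling T2 fault detection can be recomputed from the four counts in Table II. The sketch below is a small Python illustration; the function name `accuracy` is an assumption of this example.

```python
def accuracy(a, b, c, d):
    """Share of observations on the diagonal of a 2x2 confusion matrix."""
    return (a + d) / (a + b + c + d)


# Counts from Table II (Hotelling T2 fault detection).
t2_accuracy = accuracy(a=1215, b=993, c=1113, d=9041)
print(f"{t2_accuracy:.2%}")  # prints 82.96%
```

The result, 10256 correct out of 12362 observations, reproduces the 82.96% reported in the text.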
$$\text{Accuracy of GMM} = \frac{934 + 10340}{934 + 142 + 1051 + 10340} = 90.01\% \tag{11}$$

V. CONCLUSION
In recent years, the wide availability of secure wireless sensors has made it a more viable option to monitor machines regularly and continuously [33]. The core objective of this paper was to propose, develop and rapidly implement a predictive maintenance methodology using unsupervised learning, with a furnace fan as a case study. With this implementation, we were able to predict faults with 82.96% accuracy and determine the different states of the machine based on domain knowledge. To predict the state of the machine, GMM and K-means algorithms were used. Based on the analysis and results, we found that the GMM methodology predicts the fault states more accurately than the K-means algorithm. In conclusion, in this research we were able to rapidly implement a predictive maintenance activity using an unsupervised learning methodology, with no historical data for the fault class prediction algorithms and minimal historical data for the fault detection algorithms.

VI. FUTURE SCOPE OF WORK
The current work could be extended to different physics-based data such as temperature, pressure, humidity, acoustics, voltage, and current across different domains. This would also help in determining the correlation between the need for dimensionality reduction algorithms such as PCA and the accuracy of clustering algorithms.

REFERENCES
[1] R. K. Mobley, An Introduction to Predictive Maintenance, 2nd ed., 2002.
[2] I. Guyon, "Design of experiments of the NIPS 2003 variable selection benchmark," 2003.
[3] Y. Peng, M. Dong, and M. J. Zuo, "Current status of machine prognostics in condition-based maintenance: a review," The International Journal of Advanced Manufacturing Technology, vol. 50, no. 1-4, pp. 297-313, 2010.
[4] K. Javed, R. Gouriveau, N. Zerhouni, and P. Nectoux, "Enabling Health Monitoring Approach Based on Vibration Data for Accurate Prognostics," IEEE Transactions on Industrial Electronics, vol. 62, no. 1, pp. 647-656, 2015.
[5] E. Gundogar, A. Yilmaz, and B. Erkayman, "A solution approach to a synchronization problem in a JIT production system," Production Planning and Control, vol. 25, 2014.
[6] R. C. Walleigh, "What's Your Excuse for Not Using JIT?," Harvard Business Review, 1983.
[7] N. Amruthnath and T. Gupta, "Modified Rank Order Clustering Algorithm Approach by Including Manufacturing Data," in 4th IFAC International Conference on Intelligent Control and Automation Sciences, Reims, 2016.
[8] A. Prajapati, J. Bechtel, and S. Ganesan, "Condition based maintenance: a survey," Journal of Quality in Maintenance Engineering, vol. 18, no. 4, pp. 384-400, 2012.
[9] B. Das and J. Reddy, "Fuzzy-logic-based fault classification scheme for digital distance protection," IEEE Transactions on Power Delivery, vol. 20, no. 2, 2005.
[10] B. H. Chowdhury and K. Wang, "Fault Classification Using Kohonen Feature Mapping," in Proceedings of the International Conference on Intelligent Systems Applications to Power Systems, 1996.
[11] C. T. Yiakopoulos, K. C. Gryllias, and I. A. Antoniadis, "Rolling element bearing fault detection in industrial environments based on a K-means clustering approach," Expert Systems with Applications, vol. 38, no. 3, pp. 2888-2911, 2011.
[12] Ç. Ari and S. Aksoy, "Unsupervised classification of remotely sensed images using Gaussian mixture models and particle swarm optimization," in Geoscience and Remote Sensing Symposium (IGARSS), 2010.
[13] C. Archambeau and M. Verleysen, "Fully Nonparametric Probability Density Function Estimation with Finite Gaussian Mixture Models," in ICAPR'2003 proceedings - 5th International Conference on Advances in Pattern Recognition, Calcutta, 2003.
[14] C. Biernacki, G. Celeux, and G. Govaert, "Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models," Computational Statistics & Data Analysis, vol. 41, no. 3-4, pp. 561-575, 2003.
[15] F. Zhou, Y. Gao, and C. Wen, "A Novel Multimode Fault Classification Method Based on Deep Learning," Journal of Control Science and Engineering, vol. 2017, 2017.
[16] G. G. Yen and K.-C. Lin, "Wavelet packet feature extraction for vibration monitoring," IEEE Transactions on Industrial Electronics, vol. 47, no. 3, pp. 650-667, 2000.
[17] H. Benitez-Perez, F. Garcia-Nocetti, and H. Thompson, "Fault classification SOM and PCA for inertial sensor drift," in IEEE International Workshop on Intelligent Signal Processing, 2005.
[18] H. Li and D. Y. Xiao, "Survey on data-driven fault classification methods," Control and Decision, vol. 26, no. 1, pp. 1-16, 2011.
[19] H. Hotelling, "Analysis of a complex of statistical variables into principal components," Journal of Educational Psychology, vol. 24, pp. 417-441, 1933.
[20] J.-H. Cho, J.-M. Lee, S. Wook, D.-w. Choi, and I.-B. L. Lee, "Fault identification for process monitoring using kernel principal component analysis," Chemical Engineering Science, vol. 60, no. 1, pp. 279-288, 2005.
[21] J.-H. Shin and H.-B. Jun, "On condition based maintenance policy," Journal of Computational Design and Engineering, vol. 2, no. 2, pp. 119-127, 2015.
[22] M. Fujimoto and Y. Riki, "Robust speech recognition in additive and channel noise environments using GMM and EM algorithm," in IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004.
[23] D. Reynolds, "Gaussian Mixture Models," in Encyclopedia of Biometrics, pp. 659-663, 2009.
[24] S. M. Zhang, F. L. Wang, S. Tan, and S. Wang, "A fully automatic online mode identification method for multi-mode processes," Acta Automatica Sinica, vol. 42, no. 1, pp. 60-80, 2016.
[25] S. Wegerich, A. Wilks, and R. Pipke, "Nonparametric modeling of vibration signal feature for equipment health monitoring," in Aerospace Conference, 2003.
[26] S. Memon, M. Lech, and N. Maddage, "Speaker Verification Based on Different Vector Quantization Techniques with Gaussian Mixture Models," in Third International Conference on Network and System Security, 2009.
[27] S. M, C. P. and B. T., "Supervised and unsupervised learning process in damage classification of rolling element bearings," Diagnostyka, vol. 17, no. 2, pp. 71-80, 2016.
[28] W. Ku, R. H. Storer, and C. Georgakis, "Disturbance detection and isolation by dynamic principal component analysis," Chemometrics and Intelligent Laboratory Systems, vol. 30, no. 1, pp. 179-196, 1995.
[29] C. Fraley and A. E. Raftery, "Model-based Clustering, Discriminant Analysis and Density Estimation," Journal of the American Statistical Association, vol. 97, pp. 611-631, 2002.
[30] C. Fraley, A. E. Raftery, T. B. Murphy, and L. Scrucca, "mclust Version 4 for R: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation," Technical Report No. 597, Department of Statistics, University of Washington, 2012.
[31] M. Charrad, N. Ghazzali, V. Boiteau, and A. Niknafs, "NbClust: An R Package for Determining the Relevant Number of Clusters in a Data Set," Journal of Statistical Software, vol. 61, no. 6, pp. 1-36, 2014.
[32] N. Amruthnath and T. Gupta, "A Research Study on Unsupervised Machine Learning Algorithms for Early Fault Detection in Predictive Maintenance," in 5th International Conference on Industrial Engineering and Applications, Singapore, 2018.
[33] N. Amruthnath and P. M. Prathibhavani, "Data Security in Wireless Sensor Network using Multipath Randomized Dispersive Routes," International Journal of Scientific & Engineering Research, vol. 5, no. 1, 2014.
