Universidade Federal de Santa Catarina
Rafael Hoffmann Fallgatter
Florianópolis
2019
This Bachelor’s Thesis was considered appropriate for obtaining the Bachelor’s Degree in
Mechanical Engineering and was approved in its final form by the Undergraduate Course of
Mechanical Engineering
____________________________________
Prof. Carlos Enrique Niño Bohórquez, PhD. Eng.
Course Coordinator
Examining Committee:
____________________________________
Prof. Rodolfo César Costa Flesch, PhD. Eng.
Supervisor
Universidade Federal de Santa Catarina
____________________________________
Ahryman Seixas Busse de Siqueira Nascimento, M.Sc.
Co-supervisor
Universidade Federal de Santa Catarina
____________________________________
Prof. Carlos Enrique Niño Bohórquez, PhD. Eng.
Universidade Federal de Santa Catarina
____________________________________
Bernardo Barancelli Schwedersky, M.Sc.
Universidade Federal de Santa Catarina
To my family and friends
ACKNOWLEDGEMENTS
I would firstly like to thank my parents and my stepfather for all the help and
support throughout my studies and my life. All my achievements are also theirs. Thanks
also to my siblings for putting up with me every single day and making my days more fun.
And thanks to the rest of my family for always standing by me.
I would also like to thank all my friends, those I have just met and those I have
known throughout my life. They are all very important to me. A special thanks to the
great people of PET-MA; you have had a huge influence on who I am today. The work
at PET-MA was my life during my undergraduate years.
Moreover, I would like to thank the teachers and other people I have met at
university, who have taught me so much.
Finally, I would like to warmly thank my supervisor, Rodolfo Flesch, as well
as Ahryman Nascimento, Bernardo Schwedersky and the other people of LABMETRO,
for their great support throughout the development of this project.
“I am enough of an artist to draw freely upon my imagination.
Imagination is more important than knowledge. Knowledge is limited.
Imagination encircles the world.”
(Albert Einstein)
ABSTRACT
Introduction
Generating cooling in a reliable and economical way is a challenge of great
importance to modern society. Refrigerators and air-conditioning systems are present in
many kinds of industrial, commercial and domestic applications.
One of the most important components of refrigeration systems is the
compressor. Maintenance of this equipment often requires knowledge of its suction and
discharge pressures. Obtaining them, however, requires inserting pressure sensors into
the system lines, which has a series of implications both economic and related to
measurement quality. There is therefore an interest in performing this measurement
non-intrusively.
This work uses the technique known as virtual sensing, in which one quantity
is measured and correlated with the quantity of interest. The vibration of the compressor
shell was chosen as the measured quantity, since there is a significant relation between
it and the suction and discharge pressures. This relation is highly complex, however,
making an analytical model unfeasible. This work therefore aims to develop a Machine
Learning algorithm capable of making predictions from examples obtained through
experiments.
Objectives
The overall objective of this project is the development of an ensemble Machine
Learning model, based on Decision Trees, capable of predicting the suction and discharge
pressures of a domestic reciprocating compressor. The specific objectives are the
following:
- to generate an optimized set of features from vibration data;
- to compare different Machine Learning algorithms based on Decision Trees;
- to optimize the parameters of the selected models;
- to propose practical applications for the developed model.
Methodology
This work used vibration data obtained on a test bench of the Laboratory of
Metrology (LABMETRO) of the Universidade Federal de Santa Catarina (UFSC). Tests
were run on 5 compressors of the same model, measuring the vibration along 3
perpendicular axes during operation. A total of 11 suction pressure values, 11 discharge
pressure values and 3 compressor rotational speeds were used, giving 363 possible
combinations. The experiment was repeated three times. Additionally, 6 intermediate
values of suction and discharge pressure and 2 rotational speeds were used to generate a
test dataset.
The data from the first four compressors are used to train the algorithms, forming
the so-called Training Set. The data obtained from these same compressors, but at the
intermediate values, form the Test Set, used to test different methods and optimize the
model parameters. Finally, the data from the fifth compressor are used to evaluate the
final performance of the algorithms, constituting the Validation Set.
In this work, the algorithm was built to predict the evaporation and condensation
temperatures of the fluid, which are representative of the suction and discharge pressures,
respectively. The reason is that these temperatures are the quantities used in the
development of refrigeration systems, since, unlike the pressures, they are independent of
the refrigerant being used.
To generate the features used as model inputs, a Fast Fourier Transform is applied
and the spectral energy is computed within linearly spaced frequency bands. Different
values of maximum frequency, band width and overlap are tested to find those that give
the best results. Additionally, two Dimensionality Reduction methods are applied:
Principal Components Analysis (PCA) and Feature Importance, with different parameters
tested for each. The Feature Importance method is also used to analyze which
measurement axes and frequencies are most important for predicting each of the
pressures.
The resulting dataset is then used to train and compare the performance of 4
algorithms: Multiple Linear Regression, Decision Tree, Random Forest and LS Boost,
the last two being ensemble methods based on Decision Trees. A modified version of LS
Boost, with random sampling of the variables used at each node, is also applied. A
Bayesian Optimization algorithm is then used to select the hyperparameters of the
ensemble algorithms using the Training Set.
Finally, the final models are evaluated on the Validation Set and the sensitivity of
their performance to different pressures, rotational speeds and measurement axes is
tested. An analysis is also made of how this knowledge can be used in the development
of commercial systems.
Results and discussion
The final model with the best performance, for both the evaporation and the
condensation temperature, was LS Boost with random sampling of variables at the nodes,
without Dimensionality Reduction. This model achieved a Mean Absolute Error (MAE)
of 1.22 °C for the evaporation temperature and 2.95 °C for the condensation temperature.
This performance, however, is highly dependent on the pressure conditions and the
rotational speed of the compressor.
Finally, it was shown that the sensor located near the discharge pipe has little
influence on the prediction of the evaporation temperature. For the prediction of the
condensation temperature, it is the vertically aligned sensor that contributes least to
model quality.
Final considerations
It was shown that it is possible to develop a Machine Learning algorithm for
predicting the suction and discharge pressures of a compressor from vibration data. The
models and data generated in this work, as well as the knowledge obtained, can be used
in the future to develop commercial systems for the non-invasive measurement of the
suction and discharge pressures of compressors.
HT – Hilbert Transform
IBGE – Brazilian Institute of Geography and Statistics (from the Portuguese Instituto Brasileiro de
Geografia e Estatística)
LVA – Laboratory of Vibration and Acoustics (from the Portuguese Laboratório de Vibração e
Acústica)
RP – Recursive Partitioning
UFSC – Federal University of Santa Catarina (from the Portuguese Universidade Federal de Santa
Catarina)
WT – Wavelet Transform
LIST OF SYMBOLS
E – Young’s modulus
L – loss function
m – mass of the tube per unit length, used to calculate its natural frequencies
P – pressure
P_d – discharge pressure
P_s – suction pressure
TABLE OF CONTENTS
1 INTRODUCTION
2 THEORETICAL FOUNDATIONS
4 DEVELOPMENT
6 CONCLUSION
REFERENCES
1 INTRODUCTION
1.1 CONTEXTUALIZATION
is altered when changes in its working conditions occur. It would then just be a matter of relating
those changes in vibration to the values of the pressures.
Soedel (2008) presents strategies for analytically calculating the vibration
characteristics of compressors. Those methods are, however, highly complex and only approximate,
since the relation between the pressure and the vibration is extremely intricate. As an alternative,
this project presents a method of making this estimation using Machine Learning techniques.
The method of measuring a physical property of a system and using mathematical
models to estimate other correlated properties from it is known in the literature as soft sensing
or virtual sensing (LIU et al., 2009). When this relationship can be expressed mathematically,
an analytical model may be used. However, when this relationship is too complex,
empirical models, such as Artificial Neural Networks (ANN), must be applied (LIN et al.,
2007). According to recent surveys, soft sensing is being applied successfully in a wide variety
of fields (KADLEC et al., 2009; QIN et al., 2012; YIN et al., 2015).
The application of soft sensing has already been explored by researchers at the
Laboratory of Metrology (LABMETRO) of UFSC, where the author is an intern, mainly by
applying ANNs to procedures for the performance analysis of the compressors
manufactured by the company that funded this project (LIMA, 2010; PENZ, 2011; CORAL,
2014; PACHECO, 2015; NASCIMENTO, 2015).
Walendowsky (2017) developed a method to estimate the profile of variation of the
angular velocity during one rotation cycle of a compressor and used it, together with the
average current, as input to two ANNs to predict the suction and discharge pressures of
compressors. Similar projects applied to refrigerators and medium-sized air-conditioning
systems can also be found in the literature (PARIS et al., 2014; SCHANTZ, 2011; SCHANTZ,
LEEB, 2017).
However, a soft-sensing model that makes use of vibration data is new to the
laboratory, and nothing could be found in the literature about its use for predicting
compressor pressures. Most existing works focus on using vibration for the prediction of
compressor failures.
Yang (2005) compared three techniques for defect identification in
reciprocating compressors: Self-Organizing Maps (SOM), Learning Vector Quantization
(LVQ) and Support Vector Machines (SVM). The input features of the models were obtained
through a Wavelet Transform (WT) of the vibration signal and the calculation of statistical
values of the filtered signal.
Tran (2013) developed a method for identifying defects in reciprocating
compressors using vibration, pressure and current data. It calculates the signal envelope by the
Teager–Kaiser Energy Operator (TKEO) method and performs filtering by WT. As classification
algorithms, three methods were compared: Deep Belief Network (DBN), SVM and Back-
Propagation Neural Network (BPNN).
Because of the high demand for compressors with low noise emission, plenty of
research has been done to model their vibroacoustic behavior analytically, by the Finite
Element Method (FEM), and experimentally, aiming mainly to obtain information for the design
of quieter compressors. Soedel (2008) presents an extensive study of analytical formulations of
the vibroacoustic behavior of reciprocating compressors, including possible influences of the
suction and discharge pressures.
The Laboratory of Vibration and Acoustics (LVA) of UFSC also has a history of
projects in this research line, modelling specifically the compressors of the company that
funded this project (SANGOI, 1983; DIESEL, 2000; CARMO, 2001; DENCKER, 2002;
FULCO, 2014). The work by Fulco (2014) is especially relevant: it developed an analytical
model of a compressor similar to the one used in this work, taking into account the variation
of rotational speed due to the chamber pressure. Moreover, an FEM model was developed to
evaluate the transmission paths of vibroacoustic energy up to frequencies of 6300 Hz.
However, the focus of these projects was the design of compressors with lower vibration,
not the prediction of other properties from the levels of vibration.
1.3 OBJECTIVES
2 THEORETICAL FOUNDATIONS
Humankind has long been trying to develop smart systems capable of
simulating human reasoning. However, hard-coded approaches proved inefficient for
tackling problems of higher complexity. The solution was to develop systems with the ability to
acquire their own knowledge and extract patterns from raw data, techniques known
nowadays as Machine Learning. Although such algorithms have existed for many decades,
they have grown in importance only recently, with the fast expansion of available data and
computing power, as well as the evolution of statistical methods (GOODFELLOW, 2016;
LANTZ, 2015).
One category of such methods is known as ensemble Tree-based methods. Tree-based
algorithms are simple but powerful Machine Learning techniques that are widely used in Data
Mining problems (TAN, 2006) and are well suited to domains with large numbers of
variables and cases (TORGO, 1999). Moreover, these techniques can analyze the
importance of each feature, which can be of great use for the selection of the features to be used.
These methods can be applied to both classification and regression. The simplest of
such algorithms is the Decision Tree, which alone has limited applications but, when combined
with other Decision Trees in an ensemble, can achieve great results (TAN, 2006).
Caruana (2006) made a large-scale empirical comparison of ten supervised
learning algorithms using eight performance criteria. Prior to calibration of the
hyperparameters, Bagged Trees, Random Forests and Neural Networks gave the best average
performance. After calibration using Platt’s Method, however, Boosted Trees moved into first
place.
The high accuracy of tree-based methods is evident, and thus this project focuses on
this kind of algorithm. More advanced techniques, such as deep learning, are not appropriate
for this case because of the relatively small size of the dataset and the large number of input
variables (GOODFELLOW, 2016).
Firstly, the theory of Multivariable Linear Regression, one of the simplest Machine
Learning methods, is presented. Next, Tree-based methods are reviewed, starting with an
explanation of the theory behind the Decision Tree. This method is the building block for the
next two algorithms, Random Forest and Gradient Boosting, which are then explained in
section 2.1.3.
Y = β₀ + β₁X₁ + β₂X₂ + ⋯ + βₚXₚ (1)
According to Tan (2005), a Decision Tree makes predictions by dividing the dataset
into subgroups depending on the features of each observation. Each of those divisions happens
at one of the tree nodes. The tree may have several levels of divisions until reaching what is
called the Terminal Nodes or Leaf Nodes.
If the problem being solved is a Classification, the Leaves will have classes and, in the
case of a Regression problem, it will have a value for the prediction of the dependent variable.
As the problem being tackled by this project is a Regression, focus is given to the Regression
Trees.
Figure 1 shows an example presented by Shalizi (2009), in which the tree predicts the
price of houses in the USA based on the latitude and longitude. For example, for a sample with
latitude lower than 38.485, longitude lower than -121.655 and latitude higher than 37.925, this
Decision Tree would make a prediction of 12.10.
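That traversal can be written as nested conditionals. The sketch below hard-codes only the three splits quoted above; the remaining branches of Shalizi's full tree are not given in the excerpt and are therefore stubbed out.

```python
import math

def predict_price(latitude: float, longitude: float) -> float:
    """Follow the three splits quoted in the text; other branches are stubs."""
    if latitude < 38.485:
        if longitude < -121.655:
            if latitude > 37.925:
                return 12.10  # leaf reached by the sample described in the text
            return math.nan  # branch not shown in the excerpt
        return math.nan      # branch not shown in the excerpt
    return math.nan          # branch not shown in the excerpt

print(predict_price(38.0, -122.0))  # -> 12.1
```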
According to Breiman (1984), the process of building (or training) a Regression Tree
is done through a Recursive Partitioning (RP) algorithm and the most widely used of such
techniques is known as CART, which simply stands for Classification and Regression Trees. It
builds Least Squares Regression Trees, or in other words, it tries to find the parameters which
minimize the Least Squares Error (LSE) criterion. It is the one that is used in this work.
At each node, the algorithm searches over all variables, and over all possible split
values of each variable, for the option that most reduces the LSE. It then uses this criterion
to create the rule for the split into further nodes (BREIMAN, 1984).
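This search at a single node can be sketched as follows; the function and variable names are illustrative, not CART's actual implementation, and the tiny dataset is invented.

```python
import numpy as np

def best_split(X: np.ndarray, y: np.ndarray):
    """Exhaustive CART-style search: for every feature and every candidate
    threshold, compute the summed squared error (SSE) of the two children
    and keep the split that reduces it the most."""
    best = (None, None, np.inf)  # (feature index, threshold, child SSE)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j])[:-1]:  # split between observed values
            left, right = y[X[:, j] <= thr], y[X[:, j] > thr]
            sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if sse < best[2]:
                best = (j, thr, sse)
    return best

# Tiny example: feature 0 separates y perfectly, feature 1 is noise.
X = np.array([[1.0, 5.0], [2.0, 3.0], [10.0, 4.0], [11.0, 6.0]])
y = np.array([1.0, 1.0, 9.0, 9.0])
j, thr, sse = best_split(X, y)
print(j, thr, sse)  # feature 0, threshold 2.0, SSE 0.0
```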
The main problem when defining the parameters of a decision tree is deciding when
to stop splitting. Torgo (1999) discusses that the more a tree is grown, the more unreliable it
gets, because each division reduces the number of samples. As the reduction of error is
evaluated on the data of the training set, this leads to an overfit of the model.
According to Tan (2006), the process of simplifying this model is called pruning, and
there are basically two methods of doing it. In the first, known as pre-pruning, stricter
conditions are imposed so that the tree is not allowed to grow past a certain size. In addition
to the condition that an additional split must decrease the MSE, the maximum number of
splits of a tree and the minimum number of observations at each Leaf can also be defined.
The second method, known as post-pruning, consists of growing a tree to its
maximum depth, followed by a process of eliminating the splits that give the least reduction
of MSE, until a maximum of accuracy is reached. According to Tan (2006), this method
normally yields better results than pre-pruning, because the latter can miss important splits
by stopping too early.
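Both strategies can be sketched with scikit-learn, used here purely for illustration (the parameter names `max_depth`, `min_samples_leaf` and `ccp_alpha` are that library's, not terms from the text): explicit growth limits give pre-pruning, while cost-complexity pruning cuts a fully grown tree back.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=200)  # noisy target

# Pre-pruning: stop growth early with explicit limits.
pre = DecisionTreeRegressor(max_depth=4, min_samples_leaf=10,
                            random_state=0).fit(X, y)

# Post-pruning: grow fully, then cut back via cost-complexity pruning.
post = DecisionTreeRegressor(ccp_alpha=0.01, random_state=0).fit(X, y)

# Unpruned reference: grows until every noisy sample gets its own leaf.
full = DecisionTreeRegressor(random_state=0).fit(X, y)
print(full.get_n_leaves(), pre.get_n_leaves(), post.get_n_leaves())
```

Both pruned trees end up far smaller than the unpruned one, which essentially memorizes the noise.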
Even though Decision Trees have a series of advantages, they also have a high
instability that may degrade their quality, as a small change in the training set may lead to a
different choice when building a node. Moreover, their predictions in regression are highly
non-smooth, and their ability to model complex problems is restricted (TORGO, 1999).
There are several techniques to overcome these problems. One solution is the use of
ensemble methods, which train a series of Decision Trees and make an estimation by
combining the outputs of those trees (TAN, 2006).
The error of a predictor is composed of two competing properties: bias and variance.
According to Lantz (2015), variance refers to the amount by which the predictions would
change if the model were trained with a different training set, whereas bias refers to the error
introduced by approximating a real-life problem (which may be extremely complicated) by a
much simpler model. Normally, when more flexible methods are used, bias decreases (as the
model is able to better describe the phenomena) but variance increases.
A good way to understand these phenomena is by analyzing how polynomial curves
of different degrees fit a dataset, as represented by the graphs in figure 2, extracted from
Lantz (2015). On the left, certain experimental points are represented, as well as three models
that try to fit them: a linear regression (orange) and two smoothing spline fits (blue and green).
The graph on the right shows the MSE for the test set (red curve) and training set (gray curve)
for all models.
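The behavior in the figure can be reproduced numerically with synthetic data (the dataset below is invented for illustration, not the one from the figure): polynomials of degree 1, 4 and 15 play the roles of the underfit, well-fit and overfit models.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    """Noisy samples of a sine curve, the 'true' underlying phenomenon."""
    x = rng.uniform(0, 1, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)

x_tr, y_tr = make_data(30)    # small training set
x_te, y_te = make_data(200)   # independent test set

results = {}
for degree in (1, 4, 15):
    coeffs = np.polyfit(x_tr, y_tr, degree)  # least-squares polynomial fit
    mse_tr = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    mse_te = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
    results[degree] = (mse_tr, mse_te)
    print(f"degree {degree:2d}: train MSE {mse_tr:.3f}, test MSE {mse_te:.3f}")
```

Training error keeps falling as the degree grows, while test error falls and then rises again once the polynomial starts fitting the noise.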
The linear regression is too simple a model to represent the data, and thus it has a huge
bias and, consequently, a large MSE for both the training and the test set. When a more complex
model is used, such as the one represented by the blue line, the system can be better modeled
and the MSE falls. However, if even more flexibility is added to the model, as with the one
represented by the green line, it will represent the training set well, reducing the related MSE,
but will also lose its generalization capability, increasing its test set error. This is a case of
large variance. Ensemble methods work by trying to minimize the effect of these elements.
Tan (2006) states two conditions for an ensemble method to be better than
a single classifier: 1) the base classifiers must be independent of each other and 2) each
classifier must be better than a random choice. In practice, it is hard to have completely
independent classifiers, but there are methods to make them as uncorrelated as
possible. The first is the manipulation of the training set, a technique used by
Bagging and by Gradient Boosting. The second is the manipulation of the training features,
which is done by the Random Forest. Both of these methods work by reducing the variance of
the answer. Boosting, however, goes further: by iteratively improving the classifiers, it also
helps to reduce bias, at the expense of a risk of overfitting.
e = p̄ (1 − s²) / s² (2)
Breiman (2001) has also demonstrated that Random Forests are especially robust
against noise on the dependent variable (which can otherwise generate overfitting) and that they
can also deal well with datasets having a high number of weak, highly redundant variables. This
is the case of the present work: in order to predict the suction and discharge pressures of the
refrigeration fluid, the pressure values measured on a test bench are used as labels, and these
values are contaminated with noise, not representing the real pressure perfectly. Moreover, the
variables used as input come from FFT bands, so there may be a high number of independent
variables, depending on the number of bands used.
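This robustness can be illustrated with a small synthetic experiment (entirely invented data, mimicking the situation just described: a noisy label and many weak, redundant inputs), comparing a single tree against a forest.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 400
signal = rng.uniform(0, 1, n)
# 30 weak, highly redundant inputs: each is the same signal plus noise,
# loosely mimicking many overlapping FFT bands.
X = np.column_stack([signal + rng.normal(0, 0.5, n) for _ in range(30)])
y = signal + rng.normal(0, 0.1, n)  # noisy label, as with bench pressures

X_tr, X_te, y_tr, y_te = X[:300], X[300:], y[:300], y[300:]
tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

rmse = lambda m: np.sqrt(np.mean((m.predict(X_te) - y_te) ** 2))
print(f"single tree RMSE: {rmse(tree):.3f}, forest RMSE: {rmse(forest):.3f}")
```

The single tree chases the label noise and the redundant inputs, while averaging many decorrelated trees keeps the forest's test error noticeably lower.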
2.1.3.2. LS Boost
LS Boost stands for Least Squares Boost and is a version of Gradient Boosting used
for regression problems. This method was presented by Friedman (2001) and also uses several
trees to make a prediction. However, instead of using independent predictors like the Random
Forest, LS Boost builds each tree based on the previous one, improving the accuracy of the
model at each iteration. It starts by initializing the model to a constant value, as represented
by equation 3.

F₀(x) = arg minᵧ ∑ᵢ L(yᵢ, γ) (3)
Where F₀ is the function generated by the model, L is the loss function, yᵢ is the value
of the dependent variable for each observation, γ is the constant value to be found and i
represents each observation. According to this equation, F₀ is assigned the value of γ that
minimizes the sum of the loss function over all observations.
Since the Least Squares Error was chosen as the loss function, Friedman (2001)
has shown that the result is the mean value of the dependent variable over all observations.
Therefore, the first iteration of the model simply calculates the mean of all measured pressures
and uses this value as the prediction for all observations.
The next step is the calculation of the so-called pseudo-residuals for each observation,
as defined in equation 4.
ỹᵢ = −[∂L(yᵢ, F(xᵢ)) / ∂F(xᵢ)] |F(x) = Fₘ₋₁(x), for i = 1, …, n (4)
As the Least Squares Error is being used, the pseudo-residuals are the actual residuals
(i.e. the difference between the actual values and the predicted ones), multiplied by two.
However, according to Friedman (2001), this procedure tends to overfit the data, and
thus a regularization must be carried out. This is achieved by multiplying the response of the
base learners by a learning rate ν, which normally has a value around 0.1.
Therefore, the final form of the prediction of the model at iteration m is expressed in
equation 8, where hₘ is the regression tree fitted to the pseudo-residuals at that iteration.

Fₘ(x) = Fₘ₋₁(x) + ν hₘ(x) (8)
This procedure is repeated several times, adding new trees to the model, until a
stopping criterion defined by the user is reached, such as the number of iterations M.
The whole procedure is summarized in figure 3 (Source: Author).
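The loop described above can be condensed into a short from-scratch sketch (an illustration of the procedure, not Friedman's reference implementation). With squared loss, fitting each tree to the plain residuals is equivalent to fitting the pseudo-residuals, since the factor of two is absorbed by the learning rate.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def ls_boost(X, y, M=100, nu=0.1, max_depth=2):
    """LS Boost: F_0 = mean(y); each iteration fits a small tree to the
    residuals and adds nu times its prediction to the running model."""
    f0 = y.mean()                          # constant initial model (equation 3)
    pred = np.full_like(y, f0, dtype=float)
    trees = []
    for _ in range(M):
        residuals = y - pred               # pseudo-residuals for squared loss
        t = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        t.fit(X, residuals)
        pred += nu * t.predict(X)          # shrunken update (learning rate nu)
        trees.append(t)
    return f0, trees

def predict(f0, trees, X, nu=0.1):
    return f0 + nu * sum(t.predict(X) for t in trees)

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(300, 1))
y = np.sin(X[:, 0])
f0, trees = ls_boost(X, y)
train_rmse = np.sqrt(np.mean((predict(f0, trees, X) - y) ** 2))
print(train_rmse)  # small training RMSE after 100 boosting iterations
```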
Tests carried out by Caruana (2006) show that LS Boost generally achieves higher
accuracy than Random Forests after calibration, as it can model highly complex conditions
more closely, although this result depends heavily on the problem being modeled. The results
also show that the performance of Boosted Trees is highly dependent on the model
hyperparameters: if no tuning is carried out, the Random Forest tends to achieve a much better
result.
As further explained in section 4.1.2, the dataset used for the model is composed of
features extracted from the FFT spectrum by dividing it into constant-length bands. Each band
yields one feature, so as the size of each band is reduced in order to obtain more localized
information, the number of features increases as well. There will therefore probably be bands
that do not contribute to the modeling of the pressure. Moreover, there are bands in different
measurement axes, or even in the same axis, that carry the same information and are thus
redundant. Two methods are tested to overcome this problem: Dimensionality Reduction and
Feature Subset Selection.
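The band-energy extraction just described can be sketched as follows; the sampling rate, band width, overlap and maximum frequency below are illustrative assumptions, not the values used in the experiments.

```python
import numpy as np

def band_energies(signal, fs, band_hz=50.0, overlap=0.5, f_max=2000.0):
    """Compute the FFT power spectrum and sum its energy inside linearly
    spaced, possibly overlapping frequency bands; one feature per band."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    step = band_hz * (1.0 - overlap)           # band start spacing
    starts = np.arange(0.0, f_max - band_hz + 1e-9, step)
    return np.array([
        spectrum[(freqs >= f0) & (freqs < f0 + band_hz)].sum() for f0 in starts
    ])

fs = 10_000  # Hz, illustrative sampling rate
t = np.arange(0, 1, 1 / fs)
# Toy stand-in for a vibration signal: two sinusoidal components.
vib = np.sin(2 * np.pi * 120 * t) + 0.3 * np.sin(2 * np.pi * 730 * t)
feats = band_energies(vib, fs)
print(len(feats))  # 79 overlapping 50 Hz bands up to 2 kHz
```

Shrinking `band_hz` concentrates the information but multiplies the number of features, which is exactly the trade-off discussed above.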
As discussed by Tan (2006), there is a variety of benefits to the use of Dimensionality
Reduction. Firstly, many Machine Learning algorithms work better if the dimensionality is
lower, mainly because of the reduction of noise and the elimination of irrelevant features. It can
also lead to a more understandable model. Finally, the amount of time and memory required is
reduced.
Still according to Tan (2006), the term Dimensionality Reduction is often reserved for
those techniques that reduce the dimensionality of a dataset by creating new attributes that are
combinations of the old attributes. One such method is Principal Components Analysis
(PCA), which is the one used in this project.
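A minimal PCA sketch (using scikit-learn purely for illustration) on synthetic data with exactly the kind of redundancy discussed: many inputs generated from only a few underlying directions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
# 12 redundant attributes built from only 3 underlying directions, plus
# a small amount of noise.
X = base @ rng.normal(size=(3, 12)) + 0.01 * rng.normal(size=(200, 12))

pca = PCA(n_components=0.99)  # keep enough components for 99% of variance
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)  # (200, 3): three combinations of the old attributes
```

Passing a fraction to `n_components` lets PCA itself decide how many new attributes are needed to retain the requested share of the variance.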
Dimensionality may also be reduced by selecting new attributes that are a subset of
the old ones, a technique known as Feature Subset Selection or Feature Selection (TAN,
2006). The technique used in this project for the selection of features is the Feature
Importance calculated by Random Forest, which is further explained in section 2.2.2.
Datasets often contain features that are irrelevant or redundant, so the feature set could
be reduced simply by choosing only those features that carry information about the problem
being solved. The ideal approach would be to try all possible subsets of features as input to
the model being developed and choose the one that produces the best result. This method,
however, is impractical, as the number of trials increases exponentially with the number of
features (TAN, 2006). Therefore, other techniques must be used.
A solution is to use a Decision Tree to calculate the Feature Importance, which
represents how much each feature contributes to the final model accuracy. This kind of
technique, in which the feature selection occurs naturally as part of the Machine Learning
algorithm, is known as an Embedded Approach (TAN, 2006).
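A sketch of this embedded approach, with scikit-learn's `feature_importances_` standing in for the importance measure defined in section 2.2.2 (the data is invented: only two of ten inputs actually drive the target).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
# Only features 0 and 1 carry information about the target.
y = 3 * X[:, 0] + X[:, 1] + rng.normal(0, 0.1, 300)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
ranking = np.argsort(forest.feature_importances_)[::-1]
print(ranking[:2])  # the two informative features come out on top

# Keep only the most important features as a reduced input set.
X_selected = X[:, ranking[:2]]
```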
According to Breiman (2001, 2002), for the case of classification trees, the importance
of a feature Xⱼ is measured as the sum of the weighted impurity decreases p(t)Δi(sₜ, t) over all
nodes t in which Xⱼ is used, where p(t) is the proportion Nₜ/N (samples reaching t over the total
number of samples) and Δi is the reduction of impurity due to this specific split. For the case of
a regression tree, the reduction of impurity is simply replaced by the reduction of RMSE.
This process is presented in equation 9, where t represents a node, sₜ the split used at
node t, and v(sₜ) the variable used in split sₜ:

Imp(Xⱼ) = ∑ p(t)Δi(sₜ, t), summed over the nodes t where v(sₜ) = Xⱼ (9)
The same technique can be applied to Ensemble Methods (presented in section 2.1.3)
simply by averaging the importance across all trees T, as shown in equation 10 (BREIMAN,
2001, 2002), where N_T represents the total number of trees.

Imp(Xⱼ) = (1/N_T) ∑_T ∑ p(t)Δi(sₜ, t), with the inner sum over the nodes t of tree T where v(sₜ) = Xⱼ (10)
RMSE = √((1/n) ∑ᵢ (ŷᵢ − yᵢ)²) (11)
R² = 1 − ∑ᵢ (yᵢ − ŷᵢ)² / ∑ᵢ (yᵢ − ȳ)² (12)
For a better visualization of the quality of the final model, the Mean Absolute Error
(MAE) is also used, as it gives a more intuitive feeling of the model precision.
MAE = (1/n) ∑ᵢ |ŷᵢ − yᵢ| (13)
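The three metrics of equations 11 to 13 can be written directly; the toy prediction vectors below are illustrative.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error, equation 11."""
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

def r2(y_true, y_pred):
    """Coefficient of determination, equation 12."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean Absolute Error, equation 13."""
    return np.mean(np.abs(y_pred - y_true))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
print(round(rmse(y_true, y_pred), 4),
      round(r2(y_true, y_pred), 4),
      round(mae(y_true, y_pred), 4))  # 0.1581 0.98 0.15
```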
Machine Learning models often require careful tuning of their many hyperparameters.
However, this process is often done manually, based on experience, or with a brute-force
approach (SNOEK et al., 2012).
As an alternative, it is possible to use an automated procedure such as Bayesian
Optimization, a technique that has been shown to outperform other state-of-the-art global
optimization algorithms in a number of cases (SNOEK et al., 2012). This method has been used
in a series of cutting-edge Machine Learning projects, such as DeepMind’s AlphaGo (CHEN,
2018).
The version of the algorithm applied in this project creates a Gaussian Process function
to predict the response of the Machine Learning model to a set of hyperparameters, as described
by Snoek et al. (2012). The accuracy of this function improves as new sets of hyperparameters
are evaluated.
The algorithm must then choose the hyperparameters that minimize a predicted loss
function (the RMSE in the present work). There are several methods that can be used for
choosing the set of hyperparameters to be used at each iteration. The one used in this project
was the “expected-improvement-per-second-plus” method (MATHWORKS, 2019). It uses the
traditional Expected Improvement per Second algorithm (SNOEK et al., 2012), which
optimizes not only the accuracy but also the training time of the models tested, with a
modification to avoid over-exploiting an area, as proposed by Bull (2011).
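The loop just described can be condensed into a sketch using plain Expected Improvement rather than the per-second-plus variant; the one-dimensional loss function below is a made-up stand-in for a model's validation RMSE over a single hyperparameter.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def loss(h):
    """Invented stand-in for a model's validation RMSE vs a hyperparameter."""
    return (h - 0.3) ** 2 + 0.05 * np.sin(20 * h)

candidates = np.linspace(0, 1, 200).reshape(-1, 1)
H = np.array([[0.0], [0.5], [1.0]])        # initial hyperparameter evaluations
y = np.array([loss(h[0]) for h in H])

for _ in range(10):
    # Gaussian Process surrogate predicting the loss over the search space.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(H, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)  # Expected Improvement
    h_next = candidates[np.argmax(ei)]      # most promising point to try next
    H = np.vstack([H, h_next])
    y = np.append(y, loss(h_next[0]))

print(H[np.argmin(y)][0])  # hyperparameter with the lowest observed loss
```

Each iteration refits the surrogate and spends the next evaluation where improvement over the current best is most expected, balancing exploration (high sigma) and exploitation (low mu).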
This chapter begins with a general explanation about the compressor architecture and
its vibration characteristics. Following, it presents the experiments and data acquisition details.
It finishes with an explanation of the organization of the generated datasets.
The compressor has an induction electric motor that actuates a piston, which moves
in a reciprocating fashion inside a cylinder (A), similar to an internal combustion engine. When
this piston retracts, a region of low pressure is generated, making the fluid enter the cylinder.
The direction of movement of the piston is then inverted, and it starts exerting work
on the gas, compressing it. As a consequence, the pressure of the fluid increases until it
reaches a value that opens the discharge valve, injecting the pressurized fluid into the system.
The amount of pressure increase depends on the characteristics of the refrigeration
system and the speed of rotation of the compressor. By varying this speed, it is possible to
control the refrigeration capacity of the system.
In this specific architecture, the fluid at low pressure (suction pressure) is injected into
the whole cavity of the compressor through (C), entering the cylinder through an intake valve.
After being pressurized, the fluid exits the cylinder through a discharge valve and a pipe to the
rest of the refrigeration system (D). These valves are composed of flexible plates designed to
open at a specific pressure differential.
All moving parts of the compressor are assembled in the casing, which is mounted
over springs (F) in order to avoid transmitting their vibration to the shell (B). Apart from the
springs, the only physical connection of these parts to the shell is the discharge pipe, which is
also designed to reduce the transmitted vibration. The shell is composed of two parts (top and
bottom cap) which are welded together, forming a hermetic cavity. The only component present
in the upper part of the shell is the connection of the suction pipe.
The modelling of vibration and noise of compressors is a wide field of study and an
in-depth discussion of its vibration characteristics is beyond the scope of this project. Therefore,
focus is given only to the possible mechanisms of influence of the pressures over the vibration.
Soedel (2008) discusses that a variation in the conditions of the gas inside the shell
may change the gas resonances, as the speed of sound in the gas is modified. In addition, the
difference between the discharge and suction pressures may change the stiffness of the
discharge pipe. An increase in the pressure differential increases this stiffness, shifting the
natural frequencies of the pipe by a few hertz, which may be enough to cause resonance. This
can be visualized in equation 14, which gives the natural frequencies of a simplified, straight
discharge pipe, ignoring the effect of gas velocity (SOEDEL, 2008).
𝑓ₙ = (π n² / 4 l²) √(EI / m) √(1 + (𝑃_d − 𝑃_s) A l² / (E I π² n²))    (14)
where 𝑃_d is the discharge pressure, 𝑃_s is the suction pressure, A is the internal cross-sectional
area of the tube, l is the length of the tube, E is the Young's modulus, I is the area moment of
inertia, m is the mass of the tube per unit length and n = 1, 2, 3, …
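As a rough numerical illustration of equation 14, the sketch below evaluates the first natural frequency of a straight pipe for two pressure differentials. All geometric and material values are hypothetical and chosen only to show the order of magnitude of the shift; they do not correspond to the compressor studied here.

```python
import math

def pipe_natural_frequency(n, l, E, I, A, m, dp):
    """Natural frequency (Hz) of a straight pipe under a pressure
    differential dp = P_discharge - P_suction, following equation 14."""
    base = (math.pi * n**2) / (4.0 * l**2) * math.sqrt(E * I / m)
    correction = math.sqrt(1.0 + dp * A * l**2 / (E * I * math.pi**2 * n**2))
    return base * correction

# Hypothetical steel pipe: 3 mm outer and 2 mm inner diameter, 200 mm long.
E = 200e9                        # Young's modulus, Pa
d_o, d_i = 3e-3, 2e-3            # outer and inner diameters, m
I = math.pi * (d_o**4 - d_i**4) / 64.0          # area moment of inertia, m^4
A = math.pi * d_i**2 / 4.0                      # internal area, m^2
m = 7850.0 * math.pi * (d_o**2 - d_i**2) / 4.0  # mass per unit length, kg/m
l = 0.2                                         # pipe length, m

f_low = pipe_natural_frequency(1, l, E, I, A, m, dp=2e5)
f_high = pipe_natural_frequency(1, l, E, I, A, m, dp=12e5)
print(f_low, f_high)  # the larger pressure differential stiffens the pipe
```

With these invented values the shift is on the order of one hertz, consistent with the behavior described by Soedel (2008).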
There are two possible sources of vibration for the discharge pipe. The first is the
vibration of the compressor casing (cylinder, piston, etc.). The second occurs when the
discharge pipe has bends (which is the case for the compressor used in this study), in which
case the pressure pulses produced by the valves excite vibration at each bend. Because of
its slenderness, this pipe normally has several natural frequencies in the range of 100 Hz to
5000 Hz, which is the range of concern for typical household compressors. Therefore, the
occurrence of pipe resonances is almost inevitable, and they contribute considerably to the
shell vibration (SOEDEL, 2008). As the change in pressure shifts these natural frequencies,
an important relation between pressure and vibration is expected.
In the FEM simulation performed by Fulco (2008), the model received the measured
discharge pressure as input and calculated the vibration transmitted to the shell as output. Up
to 1000 Hz, the sound at the discharge pipe is caused basically by the displacement of the
casing, with virtually no influence of the pressure. However, above 2500 Hz the pressure starts
to have some influence, and at 6300 Hz there is an exceptionally high incidence of noise due
to the discharge pressure. According to the simulation, from 3150 Hz to 6300 Hz the intake
acoustic filter and the discharge pipe accounted for over 50% of the total sound power level.
Fulco (2008) also shows how the variation of pressure causes a cyclical variation of
the resisting force, which in turn causes a variation of the rotational speed and thus of the
vibration pattern of the casing. This effect is more pronounced in the lower part of the
vibration spectrum.
Walendowski (2017) used this fact to estimate the suction and discharge pressures of
compressors from measurements of electrical quantities of the motor. Figure 5 shows the
results of an experiment he conducted, in which the instantaneous torque was measured at
each angular position for 5 different pressure conditions. The effect that the pressures have
on the torque peak is quite evident.
This variation in torque also changes the vibration pattern of the casing, which is in
turn transmitted to the shell. Therefore, this is another way by which the pressure has some
influence over the shell vibration.
Figure 5 - Torque profile as a function of the angular position for 5 different pressure conditions
Soedel (2008) also discusses the effect of the valves on the compressor vibration and
acoustic noise. Due to the intermittent nature of piston compressors, the valves open and close
at a defined frequency, hitting the valve seat and causing pulses of pressure, which may
modulate the structure resonances.
Moreover, the valves may flutter, adding high-frequency components to the vibration.
However, the specific dynamics of the opening and closing acts more like a disturbance, and
those high-frequency components have little influence on the vibration when compared to the
frequency of rotation of the shaft.
Finally, gas pulsations of the suction manifold may cause a sloshing effect typically
around 200 Hz to 500 Hz. Turbulence can also have an influence, mostly in the high frequency
spectrum, around 3000 Hz to 6000 Hz (SOEDEL, 2008).
In order to train and test the Machine Learning models of this project, data collected
from a series of experiments carried out in a test rig built at the Laboratory of Metrology
(LABMETRO) of UFSC were used. This test rig is capable of controlling the pressure of
the refrigeration fluid at both intake and discharge during the operation of a compressor. A
schematic representation of the test rig is presented in figure 6.
Although the system measures and controls the suction and discharge pressures (𝑃_s
and 𝑃_d), these values were transformed into evaporation and condensing temperatures of the
refrigeration fluid (𝑇_e and 𝑇_c, respectively) before being used in the Machine Learning
models. The reason is that different refrigerants may be used depending on the system, which
means that the pressures may also vary. The design of refrigeration systems is therefore made
using the evaporation and condensing temperatures, which are the same regardless of the
refrigerant. It is thus convenient to use these temperatures as independent variables of the
model instead of the suction and discharge pressures, although they convey the same
information.
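The conversion from a measured pressure to the corresponding saturation temperature can be sketched as a table lookup with linear interpolation. The saturation table below is purely hypothetical and serves only to illustrate the idea; in practice, the actual saturation curve of the refrigerant in use would be employed.

```python
import bisect

# Hypothetical saturation table: (pressure in bar, saturation temperature
# in degrees Celsius). Values are illustrative only and do not correspond
# to any particular refrigerant.
SATURATION_TABLE = [
    (0.6, -35.0), (0.8, -30.0), (1.0, -25.0), (1.4, -18.0),
    (2.0, -10.0), (3.0, 0.0), (5.0, 14.0), (8.0, 30.0), (12.0, 45.0),
]

def saturation_temperature(pressure_bar):
    """Convert a measured pressure to the saturation temperature of the
    refrigerant by linear interpolation in the saturation table."""
    pressures = [p for p, _ in SATURATION_TABLE]
    if not pressures[0] <= pressure_bar <= pressures[-1]:
        raise ValueError("pressure outside table range")
    i = bisect.bisect_left(pressures, pressure_bar)
    if pressures[i] == pressure_bar:
        return SATURATION_TABLE[i][1]
    (p0, t0), (p1, t1) = SATURATION_TABLE[i - 1], SATURATION_TABLE[i]
    return t0 + (t1 - t0) * (pressure_bar - p0) / (p1 - p0)

print(saturation_temperature(1.0))   # exact table point
print(saturation_temperature(2.5))   # interpolated between table points
```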
The experiments used in this work considered five identical compressor samples. For
each compressor, a series of tests was made varying 𝑇_e and 𝑇_c, as well as the speed of
rotation. During these tests, the vibration of the compressor shell was measured using three
sensors at a sampling rate of 51.2 kHz, and the data were stored as vibratory velocity. The
sensors were mounted aligned with the X, Y and Z axes, as shown in figure 7.
The X-axis and Y-axis sensors were positioned on the bottom cap, whereas the Z-axis
sensor was positioned on the upper cap. The Z-axis is aligned with the axis of the springs of the
casing, the X-axis is aligned with the axis of the cylinder (which is horizontally oriented in this
design), and the Y-axis is perpendicular to the axis of the cylinder. The X-axis sensor is located
near the discharge pipe connection.
All the compressors were subjected to the same testing procedures. Two sets of
conditions were implemented. The first one (mapped measurements) used 3 levels of speed,
11 levels of 𝑇_e and 11 levels of 𝑇_c, leading to a total of 363 possible combinations. The
range of temperature values was defined using the application tables of the manufacturer. It
is worth noting that the 121 possible combinations of pressures were run by the system in a
random sequence, instead of a pre-defined one, in order to avoid systematic errors.
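The randomized test plan described above can be sketched as follows. The level values are placeholders, since the actual temperatures and speeds come from the manufacturer's application tables and are not reproduced here.

```python
import itertools
import random

# Placeholder levels: 3 speeds and 11 levels for each temperature.
speeds = [1, 2, 3]
t_evap = list(range(11))   # evaporation-temperature levels
t_cond = list(range(11))   # condensing-temperature levels

# For each speed, the 121 temperature combinations are visited in a
# random order to avoid systematic errors.
plan = []
for speed in speeds:
    grid = list(itertools.product(t_evap, t_cond))
    random.shuffle(grid)
    plan.extend((speed, te, tc) for te, tc in grid)

print(len(plan))  # 363 combinations in total
```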
The reason why 3 different speed values are used is that this model of compressor is
able to work at different speeds, so the proposed tool must be able to make predictions in any
of these conditions. As the vibration is highly dependent on the frequency of rotation, the
speed acts as an important disturbance to the model.
The experiment using these combinations was executed three times, and each
measurement lasted 10 seconds in order to reduce the effect of experimental noise. Only the
experiments using the first and second speeds on the first compressor were executed five
times instead of three.
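For reference, the amount of raw data produced by a single measurement follows directly from the acquisition parameters stated above:

```python
fs = 51_200     # sampling rate, Hz
duration = 10   # seconds per measurement
sensors = 3     # X, Y and Z axes
samples = fs * duration * sensors
print(samples)  # 1536000 samples per measurement
```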