Predicting Non-Uniform Indoor Air Quality Distribution by Using Pulsating Air Supply and SVM Model


Building and Environment 219 (2022) 109171



journal homepage: www.elsevier.com/locate/buildenv

Predicting non-uniform indoor air quality distribution by using pulsating air supply and SVM model
Xue Tian a, Yuchun Zhang a, Zhang Lin b,*
a Department of Architecture and Civil Engineering, City University of Hong Kong, Hong Kong, China
b Division of Building Science and Technology, City University of Hong Kong, Hong Kong, China

ARTICLE INFO

Keywords:
Indoor air quality
Support vector machine
Air age
Pulsating air supply
Non-uniform air distribution

ABSTRACT

Mixing ventilation is the most common air distribution strategy, and often the same diffusers provide space cooling and heating. Although mixing ventilation aims to achieve uniform air distribution, non-uniformities remain at specific indoor locations. This study predicts the non-uniform indoor air quality (IAQ) distribution with mixing ventilation directly from air temperature and air velocity. Both heating and cooling cases are simulated with computational fluid dynamics (CFD) techniques, validated by experimental measurements in a multi-occupant office configuration. The comparison shows that the support vector machine (SVM) model outperforms the back-propagation neural network (BPNN) and genetic algorithm back-propagation neural network (GABPNN) models, with a medium computation time. The method demonstrates that pulsating air supply is useful for generating effective inputs, i.e., the dynamic variations of air temperature and velocity during pulsating air supply (ΔT and Δv), for predicting the air age under the corresponding steady state (τsteady). ΔT and Δv can be obtained quickly by monitoring the air velocity and temperature for 10–20 min. With a random selection, the data size should be larger than 180 to reach the mean absolute percentage error (MAPE) threshold value of 5%. However, it can be reduced to 60 when the data points have a similar steady air temperature. The prediction accuracy under heating is slightly higher than that under cooling, as achieving good mixing under heating is more difficult. We aim to provide guidelines on effective and measurable inputs and valid data-driven models for non-uniform IAQ prediction.

1. Introduction

Air distribution plays an important role in providing satisfactory thermal comfort and indoor air quality (IAQ). Among various air distribution methods, mixing ventilation is the most common method, applied to various rooms such as offices and classrooms, and often the same diffusers provide space cooling and heating [1,2]. Mixing ventilation systems can be configured by “wall supply and exhaust” or “ceiling supply and exhaust” [3]. It is supposed that the air distribution under mixing ventilation is quite uniform, compared with methods that aim to serve the occupied zone, e.g., displacement ventilation [4], stratum ventilation [5] and impinging jet ventilation [6]. However, there can still be non-uniformity of thermal comfort and IAQ when it comes to specific indoor locations, due to room layouts, partitions, air terminal positions, etc. For instance, slightly high air velocity levels appear in local regions under mixing ventilation [1]. Temperature increases with height as a cubic polynomial under mixing ventilation [5]. There is a predicted mean vote (PMV) deviation of 1.5 between different measurement points under mixing ventilation, along with a CO₂ concentration deviation of 500 ppm [7]. The particle number concentration is also not uniformly distributed under mixing ventilation [8]. Mixing ventilation risks forming a significant vortex flow, resulting in local air retention [9]. Generally, a perfect mixing condition is rare in reality.

Although the same diffusers are widely applied for both cooling and heating, the air distributions with cooling and heating are different. The range of the characteristic space length for the diffuser jet flow value that can achieve good mixing indicated by air diffusion performance index (ADPI) under heating conditions, is significantly smaller than that under cooling mode [10]. Researchers found low ventilation effectiveness under the heating conditions with mixing ventilation [2,11]. The impact of the stratification and low ventilation effectiveness with mixing ventilation for heating are taken into account by a correction factor in the ASHRAE Standard 62.1 [12]. The perception of warm head and/or cold feet is noticeable with mixing ventilation for heating, due to the thermal non-uniformity [13]. In summary, the local non-uniformity

* Corresponding author.
E-mail address: bsjzl@cityu.edu.hk (Z. Lin).

https://doi.org/10.1016/j.buildenv.2022.109171
Received 11 February 2022; Received in revised form 12 April 2022; Accepted 3 May 2022
Available online 7 May 2022
0360-1323/© 2022 Elsevier Ltd. All rights reserved.

Abbreviations

ACH  Air changes per hour
ADPI  Air diffusion performance index
ANN  Artificial neural network
BP  Back-propagation
BPNN  Back-propagation neural network
CFD  Computational fluid dynamics
DOM  Discrete ordinate method
GA  Genetic algorithm
GABPNN  Genetic algorithm back-propagation neural network
HVAC  Heating, ventilation, and air conditioning
IAQ  Indoor air quality
MAE  Mean absolute error
MAPE  Mean absolute percentage error
PMV  Predicted mean vote
R²  Coefficient of determination
RBF  Gaussian radial basis function
RMSE  Root mean square error
SVM  Support vector machine

Nomenclature

Tduty  Air temperature at the end of the duty period (°C)
Tidle  Air temperature at the end of the idle period (°C)
Tsteady  Air temperature under the steady state (°C)
τsteady  Air age under the steady state (s)
vduty  Air velocity at the end of the duty period (m/s)
vidle  Air velocity at the end of the idle period (m/s)
vsteady  Air velocity under the steady state (m/s)
vs-duty  Supply air velocity during the duty period (m/s)
vs-idle  Supply air velocity during the idle period (m/s)
vs-steady  Supply air velocity under the steady state (m/s)
ΔT  Dynamic variation of air temperature during pulsating air supply (°C)
Δv  Dynamic variation of air velocity during pulsating air supply (m/s)

exists with both cooling and heating under mixing ventilation.

It is necessary to achieve energy reduction by improving the capability of the heating, ventilation, and air conditioning (HVAC) system to adjust quickly and automatically, which is also an important feature of intelligent buildings [14,15]. With the importance of IAQ for occupants’ health and well-being, monitoring and controlling the IAQ have long been an important topic in the domain of indoor built environment. Compared with field measurements, prediction is non-invasive and can be quick and inexpensive, which is essential for effective and timely control [16]. Because of the complex input data that the mechanistic models require, data-driven models serve as an alternative approach for IAQ prediction [17]. Popular models include artificial neural network (ANN) models, regression models, models based on decision trees, etc. [17]. The models include various inputs, such as indoor pollutant concentration, outdoor conditions, indoor temperature, relative humidity, etc. [16,18]. It should be pointed out that there is a possibility to predict IAQ directly with the air distribution information to reduce labor and material expenses on the IAQ measurements [19,20].

Treating the whole indoor environment as uniform for control purposes can result in thermal discomfort, poor IAQ at the occupied zone and energy wastage [21]. Efforts have been made on accurately predicting non-uniform environments. Zhao et al. have developed a theoretical expression for the clean air volume in a non-uniform cleanroom environment [22]. Zhang et al. used a polynomial model to predict the indoor air temperature using the air supply and exhaust conditions under stratum ventilation [19]. An online supervisory control predictive tool is developed for the displacement ventilation system with chilled ceiling to minimize energy consumption while achieving optimal IAQ and thermal comfort [23]. However, few studies considered the non-uniform IAQ prediction and the corresponding control under mixing ventilation. Because it has been widely installed in existing buildings, it is essential to retrofit mixing ventilation to be part of intelligent green buildings, without major renovation in ventilation systems [24].

The present work aims to predict non-uniform IAQ conditions, using the air temperature and velocity information, which is easier and faster to obtain in practice. The developing process is achievable for significantly stratified environments, while much more difficult for those with mixing ventilation. Essentially, the major factors causing non-uniformity are different. For stratified environments produced by stratum ventilation, displacement ventilation, etc., the non-uniformity is mainly due to the buoyancy or momentum of the supply air jets [25,26]. For mixing ventilation, the dominant causes are the distributions of the heat sources and physical partitions, while the buoyancy and the supply air jets also play roles [7]. Therefore, air quality stratification is enhanced by the height of the room and thermal length [20], while such enhancement is reduced with mixing ventilation. Generally, IAQ prediction using air distribution poses a challenge because of reduced effective inputs and complex variables. Steady air distribution information may be insufficient, and the challenge is to collect effective input data with minimal costs. Pulsating air supply is used innovatively to produce effective input.

The technical route is thus developed as follows. Firstly, the experiments with mixing ventilation in a multi-occupant office are conducted. Secondly, the full-chamber computational fluid dynamics (CFD) models are established, and validated using the experimental measurements, under both steady and transient conditions. After conducting CFD simulation cases under both heating and cooling modes, three kinds of data-driven models, including ANNs and non-ANN, are developed. Efforts are made to use the data-driven models to predict the IAQ from the air distribution results, and the comparisons are presented. The goal is to approach a sufficient prediction accuracy, by discovering the inputs using pulsating air supply, and testing the algorithms. The prospect is to guide the non-uniform IAQ prediction with mixing ventilation, towards the right measurable inputs and applicable algorithms.

2. Methodology

Fig. 1 presents an example of pulsating air supply from the results of the cooling case C-5 listed in Table 4. Pulsating air supply has been investigated in buildings as a potential solution to increase both ventilation efficiency and thermal comfort [27]. The principle of the pulsating air supply introduced here is to create cyclic air velocity and temperature variations by controlling the supply air condition [27]. A whole cycle of pulsating air supply includes an idle period when the supply air velocity is lower than the average of the whole cycle (i.e., 0–150 s in Fig. 1), and a duty period when the air velocity is higher than the average of the whole cycle (i.e., 150–300 s in Fig. 1). The changing trends of the air temperature and air velocity at a point are drawn in Fig. 1. Generally, during the idle period, the air velocity stays at a relatively low value and the air temperature increases; the opposite holds during the duty period.

The flow diagram of the methodology is shown in Fig. 2. The process aims to develop a method to use the air distributions to predict the air age, as a representative of IAQ. As one of the common IAQ indicators, air age refers to the time that air experiences from entering a room to arriving at a particular location in the room, indicating the freshness of


indoor air [20]. Air age can be obtained experimentally by tracer gas methods (pulse, step-up, and step-down (decay) method) or determined numerically using CFD techniques. The steady-state method based on the resolution of an additional transport equation is applied to determine air age, due to its low computation time and high reliability for predicting [28]. The air age calculation is implemented into CFD through user-defined functions (UDF).

Fig. 1. An example of pulsating air supply (CFD simulated results of Case C-5 in Table 4).

As presented in Fig. 2, the experiments are first conducted to validate the CFD models. The reason for choosing the CFD method to collect data is that the data-driven approach often requires a large amount of data, which is costly to obtain by conducting experiments. The CFD simulations with steady air supply provide the information of the air velocity and air temperature under the steady state (i.e., vsteady and Tsteady), and the air age under the steady state (i.e., τsteady). Simultaneously, the CFD simulations with pulsating air supply provide the information of the Δv and ΔT. The explanation of Δv and ΔT is presented as follows:

$$\Delta v = \left| v_{\mathrm{duty}} - v_{\mathrm{idle}} \right| \tag{1}$$

$$\Delta T = \left| T_{\mathrm{duty}} - T_{\mathrm{idle}} \right| \tag{2}$$

where vduty is the air velocity at the end of the duty period, vidle is the air velocity at the end of the idle period, Tduty is the air temperature at the end of the duty period, and Tidle is the air temperature at the end of the idle period. An example of collecting the vduty, vidle, Tduty, and Tidle is presented in Fig. 1. The vsteady and Tsteady have been shown to be effective inputs to predict τsteady, as the air age distribution is determined by the air distribution [20]. The motivation for using Δv and ΔT as inputs to predict τsteady is that Δv and ΔT (i.e., the dynamic variations during pulsating air supply) are closely related to the air supply condition and the location characteristics under stratum ventilation [27]. For stratum ventilation, this relationship is more apparent because the fresh air is directly supplied to the investigated breathing zone, whereas mixing ventilation involves longer routes of fresh air and more complex turbulent air motions [5]. Therefore, larger data sets and data-driven models are needed to discover hidden relationships.

The inputs and outputs of the data set are therefore filled in. After the data collection, the data-driven models are developed. Several models are established to make comparisons on the accuracy and computation time, including back-propagation neural network (BPNN) models, genetic algorithm back-propagation neural network (GABPNN) models and support vector machine (SVM) models. The inputs are also tested, to see if adding Δv and ΔT can improve the prediction accuracy, as adding the measurements of Δv and ΔT is more complicated. The aim is to predict the IAQ at different locations through information that can be obtained relatively easily and measured quickly.

Fig. 2. Flow diagram of methodology.

Table 3
CFD settings.
Condition | Steady (steady air supply) | Transient (pulsating air supply)
Turbulence model | RNG k-ε [33,34] | RNG k-ε [33,34]
Discretization method | Finite volume method [35] | Finite volume method [35]
Wall function | Standard wall function [30] | Standard wall function [30]
Buoyancy effect | Boussinesq hypothesis [31] | Boussinesq hypothesis [31]
Radiation model | DOM (discrete ordinate method) [31] | DOM (discrete ordinate method) [31]
Pressure-velocity coupling algorithm | SIMPLE [34] | SIMPLE [34]
Spatial discrete scheme | Second order upwind schemes [34] | Second order upwind schemes [34]
Time discrete scheme | Null | Second order implicit scheme
Time step | Null | 0.1 s [34]
Convergence criterion for momentum residuals | 10⁻⁴ [33] | 10⁻⁴ [33]
Convergence criteria for the other terms | 10⁻⁶ [33] | 10⁻⁶ [33]

Table 4
CFD simulation cases.
Case | Condition | Airflow rate (ACH) | Supply air temperature (°C) | Supply air velocity of steady state vs-steady (m/s) | Supply air velocity of duty period vs-duty (m/s) | Supply air velocity of idle period vs-idle (m/s) | Length of cycle (s)
C-1 | Cooling | 20 | 16 | 0.55 | 0.9 | 0.2 | 1200
C-2 | Cooling | 20 | 16 | 0.55 | 0.9 | 0.2 | 600
C-3 | Cooling | 20 | 16 | 0.55 | 0.9 | 0.2 | 300
C-4 | Cooling | 33 | 16 | 0.90 | 1.5 | 0.3 | 600
C-5 | Cooling | 7 | 16 | 0.20 | 0.3 | 0.1 | 600
H-1 | Heating | 20 | 30 | 0.55 | 0.9 | 0.2 | 600
H-2 | Heating | 33 | 30 | 0.90 | 1.5 | 0.3 | 600
H-3 | Heating | 7 | 30 | 0.20 | 0.3 | 0.1 | 600

2.1. Experiments

To validate the proposed model, experiments are conducted in a chamber located at the City University of Hong Kong, China. This chamber has six desks accommodating six occupants and is configured as a multi-occupant office. There is no window on the envelope. The partitions are shared with adjacent spaces that are constantly air-conditioned. The layout is shown in Fig. 3 (a). The dimensions of the chamber are 7.8 m (length) × 5.3 m (width) × 2.4 m (height). The occupant is represented by a human simulator with the dimensions of 0.4 m (length) × 0.25 m (width) × 1.2 m (height), with a 100 W light bulb placed inside to simulate occupant heat generation [29]. The diameter of the light bulb is 125 mm, and 98% of the total power is converted to heat. Cheng et al. have shown by flow visualization that such human simulator settings are effective to simulate the effect of the occupant’s heat generation on the global air distribution [29]. Fourteen ceiling-mounted lamps of 56 W each are also present. According to the initial luminous flux, the initial luminous efficacy is calculated as 11%. The luminous flux decays with the service time, meaning that at least 89% of the total power is lost as heat. Here, we consider the worst scenario in which 100% of the total power is transformed into heat in CFD simulations. As shown in previous works, the lights are distant from the occupied zone, and the errors are acceptable [30,31]. Three air supply inlets (i.e., S1–S3 in Fig. 3 (a)) with the same dimension of 0.6 m × 0.6 m are located on the ceiling. Also on the ceiling are three air exhausts of the same size (i.e., E1–E3 in Fig. 3 (a)). They configure a common mixing ventilation layout, i.e., “ceiling supply and exhaust” [5,6].

There are 12 sampling lines arranged in the experimental chamber, which are fairly evenly distributed throughout the room. Detailed locations of the sampling lines are drawn in Fig. 3 (b). Sampling Lines L1, L3, L5, L7, L9 and L11 are located in front of the occupants. Lines 2, 4, 8 and 10 are between the tables, and Lines 6 and 12 are between the table and the wall. Sampling Lines L1, L3, L5, L7, L9 and L11, adjacent to heat sources, are positioned to represent the micro-environments around the occupants, which are vital for evaluating the thermal comfort and IAQ. The non-uniformity of the air distribution has a close relationship with the layouts of the heat sources and partitions [7]. Other sampling lines are positioned between the partitions where local recirculation may appear [9]. Point L1-1.1 refers to the measuring point at the height of 1.1 m above the floor in Line L1, and so forth.

Table 1
Information on measurement instruments.
Instrument | Measured parameter | Accuracy | Range | Sampling duration | Sampling frequency (Hz)
ALNOR EBT731 flow hood | Airflow rate | ±3% of reading | 42–4250 m³/h | 2 min | 0.017
WYZ-1 temperature recorder | Temperature of the surfaces (i.e., walls, floor and ceiling); room air temperature | ±0.3 °C | −20–80 °C | 30 min | 0.017
SWEMA 03+ anemometer | Air velocity at the sampling lines | ±0.04 m/s at 0.05–1.00 m/s; ±4% read value at 1.00–3.00 m/s | 0.5–3.00 m/s | 10 min for steady air supply; 15 min for pulsating air supply | 5
SWEMA 03+ anemometer | Air temperature at the sampling lines; supply air temperature | ±0.1 °C | 10–40 °C | |

Table 2
Information on experimental cases.
Case | Room air temperature (°C) | Supply air temperature (°C) | Exhaust air temperature (°C) | Air changes per hour (ACH) | Air supply velocity, whole cycle (m/s) | Air supply velocity, duty period (m/s) | Air supply velocity, idle period (m/s) | Cycle time (s)
Steady | 23.2 ± 0.2 | 18.14 ± 0.1 | 23.7 | 20 | 0.56 ± 0.13 | | | Null
Pulsating | 22.8 ± 0.3 | 18.37 ± 0.2 | 23.7 | 20 | 0.56 ± 0.13 | 0.90 ± 0.15 | 0.20 ± 0.20 | 300

The supply air temperature and supply airflow rate (supply air velocity) are controlled using a control system by varying the opening of the chilled water valve and the frequency of the supply fan. The time lengths of the complete cycle, duty period and idle period are controlled manually. The supply airflow rates and the corresponding supply air velocities are tested beforehand, to match the frequencies of the fan. The ALNOR EBT731 flow hood is applied to measure the airflow rates of the supply air inlets. Before experimental measurements, a period of 2 h after turning on the air-conditioning system and the heat sources is applied, to ensure that the room condition is steady. The room air temperature is measured by the WYZ-1 temperature recorder positioned at the geometric center of the room. Vertical sensor rigs attached with SWEMA 03+ anemometers at the desired heights (i.e., 0.1 m, 0.6 m, 1.1 m and 1.8 m above the floor) are placed on the sampling lines. The heights of 0.1 m, 0.6 m and 1.1 m are chosen to represent the ankle level, the abdomen level and the head level for a seated occupant [25]. The height of 1.8 m represents the upper part of the room. The supply air temperature is measured at the three air supply inlets (see Fig. 3 (a)). The exhaust air temperature is monitored by the control system. The information on the measured parameters and the measurement instruments is presented in Table 1. Table 2 summarizes the detailed conditions of the experimental cases. The lengths of the duty and idle


periods are the same during a complete cycle. The complete cycle length is 300 s. Room air temperature of 23 °C is considered, and the airflow rate is 20 air changes per hour (ACH), approaching the parameter choices in previous studies [5,29,32].

Fig. 3. (a) Layout of experimental chamber (b) Locations of sampling lines
(Note: H1–H6: Human simulators; T1–T7: Tables; L1–L14: Lights; E1–E3: Air exhausts; S1–S3: Air supply terminals).

2.2. CFD settings

2.2.1. Computational set-up

The room configuration used in CFD numerical simulation is consistent with the experimental chamber, as shown in Fig. 3. The commercial CFD software ANSYS Fluent 19 is adopted. The CFD settings are listed in Table 3. The RNG k-ε model has been proved to be a turbulence model suitable for predicting indoor airflow characteristics, both for steady and transient conditions [33,34]. The discretization method of the control equation adopts the finite volume method due to its high balance of the physical quantity [35]. The standard wall function simulates the turbulent flow in the near-wall area [30]. The Boussinesq hypothesis is adopted to consider the buoyancy effect, and the discrete ordinate method (DOM) is selected for radiation [31]. The SIMPLE method is adopted as the pressure-velocity coupling algorithm [30]. Under steady air supply conditions, second order upwind schemes are adopted for both convective and viscous terms. The convergence criterion of momentum residual is set as 10⁻⁴, and that of mass residual, turbulent kinetic energy residual, turbulent dissipation residual, energy residual and radiation intensity residual is set as 10⁻⁶ [33]. For pulsating air supply, that is, under transient conditions, the second order upwind scheme is used for spatial discretization of all physical quantities, and the second order implicit scheme is used for time discretization [34]. After testing different time steps, the size of 0.1 s is enough to ensure the convergence of the solution for each time step.

All solid surfaces are designated as no-slip walls. The right-side wall is considered the exterior wall in CFD simulations, with 30 °C for cooling and 15 °C for heating conditions. The human simulators and lamps are set as the constant-heat-flux surfaces, with the values of 61.6 W/m² and 155.6 W/m² referring to the experiments, respectively. The air supply inlet is set as a velocity-inlet [30]. The air supply velocity is independent of time for the steady air supply, with a given constant value. For pulsating air supply, the air supply velocity is defined by UDF. Air exhaust is simulated as outflow.

2.2.2. Mesh generation

ICEM software is used to grid the built room model. As all indoor objects are cuboids, the grids are structured hexahedral. In order to improve the accuracy of simulation, the meshes near air terminals, heat sources and non-adiabatic wall surfaces are refined, since the air velocity and temperature gradients of these positions are relatively large. A


grid independence check is conducted. Three grids are generated. Their cell numbers are 722,155 (coarse), 2,201,846 (medium) and 3,949,506 (refined) respectively. As the cells are all structured hexahedral, the skewness is unity. The aspect ratios are 18 for the coarse, 14 for the medium and 6 for the refined. When the simulation results do not significantly change with the increase of the number of grids, grid independence is reached, and the grid with the minimum cell number to reach the consistent results is selected [4]. The maximum relative differences of 80 points on four sampling lines (i.e., Lines 3, 4, 5 and 10) between the coarse and medium grids are 2.4 °C for air temperature, and 0.09 m/s for air velocity. The numbers decreased to 0.6 °C and 0.02 m/s between the medium and refined grids. In summary, the cell number is chosen as 2,201,846.

2.3. CFD validation

In order to prove that the models and calculation settings used in the simulation can accurately predict the air distribution, we validate the CFD model using the experimental measurements. The boundary conditions, i.e., the wall temperatures and air supply parameters used in the CFD validation, are consistent with the actual conditions in the experiment (see Table 2). The coefficient of determination (R²), mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE) are calculated to evaluate the consistency between the CFD simulation results and experimental measurements or predictions by data-driven models [18,32,36]. A high value of R² indicates a well-fitted curve, while relatively low values of MAE, RMSE and MAPE indicate how close the results are. They are calculated using the following formulas:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - x_i)^2}{\sum_{i=1}^{N} \left( x_i - \frac{1}{N} \sum_{i=1}^{N} x_i \right)^2} \tag{1}$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - x_i \right| \tag{2}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (y_i - x_i)^2}{N}} \tag{3}$$

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - x_i}{x_i} \right| \tag{4}$$

where xi and yi are the measurement and CFD prediction respectively, and N is the amount of data. For evaluating the prediction accuracy of the data-driven models, xi and yi are the predictions by the CFD and data-driven models, respectively.
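As an illustration of the four error measures defined above, the following is a minimal Python sketch using only the standard library. The function name `regression_metrics` and the sample numbers are hypothetical and are not taken from the study (which reports using Matlab); `x` holds the reference values and `y` the values being evaluated, and MAPE is returned as a fraction, so it must be multiplied by 100 to compare with a percentage threshold.

```python
import math


def regression_metrics(x, y):
    """Return R2, MAE, RMSE and MAPE for references x and predictions y.

    Assumes no reference value is zero (needed for MAPE) and that the
    references are not all identical (needed for R2).
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    # Sum of squared prediction errors and of squared deviations from the mean.
    ss_res = sum((yi - xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((xi - mean_x) ** 2 for xi in x)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(yi - xi) for xi, yi in zip(x, y)) / n,
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": sum(abs((yi - xi) / xi) for xi, yi in zip(x, y)) / n,
    }


# Hypothetical reference and predicted values, for illustration only.
metrics = regression_metrics([2.0, 4.0], [1.0, 5.0])
```

With these two made-up points, the squared error equals the variance term, so R² is 0 while MAE and RMSE are both 1.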

Fig. 4. Comparison between experimental and CFD results.


2.3.1. Steady condition

For steady air supply, the velocity and temperature of 16 points at four levels (i.e., 0.1 m, 0.6 m, 1.1 m and 1.8 m) along four sampling lines (i.e., Lines 3, 4, 5 and 10) are taken as the data for comparison between experiment and CFD, as shown in Fig. 4. Different heights and horizontal positions are considered to avoid contingency. Lines 3 and 5 are near heat sources and air supply terminals. Line 10 is distant from those, while Line 4 is in-between. For air velocity and temperature, the maximum differences between CFD results and experimental results are about 0.2 m/s and 0.5 °C respectively, which are acceptable. For air velocity, the MAE value is 0.05 m/s and the RMSE value is 0.05 m/s. For air temperature, the MAE value is 0.33 °C and the RMSE value is 0.40 °C. The MAE and RMSE results are similar to those in a previous study [31]. In conclusion, the CFD model applied can accurately predict the air distribution under a steady air supply. The later simulation will be conducted with this validated model.

2.3.2. Transient condition

Since the supply air velocity with pulsating air supply changes with time, the air velocity and temperature at a certain measurement point are transient rather than steady against time. As the periodically-varying air supply velocity determines the periodically-varying air distribution, a larger number of cycles does not significantly affect the prediction accuracy. Therefore, the changing trends of air velocity and air temperature over a complete cycle (300 s) are compared with the experimental measurements, as shown in Fig. 5 and Fig. 6 respectively. A total of 6 measurement points are selected for validation, including four sampling lines and three heights. With pulsating air supply, the large turbulence intensity results in a great uncertainty of the instantaneous velocity and temperature. Therefore, it is difficult to ensure that the instantaneous velocity and temperature are well fitted at every moment. Figs. 5 and 6 show that the trends between the experimental and CFD results over time are basically similar. It should be noted that the measured Δv and ΔT in Figs. 5 and 6 are lower than 0.3 m/s and 1 °C, respectively. Therefore, an instantaneous deviation of 0.4 (standardized) is acceptable. In Fig. 6, the applied CFD model cannot accurately simulate the irregular turbulence shown in the measured air velocity results [35]. The CFD results are quite similar to the experimental results after de-composition [27]. The vduty, vidle, Tduty, and Tidle shown in Fig. 1 are generated using the CFD results, which reflect the difference in the air velocities between the idle period and the duty period, even better than the experimental results, reducing the influence brought by the irregular turbulence. As we only collect the Δv and ΔT values rather than instantaneous values of each moment, the transient CFD model is used to predict the air distribution under pulsating air supply with an acceptable error level.

2.4. CFD simulation cases

The CFD simulation cases, five for cooling and three for heating, are listed in Table 4. The supply air temperatures for cooling and heating are maintained at 16 °C and 30 °C, respectively. Three supply air velocities under the steady state (vs-steady) are applied for both cooling and heating, i.e., 0.55, 0.90 and 0.20 m/s. The corresponding airflow rates are 20, 33 and 7 ACH, respectively. The supply air velocity of the duty period (vs-duty) is set at 150% of the vs-steady, and the supply air velocity of the idle period (vs-idle) is set at 50% of the vs-steady. Three cycles of varied lengths, 300 s, 600 s and 1200 s, are applied. A previous study concluded that mixing ventilation needs at least 910 s to reach the set-point temperature [4]. A shorter time length is beneficial for obtaining the air age information quickly. The lengths of the duty and idle periods are the same during a cycle.
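The pulsating supply schedule and the dynamic variations Δv and ΔT can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the supply velocity is modelled as an ideal square wave with the idle half first (the actual waveform imposed through the UDF is not specified here), the function names are hypothetical, and the sampled values in the example are invented. The supply velocities and cycle length correspond to Case C-3 in Table 4.

```python
def supply_velocity(t, v_idle, v_duty, cycle_length):
    """Supply air velocity (m/s) at time t (s).

    Assumption: ideal square wave, idle during the first half of each
    cycle and duty during the second half (equal period lengths).
    """
    phase = t % cycle_length
    return v_idle if phase < cycle_length / 2 else v_duty


def value_at_or_before(times, values, t_end):
    """Last sampled value at or before t_end; times must be ascending."""
    selected = [v for t, v in zip(times, values) if t <= t_end]
    if not selected:
        raise ValueError("no sample at or before the requested time")
    return selected[-1]


def dynamic_variation(times, values, cycle_length):
    """|value at end of duty period - value at end of idle period|.

    Works for either air velocity (giving delta v) or air temperature
    (giving delta T), sampled over one cycle that starts at t = 0.
    """
    at_idle_end = value_at_or_before(times, values, cycle_length / 2)
    at_duty_end = value_at_or_before(times, values, cycle_length)
    return abs(at_duty_end - at_idle_end)


# Case C-3 supply settings from Table 4: 0.2 m/s idle, 0.9 m/s duty, 300 s cycle.
schedule = [supply_velocity(t, 0.2, 0.9, 300) for t in (0, 149, 150, 299)]

# Hypothetical local air velocity samples (m/s) at one point, for illustration.
sample_times = [0, 75, 150, 225, 300]
sample_velocity = [0.10, 0.08, 0.06, 0.15, 0.20]
delta_v = dynamic_variation(sample_times, sample_velocity, 300)
```

Passing the duty and idle velocities explicitly, rather than deriving them from the steady value, keeps the sketch tied to the tabulated case settings.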

Fig. 5. Comparison between transient CFD and experimental results of air temperature.


Fig. 6. Comparison between transient CFD and experimental results of air velocity.

2.5. Data-driven model

Predicting IAQ with mechanistic models may appear trustworthy; however, the specific mechanisms between the air distribution (i.e., the air velocity and air temperature) and IAQ at a specific location are not well established. The data-driven model can be very useful as an alternative. The data-driven models used in this study are all established using Matlab. There are numerous data-driven models to achieve our goal. According to a review article, ANNs have emerged as the most frequently used model in HVAC systems [37]. The BPNN and the GABPNN models are chosen as representatives of ANNs. The aim is to validate if the inputs contain adequate information to predict the outputs, rather than determine which data-driven method is the most accurate. Therefore, three methods, including ANNs and non-ANNs, are presently applied.

2.5.1. Artificial neural network models
ANN is a popular data-driven method based on an interconnected structure of neurons. Depending on the structure and technique, ANN is classified into feedforward NN (e.g., single-layer perceptron and multi-layer perceptron) and feedback NN (e.g., recurrent NN and Kohonen's self-organizing map) [38]. The term "BPNN" represents a multi-layer feedforward network trained according to error back-propagation [39,40]. BPNN is frequently used to predict IAQ, due to its capability to address non-linearly separable data [18,41]. A BPNN structure is made of three kinds of layers: input, hidden and output, with each kind of layer having a number of neurons. The input layer is the initial information provided to the system, and the output layer is the expected target that the neural network should simulate, i.e., air age for the present study. The hidden layer receives and transmits the data between input and output layers. The BP learning algorithm updates weights based on the error between model output and desired output. This process repeats and stops when it reaches a specified maximum epoch or satisfies other stopping criteria. We used BPNN with the Levenberg-Marquardt BP algorithm [41]. The hyperparameters chosen based on references or repeated testing are presented in Table 5.
To optimize BPNN further, genetic algorithm (GA) is added to optimize initial weights [47]. The term "genetic algorithm back-propagation neural network (GABPNN)" refers to the combination of GA and BPNN explained as follows. At the beginning of the process, GA conducts a global search on weight ranges and finds out the best initial weights for BPNN. Then, the BPNN starts training with the best initial weights based on GA, thus approaching the optimum solution [47]. The workflow of GA is presented in our previous study in detail [20]. The hyperparameters are chosen based on suggestions in a previous study with a small number of parameters and are presented in Table 5.

Table 5
Parameters of data-driven methods.

Model    Parameter                                Value
BPNN     Learning rate (η)                        0.6 [42]
         Target error                             10^-4 [43]
         Transfer functions of the hidden layer   Tangent sigmoid function [18]
         Transfer functions of the output layer   Pure sigmoid function [18]
         Hidden layer number                      2 (by repeated testing)
         Neuron number of the first layer         4 (by repeated testing)
         Neuron number of the second layer        20 (by repeated testing)
         Maximum epoch                            500 [44]
GABPNN   Population size                          50 [45]
         Mutation rate                            0.01 [45]
         Crossover rate                           0.6 [45]
         Maximum generation                       100 [45]
SVM      Method                                   ε-support vector regression [46]

2.5.2. Support vector machine
The SVM is another typical method for regression and classification. The objective of the SVM algorithm is to find an optimal separating hyperplane with a maximum margin. The SVM has been applied in a


large variety of regression problems, including air quality problems [48]. The detailed mechanism for SVM regression is illustrated in a previous paper [46]. We use the Matlab library LibSVM [49] to implement one type of SVM regression, i.e., ε-support vector regression [46]. Two functions, svmtrain and svmpredict, are applied for training and testing, respectively. We incorporate vsteady, Tsteady, Δv and ΔT as candidate input features. Both the training and test data are scaled within the range [0, 1]. The SVM prediction steps based on the LibSVM toolbox are presented in Fig. 7. At first, the training dataset is used with the function svmtrain. The input features are mapped to the corresponding air age results, then the svmtrain function adjusts and obtains the best values of the essential parameters (i.e., C, γ, ε). The prediction function, svmpredict, then uses the best values of the parameters to predict the air age based on the test data. As a popular kernel function choice, the Gaussian Radial Basis Function (RBF) is applied, which can be used when the number of features is relatively small and the relationship between the features and the target results (i.e., air age for this study) is nonlinear [50].

Fig. 7. SVM prediction steps based on LibSVM toolbox.

3. Results

3.1. Prediction performances with different models

3.1.1. Prediction accuracy with different models
In this section, the performances of the 1st input feature set (i.e., vsteady and Tsteady) and the 2nd input feature set (i.e., vsteady, Tsteady, Δv and ΔT) are compared to find the better set. The data size is 540, including 540 points evenly distributed along the heights between 0.1 and 2 m, on Sampling Lines 1–12. The ratio of the number of data in the training set to that of the test set is maintained at 7:3 throughout this study [36]. The sets are randomly mixed before picking the trained ones and the tested ones for each Matlab simulation [18]. Each model is trained and tested ten times. Fig. 8 (a) shows that, for Case C-1, the averaged MAE/RMSE/R2/MAPE values of the 2nd feature set are always better than those of the 1st feature set, by 24–33 s/21–36 s/0.22–0.44/2.8%–3.7%, respectively. Fig. 8 (b) shows that, for Case H-1, the averaged MAE/RMSE/R2/MAPE values of the 2nd feature set are always better than those of the 1st feature set, by 19 s/9–29 s/0.04–0.12/2.0%–2.2%, respectively. These observations illustrate that the 2nd feature set brings a higher prediction accuracy, compared with the 1st feature set. In addition, we have compared the prediction performances of different data-driven methods. The averaged R2 values of the SVM models are significantly higher than those of the ANN models (Paired sample t-test, p < 0.05). In comparison, the averaged MAE/RMSE/MAPE values of the SVM models are significantly lower than those of the ANN models (Paired sample t-test, p < 0.05), indicating a more accurate prediction performance of the SVM model.

Fig. 8. Prediction errors of air age with different models
(Note: the postfix of 1 following the model name means the 1st input feature set composed of vsteady and Tsteady, and 2 means the 2nd input feature set composed of vsteady, Tsteady, Δv and ΔT).

3.1.2. Computation time with different models
Fig. 9 presents the computation time with different models for a single Matlab simulation. The CPU type of the computer running the simulations is Intel (R) Core (TM) i7-8700 CPU @ 3.20 GHz. The operating system is Windows 10. The results are made based on the total data size of 540. For BPNN-1 and BPNN-2, both models take less than 10 s to simulate. GABPNN-1 and GABPNN-2 take 362–443 s to complete the simulation, while SVM-1 and SVM-2 take 72–119 s. The computation time required by BPNN is much less than that of the other two models. GABPNN models take at least three times as long as SVM models.

Fig. 9. Computation time with different models.

3.2. Prediction accuracy with different cases

Table 6 presents the comparison of the prediction errors between cases. Cases C-1, C-2 and C-3 are distinguished in the time lengths of a complete cycle. With SVM-2, the prediction accuracy of Case C-1 is slightly higher than Case C-2, while that of Case C-3 is lower than C-2. The differences in prediction performance between Cases C-1 and C-2 are small, with the deviation of 1 s/1 s/0.01/0.1% (i.e., MAE/RMSE/R2/MAPE). The prediction accuracy of Case C-2 is as high as 20 s/48 s/0.83/2.3% (i.e., MAE/RMSE/R2/MAPE). Cases C-1, C-4 and C-5 are differentiated in airflow rate and supply air velocity (see Table 4). With SVM-2, the largest error deviations among the three cases are 14 s/35 s (i.e., MAE/RMSE). The R2 and MAPE values among Cases C-1, C-4 and C-5 are close, with the largest deviation of 0.6%/0.06 (i.e., MAPE/R2). For Cases H-1, H-2 and H-3, with SVM-2, the largest error deviations among the three cases are 23 s/40 s (i.e., MAE/RMSE). The R2 and MAPE values among Cases H-1, H-2 and H-3 are close, with the largest deviation of 1.5%/0.07 (i.e., MAPE/R2).
The first common observation is that the SVM-2 model predicts the air age with acceptable accuracy for all the cases. All MAPE values are no higher than 3.2%, and the R2 values are distributed between 0.73 and 0.97. The second common observation is that the SVM-2 predictions are more accurate than those of the SVM-1, with the improvement of up to 65 s/80 s/0.50/5.7% (i.e., MAE/RMSE/R2/MAPE). Adding the Δv and ΔT improves the performance in air age prediction. This is valid for both heating and cooling.

3.3. Prediction accuracy with different data sizes

As suggested by Kim et al. [51], data-driven models will be updated when new data arrives. Fig. 10 shows the prediction errors with the SVM-2 model varying with the data size. The curves show that prediction power can be improved by enlarging the total data size. In Fig. 10 (a) and (b), the performance of 42 s/75 s/0.61/4.7% (i.e., MAE/RMSE/R2/MAPE) is achieved after 180 data points. In Fig. 10 (c) and (d), for Case H-1, the performance of 41 s/64 s/0.74/4.7% (i.e., MAE/RMSE/R2/MAPE) is achieved after 130 data points. However, obtaining a dataset of over 180 data points takes time and effort. The increase in data size results in an increment of computation time.

3.4. Further investigation into small data sizes

IAQ datasets at different locations are typically smaller and sometimes more diverse compared with those of the other parameters, especially when the data is collected through field measurements or monitoring. Therefore, this study aims to test the possibility of establishing rules for accurate prediction with small data sizes. A previous study investigating the indoor environment shows that the models generally converge when the data size reaches around 60 [51]. Thus, we choose the total data size of 60 to study the prediction accuracy further. As the data size decreases, the data selection method is important to ensure an acceptable accuracy. We apply two data selection methods, i.e., select randomly and select the data with a similar Tsteady. Selecting the data with a similar Tsteady is based on the following considerations. With mixing ventilation, the intention is to create an environment where the Tsteady is uniformly distributed. Though perfect mixing is hard to reach in reality, the Tsteady in a large part of the room fluctuates around an average value, depending on the room geometry, layout, the supply air conditions, etc. Compared with Tsteady, the deviations of vsteady values at different locations are higher, and more stochastic. Choosing a dataset with a similar Tsteady value (i.e., |max − min| < 1 °C) can happen in engineering practice [52]. Further, it better proves the effectiveness of the ΔT and Δv for the τsteady prediction, as the Tsteady data are basically invalid.
Each data set which contains 60 data points is selected ten times from the 540 data points, and the data-driven model establishment process is repeated ten times. Each result shown in Fig. 11 is the averaged one of 100 Matlab simulations. In Fig. 11, three data sets are applied, including two data selection methods and two prediction models. It is previously known in Section 3.1 that the ΔT and Δv are effective inputs. In order to pursue the simplicity of inputs, the prediction model SVM-3 is tested, which only contains ΔT and Δv as inputs. The results under Cases C-1 and H-1 are presented as examples of cooling and heating conditions. For Case C-1, when the data points are randomly selected (i.e., Data Set 1), the prediction errors are large, with the averaged MAE/RMSE/R2/MAPE values of 58 s/88 s/0.36/6.8%. With Data Sets 2 and 3, the prediction accuracy is improved. Similarly, with Case H-1, the prediction performances of Data Sets 2 and 3 are better than Data Set 1. For all the cases in Table 4, the improvements of the averaged MAE/RMSE/R2/MAPE values by Data Set 2 over Data Set 1 are 25–77 s/28–94 s/0.14–0.53/2.9–7.3%, and the improvements by Data Set 2 over Data Set 3 are 1–26 s/2–46 s/0.04–0.44/0.1–2.1%. Fig. 11 (e) and (f) present the prediction errors of τsteady with 8 cases listed in Table 4, operated with Data Set 2. For cooling, the most accurate performance appears under


Table 6
Prediction accuracy with different cases.

        SVM-1                                               SVM-2
Case    MAE (s)   RMSE (s)    MAPE (%)    R2                MAE (s)   RMSE (s)   MAPE (%)    R2
C-1     54 ± 3    88 ± 3      6.1 ± 0.3   0.47 ± 0.07       21 ± 2    47 ± 6     2.4 ± 0.4   0.84 ± 0.05
C-2                                                         20 ± 2    48 ± 7     2.3 ± 0.3   0.83 ± 0.05
C-3                                                         27 ± 6    57 ± 12    3.2 ± 0.7   0.73 ± 0.14
C-4     37 ± 3    57 ± 4      7.3 ± 0.5   0.37 ± 0.08       10 ± 1    25 ± 2     1.8 ± 0.3   0.87 ± 0.02
C-5     89 ± 8    140 ± 11    7.6 ± 0.7   0.43 ± 0.11       24 ± 6    60 ± 13    1.9 ± 0.5   0.90 ± 0.04
H-1     27 ± 3    48 ± 4      3.1 ± 0.3   0.86 ± 0.02       8 ± 2     19 ± 6     0.9 ± 0.2   0.97 ± 0.01
H-2     20 ± 2    35 ± 2      3.7 ± 0.3   0.79 ± 0.03       6 ± 1     16 ± 4     1.2 ± 0.2   0.95 ± 0.02
H-3     43 ± 3    72 ± 6      3.5 ± 0.3   0.84 ± 0.03       29 ± 6    56 ± 12    2.4 ± 0.5   0.90 ± 0.04
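
The reported improvement of SVM-2 over SVM-1 of up to 65 s/80 s/0.50/5.7% (MAE/RMSE/R2/MAPE) can be cross-checked against the mean values in Table 6. The short Python sketch below is only an arithmetic check; the dictionary layout and function name are illustrative choices, and the ± spreads are ignored.

```python
# Mean values transcribed from Table 6 (the ± spreads are omitted).
# Tuple order: (MAE in s, RMSE in s, MAPE in %, R2).
# Cases C-2 and C-3 are left out because Table 6 lists no SVM-1 values for them.
svm1 = {
    "C-1": (54, 88, 6.1, 0.47), "C-4": (37, 57, 7.3, 0.37), "C-5": (89, 140, 7.6, 0.43),
    "H-1": (27, 48, 3.1, 0.86), "H-2": (20, 35, 3.7, 0.79), "H-3": (43, 72, 3.5, 0.84),
}
svm2 = {
    "C-1": (21, 47, 2.4, 0.84), "C-4": (10, 25, 1.8, 0.87), "C-5": (24, 60, 1.9, 0.90),
    "H-1": (8, 19, 0.9, 0.97), "H-2": (6, 16, 1.2, 0.95), "H-3": (29, 56, 2.4, 0.90),
}


def improvement(case):
    """Gain of SVM-2 over SVM-1: error reductions and R2 increase."""
    a, b = svm1[case], svm2[case]
    return (a[0] - b[0], a[1] - b[1], round(a[2] - b[2], 1), round(b[3] - a[3], 2))


# Largest gain per metric across cases: (MAE, RMSE, MAPE, R2).
best = tuple(max(improvement(c)[k] for c in svm1) for k in range(4))
```

The largest MAE, RMSE and MAPE reductions all come from Case C-5, while the largest R2 gain comes from Case C-4.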

Fig. 10. Prediction accuracy with different data sizes.
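
The four error indicators reported in the figures and tables (MAE, RMSE, MAPE and R2) can be computed as in the minimal Python sketch below. It assumes the standard textbook definitions, with R2 taken as the coefficient of determination; the paper's own formulas are not reproduced in this section, and the function name and sample air ages are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """MAE, RMSE (same unit as the data), MAPE (%) and R2 for paired values.

    Assumes standard definitions; y_true must not contain zeros (MAPE)
    and must not be constant (R2).
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": 1.0 - ss_res / ss_tot}


# Hypothetical air ages in seconds, for illustration only.
metrics = regression_metrics([400, 500, 600, 700], [410, 490, 620, 680])
meets_threshold = metrics["MAPE"] < 5.0  # 5% MAPE acceptance threshold used in the study
```

MAE and RMSE scale with the magnitude of the air age, whereas MAPE and R2 are relative measures, which is why cases with different airflow rates can show similar MAPE/R2 values but different MAE/RMSE values.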

Case C-4, with the averaged MAE/RMSE/R2/MAPE values of 16 s/28 s/0.72/3.1%. For heating, the prediction errors are small, which are distributed within 5–12 s (MAE)/9–23 s (RMSE)/0.6%–1.0% (MAPE), and the R2 values are higher than 0.88. Overall, the MAPE values are all lower than 5%, showing a sufficient accuracy of Data Set 2.
Generally, Data Set 2 has the highest prediction accuracy for both cooling and heating, followed by Data Set 3, and then Data Set 1, indicating that the data selection method exerts a larger impact compared with the model selection.

4. Discussion

4.1. Data-driven model selection

From Fig. 8, the SVM model has improved the generalization performance of the ANN models. Different from the ANN models, the SVM model seeks to minimize an upper bound of the generalization error, rather than minimize training error [53]. SVM models can overcome the inherent drawbacks of ANN models, such as over-fitting training, local minima and poor generalization performance in cases with large initial data, leading to a better generalization capability, global optimization, and dimensional independence [53]. However, a previous study shows that ANN is more robust to noise than SVM [54]. With a data set of sufficient size and linear relationships between variables, the ANN model can also perform precise and quick prediction of IAQ compared with the SVM model [55]. Clearly, the data characteristics of the present study are more compatible with the SVM model.
Although the BPNN model consumes the least computation time, its accuracy is lower than that of the SVM model (see Figs. 8 and 9). The GABPNN models take at least three times as long as the SVM models, yet bring no improvement in prediction accuracy. The long computation time of the GABPNN is expected, as the population evolves over generations to obtain the optimal solution, i.e., the optimal initial weights for the present study [47]. According to Wang et al. [56], prediction accuracy


Fig. 11. Prediction errors with different data sets and cases.
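
The "similar Tsteady" data selection rule (60 points whose Tsteady values differ by less than 1 °C) followed by the 7:3 training/test split can be sketched in Python as below. This is a simplified, hypothetical illustration: the record layout, function names and the choice of the first qualifying window in sorted order are assumptions, whereas the study repeats the selection ten times from the 540 points in Matlab.

```python
import random


def select_similar_t(points, size=60, max_range=1.0):
    """Return `size` points whose T_steady values span less than `max_range` (°C).

    Assumption: the first qualifying window in ascending T_steady order is
    returned; None is returned if no such window exists.
    """
    ordered = sorted(points, key=lambda p: p["T_steady"])
    for start in range(len(ordered) - size + 1):
        window = ordered[start:start + size]
        if window[-1]["T_steady"] - window[0]["T_steady"] < max_range:
            return window
    return None


def split_train_test(points, train_fraction=0.7, seed=0):
    """Shuffle the points and split them at the 7:3 training/test ratio."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Synthetic stand-in for the 540 sampling points (temperatures are made up).
points = [{"id": i, "T_steady": 22.0 + 4.0 * i / 539} for i in range(540)]
subset = select_similar_t(points)
train, test = split_train_test(subset)
```

With 60 points, the 7:3 ratio gives 42 training and 18 test points.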

should be given priority when solving such problems. Overall, the computation time of the SVM models is acceptable. The data-driven model selection should be considered with the data set characteristics (e.g., data size, linear or non-linear relationships, data distribution) and the computation time.

4.2. Significance of ΔT and Δv

As can be observed in Fig. 8 and Table 6, adding the information of the Δv and ΔT to the vsteady and Tsteady as inputs improves the prediction accuracy, which is applicable to both cooling and heating. It can be inferred that the Δv and ΔT are related to the air freshness of the specific location in the room. This is because the Δv and ΔT reflect the impact of the air supply condition variation on the air distribution at a specific location. In other words, larger Δv and ΔT values imply that the air distribution responds more quickly to the changes in supply air conditions. It is thus understandable that the fresh air arrives faster at this location, resulting in a smaller τsteady. According to Pearson coefficient tests [25], the ΔT negatively correlates with the τsteady (p < 0.01, data size = 540), validating our explanation. However, the relationship may be weaker with a smaller data size or other data selection methods as the Δv and ΔT are also influenced by the heat source distributions.
Previous studies have observed the existence of stagnant zones in indoor environments, especially under mixing ventilation, where the fresh air is hard to reach. The pollutant concentration will be especially high in the stagnant zones [7], and the particles in the stagnant zones are difficult to ventilate out of the room [8]. However, a low air velocity and/or an air temperature that deviates more from the supply air temperature should not be taken as sufficient conditions for judging whether a location is a stagnant zone. Firstly, the heat sources are not evenly distributed in the room. Secondly, the buoyancy effect should be taken into consideration. The most reliable way is to visualize the airflow field. It is rather costly in application to calculate the air distribution in the whole room with either the experimental method using tracer gas, or the CFD method. This study illustrates that the Δv and ΔT are useful information to reflect


the air freshness, thus the IAQ. More importantly, they can be obtained easily and quickly by monitoring the air velocity and temperature for 10–20 min, as illustrated in Section 3.2. This finding helps identify the zones that are unsuitable for occupants to stay.

4.3. Comparison between cases

The comparison between cases is developed in three ways, i.e., the comparison between heating and cooling, the influence of the time length of a cycle, and the influence of airflow rate.
From Table 6, the 2nd feature set (i.e., vsteady, Tsteady, Δv and ΔT) predicts the τsteady with R2 values up to 0.97, indicating a close relationship between the air distribution and τsteady for both heating and cooling conditions. Fig. 9 also shows that the computation times for heating and cooling are very close, with a deviation of less than 10%. The mechanism of mixing ventilation is mixing indoor air with fresh air while exhausting indoor air in both summer and winter [10]. The prediction accuracy under heating is slightly better than that under cooling, with the R2 increase of up to 0.13 with the same airflow rate using SVM-2 (see Table 6). This is because achieving good mixing under heating is more difficult than under cooling [10], as validated by a better prediction performance of heating cases over cooling cases, with the SVM-1 model. Overall, the SVM-2 model can be applied to predict the τsteady under both heating and cooling with sufficient accuracy.
In Table 6, the prediction accuracy of Case C-1 is slightly higher than Case C-2, while that of Case C-3 is lower than C-2. This is because the air temperature and velocity variations with a longer time length of a complete cycle are generally larger, providing more information on the airflow motion, as well as providing a more divided dataset for the model. Further, with a longer time length of a complete cycle, the influence brought by the lag time from air supply condition change to the indoor air distribution change is decreased [27]. Therefore, a longer time length results in more accurate prediction performance. However, the differences in prediction performance between Cases C-1 and C-2 are small, with the deviation of 1 s/1 s/0.01/0.1% (i.e., MAE/RMSE/R2/MAPE) with SVM-2. It demonstrates that, when the time length of a complete cycle increases to a certain point, further improvement by increasing the time length of a complete cycle is limited. This is because the prediction accuracy is not dependent solely on the Δv and ΔT values. Still, the prediction accuracy of Case C-2 is 20 s/48 s/0.83/2.3% (i.e., MAE/RMSE/R2/MAPE) with SVM-2, indicating that a time length of 600 s is sufficient for such predictions.
Table 6 shows that the MAPE and R2 values among Cases C-1, C-4 and C-5 are close to each other, with the maximum deviation of 0.6% and 0.06. However, for MAE and RMSE, the values of Case C-5 are higher than Case C-1, followed by Case C-4. This is due to the different air age values and ranges with different air change rates. A higher airflow rate results in a smaller τsteady value [10]. Therefore, even with a similar MAPE/R2 value, the case with a higher airflow rate yields a lower MAE/RMSE value. For heating, the airflow rate of Case H-3 is lower than the other two cases, leading to higher τsteady values and higher MAE/RMSE values. With SVM-2, the largest error deviations among the three cases are 23 s/40 s (i.e., MAE/RMSE). The R2 and MAPE values among Cases H-1, H-2 and H-3 are all no lower than 0.90 and lower than 2.5%, respectively. Therefore, a higher airflow rate results in lower MAE/RMSE values, but exerts an insignificant influence on the MAPE and R2 values. The developed SVM-2 model thus can predict τsteady regardless of different airflow rates.

4.4. Data selection

The data selection refers to two aspects for the present study, i.e., the data size and data selection method. Fig. 10 illustrates that the prediction power is improved by enlarging the total data size, with a fixed ratio of training data to test data. We use the MAPE value of 5% as the acceptance threshold, which is stricter than about 20% and 10% in previous studies [36,57]. With a random selection from 540 data points, the data size of 180 meets the threshold standard for both heating and cooling (see Fig. 10). However, with data selection according to certain rules, the data size of 60 also meets the threshold standard (see Fig. 11). In Section 3.4, for both cooling and heating, Data Set 2 has the highest prediction accuracy, followed by Data Set 3, and then Data Set 1, indicating that the data selection method exerts a larger impact compared with the data-driven model selection. For the present study, it is found that the data sets of the similar Tsteady (|max − min| < 1 °C) better reflect the relationships between the ΔT and Δv, and the τsteady.

4.5. Limitations and further study

Above all, this study is preliminary, and the aims are to identify the effective inputs and the model with a relatively accurate prediction performance. It should be noted that the proposed strategy may be applied to stratified environments accompanied by over-qualifying problems. The advanced ventilation methods, including displacement ventilation [4], stratum ventilation [5], impinging jet ventilation [6], etc., are proved to be more energy-efficient than mixing ventilation. Further, the IAQ predictions are easier with simpler algorithms and fewer inputs, as illustrated in the Introduction. However, there are a massive number of existing buildings installed with mixing ventilation, as well as scenarios where advanced ventilation methods are inapplicable due to room features. Vigorous effective IAQ control may make the ventilation performance of mixing ventilation closer to those of advanced ventilation methods.
Fig. 12 shows the application of the proposed model to practical use in indoor environment monitoring. At the beginning of the process, the measurements of τsteady, vsteady, Tsteady, ΔT and Δv are needed as the training data to establish the model. The problem is that the measurements of τsteady at dozens of points are rather costly. The alternative and the prospect are to use the pollutant concentration to replace the τsteady, as the pollutant concentration can be read in real time. As the τsteady and the pollutant concentration have a close relationship [58], we are confident about the success of using vsteady, Tsteady, ΔT and Δv to predict the non-uniformly distributed pollutant concentration. In that case, the pollutant characteristics are needed, which may be used as inputs. Because of the limitless possibilities of the pollutant characteristics, the hyperparameters of the algorithms may be revised to adapt to more inputs.
Another limitation is that we only consider the heating and cooling conditions while neglecting transition seasons/isothermal air supply. The influence factors of the non-uniform air distribution mainly include the heat transfer and indoor partitions. Studies have pointed out that the non-uniformity of air distribution is reduced with low heat/cooling loads [10,59]. The parameters of Tsteady and ΔT are also invalid facing the isothermal air supply. Therefore, the prediction accuracies are deduced to decline in the heat transfer aspects. As for the complex indoor partitions, the vsteady and Δv are still effective. Furthermore, a perfect mixing condition may not need to solve non-uniformity in the first place. We only compare three data-driven methods, i.e., BPNN, GABPNN and SVM. There may be a better method with higher accuracy accompanied by acceptable computation time. Further studies can be conducted to explore the performances among different data-driven methods, especially those that do not belong to ANNs [37].
Currently, we are focusing on the IAQ distribution in a single room, but the results of this study may potentially be applied to other occasions as well. For example, in a building with multiple rooms applying variable air volume systems, it is difficult to determine the IAQ levels between different rooms by monitoring the steady state temperature. The information of ΔT and Δv may help determine the rate at which fresh air arrives. An obstacle to practical application is that data-driven models require a certain quantity of data. Even a small data size of 60 in Section


Fig. 12. Conceptual plan for further research.

3.4 is difficult to achieve for small-scale rooms/buildings. Solutions for this include 1) discovering more effective input data, 2) developing mechanistic models or grey models and 3) relaxing the restrictions for accuracy. The locations, humidity and radiation may be effective inputs for air quality [20,48,51]. Grey models to predict dynamic variations using supply air conditions are also developed [27]. Further studies should be conducted on the relevant issues.

5. Conclusions

The main conclusions are drawn as follows:

1. SVM models perform better than the BPNN and GABPNN models when predicting the τsteady, improving up to 27 s/28 s/0.37/3.2% (MAE/RMSE/R2/MAPE) for cooling and 15 s/42 s/0.21/1.9% (MAE/RMSE/R2/MAPE) for heating. The computation time of SVM is medium compared with BPNN and GABPNN models.
2. The ΔT and Δv with a pulsating air supply significantly improve the prediction accuracy of τsteady, based on the Tsteady and vsteady, with the improvement of up to 65 s/80 s/0.50/5.7% (i.e., MAE/RMSE/R2/MAPE), indicating that the ΔT and Δv are related to the air freshness at a specific point in the room, for both heating and cooling.
3. The prediction accuracy with heating is slightly better than that with cooling, as achieving good mixing under heating is more difficult than that under cooling.
4. A 10-min pulsating air supply cycle is adequate to collect the ΔT and Δv for the studied multi-occupant office, with the R2 value higher than 0.80.
5. A higher airflow rate results in lower MAE/RMSE values but an insignificant impact on the MAPE/R2 values. The developed SVM-2 model can predict τsteady regardless of different airflow rates, with a sufficient accuracy.
6. When the data points have a similar Tsteady, the SVM-2 model can still determine the τsteady of small data size (i.e., 60) with a sufficient accuracy, adding the evidence that the ΔT and Δv help reveal the inapparent difference in IAQ, and approaching engineering applications.

We primarily identify the effective inputs and the model with a relatively accurate prediction performance. Further studies can be conducted to 1) explore the performances among other data-driven methods, 2) investigate the pollutant concentration combined with the pollutant characteristics, 3) apply the research approach to a building with multiple rooms and 4) further decrease data size.

CRediT authorship contribution statement

Xue Tian: Writing – original draft, Conceptualization. Yuchun Zhang: Investigation. Zhang Lin: Writing – review & editing, Supervision, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The work described in this paper is supported by General Research Grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU 11208220).

References

[1] K.-N. Rhee, M.-S. Shin, S.-H. Choi, Thermal uniformity in an open plan room with an active chilled beam system and conventional air distribution systems, Energy Build. 93 (2015) 236–248.
[2] M. Krajčík, A. Simone, B.W. Olesen, Air distribution and ventilation effectiveness in an occupied room heated by warm air, Energy Build. 55 (2012) 94–101.
[3] H.B. Awbi, Ventilation and air distribution systems in buildings, Front. Mech. Eng. 1 (2015) 4.
[4] H. Ahn, D. Rim, L.J. Lo, Ventilation and energy performance of partitioned indoor spaces under mixing and displacement ventilation, Build. Simulat. 11 (3) (2018) 561–574.
[5] X. Kong, et al., A comparative experimental study on the performance of mixing ventilation and stratum ventilation for space heating, Build. Environ. 157 (2019) 34–46.
[6] L. Wang, et al., Numerical comparison of the efficiency of mixing ventilation and impinging jet ventilation for exhaled particle removal in a model intensive care unit, Build. Environ. 200 (2021), 107955.
[7] K.-C. Noh, J.-S. Jang, M.-D. Oh, Thermal comfort and indoor air quality in the lecture room with 4-way cassette air-conditioner and mixing ventilation system, Build. Environ. 42 (2) (2007) 689–698.
[8] G. Xu, J. Wang, CFD modeling of particle dispersion and deposition coupled with particle dynamical models in a ventilated room, Atmos. Environ. 166 (2017) 300–314.
[9] H. Lee, H.B. Awbi, Effect of internal partitioning on indoor air quality of rooms with mixing ventilation-basic study, Build. Environ. 39 (2) (2004) 127–141.
[10] H. Amai, A. Novoselac, Experimental study on air change effectiveness in mixing ventilation, Build. Environ. 109 (2016) 101–111.


[11] R. Tomasi, et al., Experimental evaluation of air distribution in mechanically [35] S. Benni, et al., Efficacy of greenhouse natural ventilation: environmental
ventilated residential rooms: thermal comfort and ventilation effectiveness, Energy monitoring and CFD simulations of a study case, Energy Build. 125 (2016)
Build. 60 (2013) 28–37. 276–286.
[12] Refrigerating, A.-C. Engineers, A.N.S. Institute, Ventilation for Acceptable Indoor [36] C. Ding, K.P. Lam, Data-driven model for cross ventilation potential in high-density
Air Quality, vol. 62, American Society of Heating, Refrigerating and Air- cities based on coupled CFD simulation and machine learning, Build. Environ. 165
Conditioning Engineers, 2001. (2019), 106394.
[13] M. Krajčík, et al., Experimental study including subjective evaluations of mixing [37] N. Ma, et al., Measuring the right factors: a review of variables and models for
and displacement ventilation combined with radiant floor heating/cooling system, thermal comfort and indoor air quality, Renew. Sustain. Energy Rev. 135 (2021),
HVAC R Res. 19 (8) (2013) 1063–1072. 110436.
[14] D. Gonçalves, et al., One step forward toward smart city Utopia: smart building [38] O.I. Abiodun, et al., State-of-the-art in artificial neural network applications: a
energy management based on adaptive surrogate modelling, Energy Build. 223 survey, Heliyon 4 (11) (2018), e00938.
(2020), 110146. [39] J.-Z. Wang, et al., Forecasting stock indices with back propagation neural network,
[15] S. Zhang, Z. Ai, Z. Lin, Occupancy-aided ventilation for both airborne infection risk Expert Syst. Appl. 38 (11) (2011) 14346–14355.
control and work productivity, Build. Environ. 188 (2021), 107506. [40] J. Li, et al., Brief introduction of back propagation (BP) neural network algorithm
[16] B. Khazaei, A. Shiehbeigi, A.R. Haji Molla Ali Kani, Modeling indoor air carbon and its improvement, in: Advances in Computer Science and Information
dioxide concentration using artificial neural network, Int. J. Environ. Sci. Technol. Engineering, Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
16 (2) (2019) 729–736. [41] J.W. Moon, et al., Determining optimum control of double skin envelope for indoor
[17] W. Wei, et al., Machine learning and statistical models for predicting indoor air thermal environment based on artificial neural network, Energy Build. 69 (2014)
quality, Indoor Air 29 (5) (2019) 704–726. 175–183.
[18] S. Park, et al., Predicting PM10 concentration in Seoul metropolitan subway [42] J.W. Moon, K. Kim, H. Min, ANN-based prediction and optimization of cooling
stations using artificial neural network (ANN), J. Hazard Mater. 341 (2018) 75–82. system in hotel rooms, Energies 8 (10) (2015) 10775–10795.
[19] S. Zhang, et al., Modeling non-uniform thermal environment of stratum ventilation [43] Y. Lu, et al., Data augmentation strategy for short-term heating load prediction
with supply and exit air conditions, Build. Environ. 144 (2018) 542–554. model of residential building, Energy 235 (2021), 121328.
[20] X. Tian, Y. Cheng, Z. Lin, Modelling indoor environment indicators using artificial [44] Y.H. Zweiri, J.F. Whidborne, L.D. Seneviratne, A three-term backpropagation
neural network in the stratified environments, Build. Environ. (2021), 108581. algorithm, Neurocomputing 50 (2003) 305–318.
[21] S. Zhang, et al., Energy performance index of air distribution: thermal utilization [45] T. Li, et al., Genetic algorithm for building optimization: state-of-the-art survey, in:
effectiveness, Appl. Energy 307 (2022), 118122. Proceedings of the 9th International Conference on Machine Learning and
[22] J. Zhao, et al., Theoretical expression for clean air volume in cleanrooms with non- Computing, Association for Computing Machinery, Singapore, Singapore, 2017,
uniform environments, Build. Environ. 204 (2021), 108168. pp. 205–210.
[23] A. Keblawi, N. Ghaddar, K. Ghali, Model-based optimal supervisory control of [46] R. Rana, et al., Feasibility analysis of using humidex as an indoor thermal comfort
chilled ceiling displacement ventilation system, Energy Build. 43 (6) (2011) predictor, Energy Build. 64 (2013) 17–25.
1359–1370. [47] S. Wang, et al., Wind speed forecasting based on the hybrid ensemble empirical
[24] Y. Sheikhnejad, et al., Can buildings be more intelligent than users?- the role of mode decomposition and GA-BP neural network method, Renew. Energy 94 (2016)
intelligent supervision concept integrated into building predictive control, Energy 629–636.
Rep. 6 (2020) 409–416. [48] B. Gong, J. Ordieres-Meré, Prediction of daily maximum ozone threshold
[25] F. Cheng, et al., Experimental study of thermal comfort in a field environment exceedances by preprocessing and ensemble artificial intelligence techniques: case
chamber with stratum ventilation system in winter, Build. Environ. 207 (2022), study of Hong Kong, Environ. Model. Software 84 (2016) 290–303.
108445. [49] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, ACM Trans.
[26] X. Tian, et al., Experimental study of local thermal comfort and ventilation Intell. Syst. Technol. (TIST) 2 (3) (2011) 1–27.
performance for mixing, displacement and stratum ventilation in an office, Sustain. [50] J. Wainer, P. Fonseca, How to tune the RBF SVM hyperparameters? An empirical
Cities Soc. 50 (2019), 101630. evaluation of 18 search algorithms, Artif. Intell. Rev. 54 (6) (2021) 4771–4797.
[27] X. Tian, Z. Lin, Dynamic modelling of air temperature in breathing zone with [51] J. Kim, et al., Personal comfort models: predicting individuals’ thermal preference
stratum ventilation using a pulsating air supply, Build. Environ. 210 (2022), using occupant heating and cooling behavior and machine learning, Build.
108697. Environ. 129 (2018) 96–106.
[28] V. Chanteloup, P.-S. Mirade, Computational fluid dynamics (CFD) modelling of [52] C. Chen, D. Lai, Q. Chen, Energy analysis of three ventilation systems for a large
local mean age of air distribution in forced-ventilation food plants, J. Food Eng. 90 machining plant, Energy Build. 224 (2020), 110272.
(1) (2009) 90–103. [53] B. Yeganeh, et al., Prediction of CO concentrations based on a hybrid partial least
[29] Y. Cheng, Z. Lin, Experimental investigation into the interaction between the square and support vector machine model, Atmos. Environ. 55 (2012) 357–365.
human body and room airflow and its effect on thermal comfort under stratum [54] J. Li, An empirical comparison between SVMs and ANNs for speech recognition, in:
ventilation, Indoor Air 26 (2) (2016) 274–285. The First Instructional Conf. On Machine Learning, 2003.
[30] X. Shao, et al., Potential of stratum ventilation to satisfy differentiated comfort [55] Z. Liu, et al., Exploring the potential relationship between indoor air quality and
requirements in multi-occupied zones, Build. Environ. 143 (2018) 329–338. the concentration of airborne culturable fungi: a combined experimental and
[31] S. Liang, et al., Determining optimal parameter ranges of warm supply air for neural network modeling study, Environ. Sci. Pollut. Control Ser. 25 (4) (2018)
stratum ventilation using Pareto-based MOPSO and cluster analysis, J. Build. Eng. 3510–3517.
37 (2021), 102145. [56] Z. Wang, et al., Random Forest based hourly building energy prediction, Energy
[32] Y. Cheng, et al., Optimization on fresh outdoor air ratio of air conditioning system Build. 171 (2018) 11–25.
with stratum ventilation for both targeted indoor air quality and maximal energy [57] M. Macas, et al., The role of data sample size and dimensionality in neural network
saving, Build. Environ. 147 (2019) 11–22. based forecasting of building heating related variables, Energy Build. 111 (2016)
[33] C. Wu, N.A. Ahmed, A novel mode of air supply for aircraft cabin ventilation, Build. 299–310.
Environ. 56 (2012) 47–56. [58] X. Tian, et al., Multi-indicator evaluation on ventilation effectiveness of three
[34] T. van Hooff, B. Blocken, Mixing ventilation driven by two oppositely located ventilation methods: an experimental study, Build. Environ. 180 (2020), 107015.
supply jets with a time-periodic supply velocity: a numerical analysis using [59] J. Fan, C.A. Hviid, H. Yang, Performance analysis of a new design of office diffuse
computational fluid dynamics, Indoor Built Environ. 29 (4) (2019) 603–620. ceiling ventilation system, Energy Build. 59 (2013) 73–81.
