
A Comparison Study of Outpatient Visits Forecasting Effect between ARIMA with Seasonal Index and SARIMA

Zhang Xinxiang Zhou Bo Fu Huijuan

School of Information & Safety Engineering, Zhongnan University of Economics and Law, Wuhan, China

978-1-5386-1978-0/17/$31.00 ©2017 IEEE

Abstract—This paper presents a case study that analyzes and forecasts the outpatient visit frequency of a hospital in Zhengzhou, China. Using the outpatient data for the whole of 2015, the paper adopts the "Day" as the timescale and forecasts the number of visiting patients while taking the impact of the "Week" into consideration. Two models are used separately: the Autoregressive Integrated Moving Average (ARIMA) model with seasonal index and the Seasonal Autoregressive Integrated Moving Average (SARIMA) model. A comparison of the fitting effect and forecasting effect of the two models shows that SARIMA reaches a satisfactory outcome, with the better value on every indicator considered. It is therefore preferable to use the SARIMA model to forecast outpatient visits for medical institutions. The paper also aims to give the management of medical institutions a theoretical basis for work and personnel arrangement, and insight for making prompt and reasonable contingency plans when sudden disease occurs.

Keywords: SARIMA Model; Outpatient visits; Seasonal index; Forecasting

I. INTRODUCTION
How to allocate medical resources reasonably and how to respond to sudden disease in a timely manner have become long-term challenges for the management of medical institutions, given the rising tension over, and obvious shortage of, domestic medical resources [1]. Because the outpatient department occupies a vital position within a medical institution, applying an efficient model to forecast outpatient visits is of great practical and scientific significance.
In previous studies, many scholars at home and abroad have applied various forecasting models to different outpatient data. Examples include studies based on time series models and their subsequent improvement [2],[3],[4]; studies based on combinations of models [5]; and studies based on comparing the forecasting effect of models [6],[7]. All of these studies chose the "Week" or "Month" as the timescale for data processing, since the fitting effect is better demonstrated at these scales and the most suitable models can thus be selected and evaluated. Nevertheless, such coarse timescales may not be the best choice for short-term forecasting, and the results might fail to give managers theoretical support for strategy arrangement or for monitoring sudden disease. In view of this problem, this paper selects the "Day" as the timescale, which allows precise forecasting of outpatient visits over a short period. The forecasting result thus serves both as an early warning and as practical guidance for administrators to adjust strategy and make arrangements accordingly. Because the selected data are daily counts, a cyclical influence of the "Week" factor emerges, so it is appropriate to take the seasonal index into consideration. Building on combined time series models, which have been used productively in forecasting research, an ARIMA model with seasonal index and a SARIMA model are established. By comparing the fitting effect of the two models, the better forecasting model is identified so as to obtain more accurate forecasts of outpatient visits.

II. METHODS
A. ARIMA Model with Seasonal Index
The autoregressive integrated moving average (ARIMA) model, a commonly used stochastic time series model, is one of the classical time series models. Formulated by Box and Jenkins [10], it rests on the hypothesis that future values are linearly related to several past observations and to white noise. It is therefore suited to time series that contain a trend. The model is denoted ARIMA(p, d, q), where p is the autoregressive order, d is the order of differencing, and q is the moving average order. Its formula is as follows:

$$\varphi(B)\nabla^d y(t) = \theta(B)\varepsilon(t) \tag{1}$$

with

$$\varphi(B) = 1 - \varphi_1 B - \cdots - \varphi_p B^p$$
$$\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$$
$$\nabla^d y(t) = (1 - B)^d y(t)$$

In the formula, $B^k$ is the lag operator of order k; $\nabla$ is the difference operator; $\varphi(B)$ is the autoregressive polynomial of order p with autoregressive parameters $\varphi_1, \varphi_2, \dots, \varphi_p$; $\theta(B)$ is the moving average polynomial of order q with moving average parameters $\theta_1, \theta_2, \dots, \theta_q$; $\nabla^d y(t)$ is the stationary series obtained after differencing; and $\varepsilon(t)$ is a sequence of mutually independent white noise terms that obeys the Gaussian
distribution.
The seasonal index method describes how the seasonal periodic fluctuation of a time series affects the forecast. Multiplying the trend series obtained from the ARIMA model by the seasonal index gives the final forecasting model. The seasonal index is calculated as follows:

$$\alpha_i = \frac{m_i}{M} \quad (i = 1, \dots, n) \tag{2}$$

In the formula, n is the total number of seasons and $\alpha_i$ is the index of season i. The average for season i is $m_i$, and M is the overall average across all seasons. To make the seasonal indices average to 1, they are normalized as follows:

$$\hat{\alpha}_i = \frac{n\alpha_i}{\sum_{i=1}^{n}\alpha_i} \quad (i = 1, \dots, n) \tag{3}$$

According to formulas (1) and (3), the ARIMA model with seasonal index is defined as follows:

$$\varphi(B)\nabla^d y(t)\,\hat{\alpha}_i = \theta(B)\varepsilon(t) \quad (i = 1, \dots, n) \tag{4}$$

B. SARIMA Model
A time series with obvious periodic fluctuation caused by seasonal change (quarterly, monthly, weekly, etc.) is referred to as a seasonal time series. To deal with this kind of series, we can adopt the Seasonal ARIMA model, denoted SARIMA$(p, d, q)(P, D, Q)_S$, in which p and P are the autoregressive and seasonal autoregressive orders, respectively; d and D are the orders of differencing and seasonal differencing, respectively; and q and Q are the moving average and seasonal moving average orders, respectively. Its formula is as follows:

$$\varphi(B)\Phi(B^S)\nabla^d\nabla_S^D y(t) = \theta(B)\Theta(B^S)\varepsilon(t) \tag{5}$$

where

$$\Phi(B^S) = 1 - \Phi_1 B^S - \cdots - \Phi_P B^{PS}$$
$$\Theta(B^S) = 1 + \Theta_1 B^S + \cdots + \Theta_Q B^{QS}$$
$$\nabla_S^D = (1 - B^S)^D$$

$B$, $\nabla$, $y(t)$, $\varphi(B)$, $\theta(B)$ and $\varepsilon(t)$ have the same meaning as in the ARIMA model with seasonal index. $\Phi(B^S)$ is the seasonal autoregressive polynomial with seasonal autoregressive parameters $\Phi_1, \Phi_2, \dots, \Phi_P$; $\Theta(B^S)$ is the seasonal moving average polynomial with seasonal moving average parameters $\Theta_1, \Theta_2, \dots, \Theta_Q$; and $\nabla_S^D$ is the seasonal difference of order D.
Modeling and parameter estimation are carried out with the EViews statistics software.

C. Periodicity Analysis and Stability Test
The original time series is grouped on the weekly timescale, and analysis of variance is then used to assess how significant the impact of the "Week" factor is. The null hypothesis $H_0$ is that the effect of the "Week" factor is not significant. If the p-value of the test (interpreted here, and hereinafter, as the probability of accepting the null hypothesis) is below the significance level (0.05), the null hypothesis is rejected; that is, the effect of the "Week" factor on the sequence is clearly cyclical. Otherwise, the null hypothesis stands and the sequence is little affected by the "Week" factor.
This paper uses the ADF unit root test to determine the stationarity of the original sequence, and thereby to evaluate its suitability for the models in this paper. The null hypothesis $H_0$ of the ADF unit root test is that the series contains a unit root. If the p-value for the original time series is less than the significance level (0.05), the null hypothesis is rejected: there is no unit root and the series is stationary. If the null hypothesis cannot be rejected, the series is differenced one order at a time and re-tested until the null hypothesis is rejected and stationarity is reached. If the null hypothesis still cannot be rejected, the series is a unit root sequence and is not suitable for the models in this paper.
In this paper, EViews is used for both the analysis of variance of the time series and the ADF unit root test.

D. Model Selection Criteria
1) White Test
Heteroscedasticity in the random error term of a model affects the forecasting accuracy of the model. The White test is used to determine whether heteroscedasticity is present, and thus to select the model.
The White test runs a heteroscedasticity test using a $\chi^2$ statistic constructed by means of an auxiliary regression. For the random error sequence of a model, if the null hypothesis is rejected, there is no heteroscedasticity; if the null hypothesis is accepted, heteroscedasticity is shown to exist, which in turn affects the forecasting of the model, and the model is rejected.
2) Criteria of $\bar{R}^2$
$R^2$ is one of the important measures of the goodness of fit of a regression model, with values ranging from 0 to 1. The closer the value is to 1, the better the fit of the model. Nevertheless, when $R^2$ is used as the measure, a minor variable should not be added to the model merely because it increases $R^2$, since it could bring about a decrease in
the degrees of freedom of the model.
This paper therefore uses the adjusted version, $\bar{R}^2$, as the measure. It is defined as follows:

$$\bar{R}^2 = 1 - (1 - R^2)\frac{n-1}{n-k-1} \tag{6}$$

Here n is the sample size and k is the number of explanatory variables. This adjustment can, to a certain extent, avoid the effect of minor variables.
3) Criteria of AIC and SC
AIC refers to the Akaike information criterion and SC to the Schwarz criterion. Both are used to balance the complexity of the estimated model against its goodness of fit, and both follow a similar line of thinking: a penalty is attached to the independent variables in the model so as to limit the influence of added variables.
For model selection, the chosen model must pass the White test and should have a larger $\bar{R}^2$ value and smaller AIC and SC values.

III. MODEL BUILDING AND ANALYSIS
A. Data Sources
The data in this paper are derived from the 2015 outpatient data of a hospital in Zhengzhou, Henan Province. The data from January to November are selected as training data, and the data for December are used as test data. The former are used to build the models, while the latter are used to evaluate their prediction effect.
Fig. 1 is a plot of outpatient visits for the hospital from January 2015 to November 2015.

Fig. 1. Time series plot of outpatient visits (01-01-2015 to 30-11-2015)

B. Model Building
1) Periodicity Analysis
Analysis of variance is performed on the outpatient visits grouped on the weekly timescale, and the p-value of the result is far less than 0.05. It is therefore concluded that the "Week" factor has a significant impact on outpatient visits and that the sequence shows a clear periodic pattern with S = 7. Table I shows the "Week" factor ($\hat{\alpha}_i$) quantified by the seasonal index method. To handle the periodicity of the original sequence, this paper adopts two models, namely the ARIMA model with seasonal index and the SARIMA model.

TABLE I. Factor of Week

Week         Index $\hat{\alpha}_i$
Monday       1.3234
Tuesday      1.1018
Wednesday    1.0579
Thursday     1.0211
Friday       0.9838
Saturday     0.8428
Sunday       0.6692

2) Stationary test
Two sequences are used to build the ARIMA model and the SARIMA model, respectively: one takes the first-order difference after the "Week" factor has been eliminated, and the other takes the first-order seasonal difference.
EViews is used to draw their autocorrelation (AC) and partial autocorrelation (PAC) charts (shown in Fig. 2 and Fig. 3). By observing and comparing the autocorrelation and partial autocorrelation coefficients and where they cut off, the models ARIMA(3,1,1) and SARIMA$(1,0,2)(1,1,1)_7$ are identified.

Fig. 2. The plot of AC and PAC of series used by ARIMA
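
As a hedged illustration of equations (2) and (3) and of the two differencing schemes described above, the following plain-Python sketch computes normalized seasonal indices for a period of S = 7 and then applies a first-order difference and a seasonal difference. The function names and the toy series are assumptions made for illustration, as is the choice to remove the "Week" factor by dividing each observation by its index; the paper's own computations were carried out in EViews.

```python
from statistics import mean


def seasonal_indices(series, period=7):
    """Normalized seasonal indices, following equations (2) and (3)."""
    overall = mean(series)                        # M: average over all seasons
    raw = []
    for i in range(period):
        season_values = series[i::period]         # observations of season i
        raw.append(mean(season_values) / overall)  # alpha_i = m_i / M
    total = sum(raw)
    return [period * a / total for a in raw]      # indices now average to 1


def difference(series, lag=1):
    """Lag-`lag` difference: lag=1 is (1 - B), lag=7 is (1 - B^7)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy data (hypothetical): two weeks of daily visits, Monday first.
visits = [130, 110, 105, 100, 98, 85, 65,
          134, 112, 107, 104, 99, 83, 68]

indices = seasonal_indices(visits)

# Assumption: the "Week" factor is removed by dividing by the index.
deseasonalized = [v / indices[t % 7] for t, v in enumerate(visits)]

arima_input = difference(deseasonalized, lag=1)  # first-order difference (d = 1)
sarima_input = difference(visits, lag=7)         # seasonal difference (D = 1, S = 7)
```

With real data, the resulting indices would play the role of the Table I values, and the two differenced series are the ones whose AC and PAC charts would be inspected to choose the model orders.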


Fig. 3. The plot of AC and PAC of series used by SARIMA

With EViews used to estimate the parameters of the above models, the detailed expressions of the models are as follows.
The ARIMA with seasonal index:

$$(1 + 0.5222B + 0.3653B^2 + 0.3182B^3)(1 - B)\,y(t)\,\hat{\alpha}_i = (1 - 0.2382B)\,\varepsilon(t) \quad (i = 1, \dots, n) \tag{7}$$

The SARIMA:

$$(1 - 0.8632B)(1 + 0.0938B^7)(1 - B^7)\,y(t) = (1 + 0.2643B + 0.1646B^2)(1 + 0.9511B^7)\,\varepsilon(t) \tag{8}$$

Fig. 4 shows the comparison of the forecast values of the two models with the actual values.

Fig. 4. The comparison of forecasting and actual values

C. Model analysis and performance evaluation
Table II reports the basic statistical indicators of the two models. In terms of heteroscedasticity, both models passed the White test, which means that the random error sequence of each model shows no heteroscedasticity, so the influence of heteroscedasticity on prediction accuracy can be excluded. In terms of goodness of fit, the $\bar{R}^2$ value of SARIMA is greater than that of ARIMA with seasonal index, indicating the superior fitting effect of SARIMA. In terms of the AIC and SC criteria and the residual sum of squares, the values for SARIMA are all smaller than those for ARIMA with seasonal index.

TABLE II. Basic statistical indicators of models

Model                      ARIMA with seasonal index    SARIMA
White test (P value)       0.0158                       0.0002
$\bar{R}^2$                0.340441                     0.379476
AIC                        13.96273                     13.73797
SC                         14.01716                     13.79183
Residual sum of squares    23083039                     19025094

Table III presents the forecasting performance of the two models. A side-by-side comparison shows that every error indicator of SARIMA is smaller than that of ARIMA with seasonal index, so the SARIMA model fits the real values better.
Moreover, the correlation coefficient between the forecast and the real values is larger for SARIMA (0.8627) than for ARIMA with seasonal index (0.8486), further confirming the better fit of the SARIMA model.

TABLE III. Model prediction outcome and performance evaluation

days                                         Real value    ARIMA with seasonal index    SARIMA
Average Absolute Deviation (AAD)             --            172.65                       169.26
Average Absolute Relative Deviation (AARD)   --            9.573%                       9.310%
Mean Squared Error                           --            206.538                      201.512
Correlation coefficient                      --            0.8486                       0.8627
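
The indicators in Table III can be sketched in plain Python as follows. The paper does not state the formulas it used, so the definitions below are the usual ones and should be read as assumptions: AAD as the mean absolute error, AARD as the mean absolute error relative to the actual value (in percent), the mean squared error as the mean of squared errors, and the correlation coefficient as the Pearson coefficient. The function names and sample numbers are hypothetical.

```python
from math import sqrt


def aad(actual, forecast):
    """Average absolute deviation: mean of |actual - forecast|."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def aard(actual, forecast):
    """Average absolute relative deviation, in percent of the actual value."""
    ratios = [abs(a - f) / a for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def mse(actual, forecast):
    """Mean of the squared forecast errors."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily visits and forecasts, for illustration only.
actual = [100, 200, 300]
forecast = [110, 190, 330]

print(aad(actual, forecast), aard(actual, forecast),
      mse(actual, forecast), pearson(actual, forecast))
```

Applied to the December test data, each function would be called once per model so that the two columns of the table can be compared directly.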



IV. CONCLUSION
Health care remains one of the most salient livelihood issues. On August 19th, 2016, in a keynote speech at the National Health Conference, President Xi Jinping strongly emphasized the prospect of applying big data to health care and the medical industry. By selecting the "Day" as the timescale and evaluating daily data, this paper addresses the gap in forecasting timescale in the medical field.
More accurate and precise short-term forecasts can thus be obtained, which in turn provide theoretical support for the re-arrangement of medical personnel and for adequate preparation against outbreaks of sudden disease.
The ARIMA model has been widely used in the modeling of time series and has proved to be highly reliable and accurate in the forecasting field. This paper has applied two models separately to outpatient visits: the ARIMA model with seasonal index and the SARIMA model. The "Week" factor has been taken into account throughout. By comparing the experimental results with real data, the quality of the models has been assessed and the conclusion reached that the SARIMA model outperforms its counterpart on every indicator considered. The SARIMA model shows a favorable fitting effect when forecasting future outpatient visits, and can therefore inform future staffing decisions and support real-time surveillance of disease.
In addition to the "Week" factor, many other factors can influence the number of outpatient visits in real life, such as changes in weather, season and environment. In future research, the authors intend to add those factors so as to improve the forecasting model and obtain more reliable and precise results, with the aim of better serving medical institutions.

ACKNOWLEDGMENT
The authors are grateful for the financial support from the National Social Science Foundation of China (Grant No. 16BGL192). Furthermore, the study is a substantial work of the Graduate Innovative Education Program of Zhongnan University of Economics and Law (2017Y1411).

REFERENCES
[1] Jiang S, Chin K S, Wang L, et al. Modified Genetic Algorithm-based Feature Selection Combined with Pre-trained Deep Neural Network for Demand Forecasting in Outpatient Department[J]. Expert Systems with Applications, 2017.
[2] Cheng C H, Wang J W, Li C H. Forecasting the number of outpatient visits using a new fuzzy time series based on weighted-transitional matrix[J]. Expert Systems with Applications, 2008, 34(4):2568-2575.
[3] Wang Y, Gu J, Zhou Z, et al. Diarrhoea outpatient visits prediction based on time series decomposition and multi-local predictor fusion[J]. Knowledge-Based Systems, 2015, 88(C):12-23.
[4] Bergs J, Heerinckx P, Verelst S. Knowing what to expect, forecasting monthly emergency department visits: A time-series analysis[J]. International Emergency Nursing, 2013, 22(2):112-115.
[5] Nobre F F, Monteiro A B, Telles P R, Williamson G D. Dynamic linear model and SARIMA: a comparison of their forecasting performance in epidemiology[J]. Statistics in Medicine, 2001, 20(20):3051-69.
[6] Cao Q, Ewing B T, Thompson M A. Forecasting medical cost inflation rates: A model comparison approach[J]. Decision Support Systems, 2012, 53(1):154-160.
[7] Araz O M, Bentley D, Muelleman R L. Using Google Flu Trends data in forecasting influenza-like-illness related ED visits in Omaha, Nebraska[J]. American Journal of Emergency Medicine, 2014, 32(9):1016-23.
[8] Abraham G, Byrnes G B, Bain C A. Short-term forecasting of emergency inpatient flow[J]. IEEE Transactions on Information Technology in Biomedicine: A Publication of the IEEE Engineering in Medicine & Biology Society, 2009, 13(3):380-388.
[9] Chikobvu D, Sigauke C. Regression-SARIMA modelling of daily peak electricity demand in South Africa[J]. Journal of Energy in Southern Africa, 2012, 23(3):23-30.
[10] Box G E P, Jenkins G M. Time Series Analysis: Forecasting and Control[J]. Journal of the American Statistical Association, 1970, 68(342):199-201.


