SAS Introduction To Time Series Forecasting-Libre
If you use a computer in this laboratory, please start SAS from the Desktop or from Start/Programs.
You can also use the SAS software in the laboratory of our university's Computer Center, or on
the university server if you have permission.
You can get a temporary license for the SAS software by contacting our computer assistant.
debugging.
Submit the whole program, or just a few lines of it, to the SAS System.
SAS library: A folder in which SAS data sets are stored. You can create a new library with the
LIBNAME statement or with a shortcut.
DATA step: Works with a SAS data set, or converts raw data into a SAS data set that the
SAS System can recognize and that a PROC step can process.
=====================================
DATA dataset name;
INPUT variable<format>;
CARDS;
... data lines
;
RUN;
=====================================
The dataset name may contain letters (a, b, ...), digits, and the underscore (_), and must begin
with a letter or an underscore. It can be at most 32 characters long (8 in SAS releases before Version 7).
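As a minimal sketch of the template above, the following DATA step reads three data lines into a temporary data set named blood. The variable names (id, sex, age) and the values are made up purely for illustration and do not come from any real data.

```sas
/* Hypothetical example: variable names and values are illustrative only */
data blood;
    input id sex $ age;   /* the $ marks sex as a character variable */
    cards;
1 M 34
2 F 28
3 F 45
;
run;
```

The line containing only a semicolon ends the data lines; RUN; then ends the step.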
PROC step: Processes a SAS data set and outputs the results of the analysis.
=====================================
PROC procedure name DATA= dataset name;
RUN;
=====================================
The procedure name is the name of a SAS procedure, such as PRINT, PLOT, GPLOT, or
INSIGHT.
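For instance, a PROC step that lists a data set could look like the sketch below. It assumes the sample data set sashelp.class that ships with SAS; the VAR statement is optional and only restricts which variables are printed.

```sas
/* Print three variables of the sample data set sashelp.class */
proc print data=sashelp.class;
    var name age height;
run;
```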
Library name: Lib1; physical path: D:\example
Using shortcut.
For example, lib1.blood means that data set blood is saved in the library lib1.
The library_name can be sashelp, sasuser, maps, work, or lib1. The dataset_name is up to you,
for example blood.
When library_name is work, the data set work.dataset_name is a temporary SAS data set,
which is deleted automatically when you shut down the SAS software. In this case, the
work prefix can be omitted: you can use either blood or work.blood as the name of the data set.
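The difference between a permanent and a temporary data set can be sketched as follows. The example assumes that the folder D:\example already exists and uses the sample data set sashelp.class as input; the output names lib1.class and class are chosen only for illustration.

```sas
/* Assign the library reference lib1 to an existing folder */
libname lib1 'D:\example';

/* Permanent copy: stored as a file in D:\example */
data lib1.class;
    set sashelp.class;
run;

/* Temporary copy: class and work.class name the same data set,
   which is deleted when the SAS session ends */
data class;
    set sashelp.class;
run;
```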
When the data that you want to process are already stored in a SAS data set, use the SET statement:
DATA dataset name;
SET dataset name that you want to deal with;
RUN;
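A small sketch of the SET statement in use, again assuming the sample data set sashelp.class: the DATA step reads each observation, keeps only some of them, and adds a derived variable. The names teens and ratio are hypothetical.

```sas
/* Build a new temporary data set from an existing SAS data set */
data teens;
    set sashelp.class;          /* read observations from sashelp.class */
    where age >= 13;            /* keep only observations with age 13 or more */
    ratio = weight / height;    /* derived variable, for illustration */
run;
```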
It can be used to draw several types of graphs, such as line plots, scatter plots, rotating (3-dimensional) plots, and scatter plot matrices.
OUTLIER options;
FORECAST options;
RUN;
QUIT;
BY
A BY statement can be used in the ARIMA procedure to process a data set in groups of
observations defined by the BY variables. Note that all IDENTIFY, ESTIMATE, and FORECAST
statements specified are applied to all BY groups.
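To illustrate BY-group processing, the sketch below stacks two simulated AR(1) series in one data set and fits the same model to each group. The data set name two_series, the variable group, the AR coefficient 0.5, and the seed are all assumptions made for this example.

```sas
/* Hypothetical data: two independent AR(1) series, identified by group */
data two_series;
    do group = 1 to 2;
        z1 = 0;                          /* starting value for each series */
        do t = 1 to 100;
            z = 0.5*z1 + rannor(12345);  /* AR(1) recursion with N(0,1) noise */
            output;
            z1 = z;
        end;
    end;
    keep group t z;
run;

/* The same IDENTIFY and ESTIMATE statements are applied to every BY group */
proc arima data=two_series;
    by group;                            /* input must be sorted by group */
    identify var=z nlag=12;
    estimate p=1;
run;
quit;
```

The data set is created in group order, so no separate PROC SORT step is needed here.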
IDENTIFY
ALPHA= significance-level: The ALPHA= option specifies the significance level for tests in the
IDENTIFY statement. The default is 0.05.
ESACF: computes the extended sample autocorrelation function and uses these estimates to
tentatively identify the autoregressive and moving average orders of mixed models.
The ESACF option generates two tables. The first table displays extended sample
autocorrelation estimates, and the second table displays probability values that can be used to
test the significance of these estimates. The P= (pmin: pmax) and Q= (qmin: qmax) options
determine the size of the table.
NLAG= number: indicates the number of lags to consider in computing the autocorrelations and
cross-correlations.
STATIONARITY=(ADF= AR orders DLAG= s) or STATIONARITY=(DICKEY= AR orders DLAG= s):
performs augmented Dickey-Fuller tests. If the DLAG=s option is specified with s greater than
one, seasonal Dickey-Fuller tests are performed. The maximum allowable value of s is 12. The
default value of s is one.
VAR= variable ( d1, d2, ..., dk ) : names the variable containing the time series to analyze. The
VAR= option is required. A list of differencing lags can be placed in parentheses after the
variable name to request that the series be differenced at these lags. For example, VAR=X(1)
takes the first differences of X. VAR=X(1,1) requests that X be differenced twice, both times with
lag 1, producing a second difference series, which is $(X_t - X_{t-1}) - (X_{t-1} - X_{t-2}) = X_t - 2X_{t-1} + X_{t-2}$.
VAR=X(2) differences X once at lag two, giving $X_t - X_{t-2}$. If differencing is specified, it is the
differenced series that is processed by any subsequent ESTIMATE statement.
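The IDENTIFY options described above can be combined in one statement, as in the following sketch. It assumes the sample data set sashelp.air (variables date and air) that ships with SAS; the particular lag ranges and AR orders are arbitrary choices for illustration.

```sas
proc arima data=sashelp.air;
    /* VAR=air(1): analyze the first differences of air */
    identify var=air(1)
             nlag=24                       /* autocorrelations up to lag 24 */
             alpha=0.05                    /* significance level for the tests */
             stationarity=(adf=(0,1,2))    /* ADF tests with AR orders 0, 1, 2 */
             esacf p=(0:5) q=(0:5);        /* ESACF tables for p, q = 0..5 */
run;
quit;
```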
ESTIMATE
METHOD=ML | ULS | CLS: specifies the estimation method to use. METHOD=ML specifies the
maximum likelihood method. METHOD=ULS specifies the unconditional least-squares method.
METHOD=CLS specifies the conditional least-squares method. METHOD=CLS is the default.
P= order: specifies the autoregressive part of the model. By default, no autoregressive
parameters are fit. P=(l1, l2, ..., lk) defines a model with autoregressive parameters at the
specified lags. P= order is equivalent to P=(1, 2, ..., order). A concatenation of parenthesized lists
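specifies a factored model. For example, P=(1,2,5)(6,12) specifies the autoregressive model
$(1 - \phi_{1,1}B - \phi_{1,2}B^2 - \phi_{1,3}B^5)(1 - \phi_{2,1}B^6 - \phi_{2,2}B^{12})$, where $B$ is the backshift operator.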
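The difference between a lag list and a factored model can be sketched with two ESTIMATE statements. The example assumes the sample data set sashelp.air; the chosen lags (1 and 12) are only illustrative and are not a recommended model for that series.

```sas
proc arima data=sashelp.air;
    identify var=air(1,12) nlag=24 noprint;

    /* One AR polynomial with parameters at lags 1 and 12 */
    estimate p=(1,12) method=ml;

    /* Factored AR model: (1 - phi1*B)(1 - phi12*B**12) */
    estimate p=(1)(12) method=ml;
run;
quit;
```

Each ESTIMATE statement fits its model to the series named in the most recent IDENTIFY statement, so the two fits can be compared in the output.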
FORECAST
ALPHA= n: sets the size of the forecast confidence limits. The ALPHA= value must be between 0
and 1. When you specify ALPHA=α, the upper and lower confidence limits have a 1-α confidence
level. The default is ALPHA=.05, which produces 95% confidence intervals. ALPHA values are
rounded to the nearest hundredth.
ID= variable: names a variable in the input data set that identifies the time periods associated
with the observations.
OUT= SAS-data-set: writes the forecast (and other values) to an output data set.
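A FORECAST statement using these options might look like the sketch below. It assumes the sample data set sashelp.air and a seasonal moving-average model chosen only for illustration; LEAD= and INTERVAL= are further FORECAST options that are not described above, and the output name work.air_fc is hypothetical.

```sas
proc arima data=sashelp.air;
    identify var=air(1,12) noprint;
    estimate q=(1)(12) method=ml noprint;

    /* 12 monthly forecasts with 90% confidence limits, saved to work.air_fc */
    forecast lead=12 alpha=0.1 id=date interval=month out=work.air_fc;
run;
quit;

/* Inspect the first few observations of the output data set */
proc print data=work.air_fc(obs=5);
run;
```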
Simulate an MA(2):
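/* Simulate an MA(2) process: z(t) = a(t) + 0.2*a(t-1) + 0.5*a(t-2) */
data ts.ma;
    a1 = 0; a2 = 0;                /* starting values for the lagged shocks */
    do t = -50 to 200;
        a = rannor(32565);         /* standard normal white noise */
        z = a + a1*0.2 + a2*0.5;
        if t > 0 then output;      /* discard the burn-in period (t <= 0) */
        a2 = a1; a1 = a;           /* shift the lagged shocks */
    end;
    keep z t;
run;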
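One way to check the simulated series is to fit an MA(2) model to it, as in this sketch. It assumes that the library ts has been assigned and that the data set ts.ma was created by the program above. PROC ARIMA writes the moving-average polynomial as $1 - \theta_1 B - \theta_2 B^2$, so for the process $z_t = a_t + 0.2a_{t-1} + 0.5a_{t-2}$ the estimates of MA1,1 and MA1,2 should come out roughly near -0.2 and -0.5, up to sampling error.

```sas
proc arima data=ts.ma;
    identify var=z nlag=12;     /* sample ACF should die out after lag 2 */
    estimate q=2 method=ml;     /* fit an MA(2) model */
run;
quit;
```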
Simulate an ARMA(1,1):
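/* Simulate an ARMA(1,1) process: z(t) = 0.5*z(t-1) + a(t) + 0.3*a(t-1) */
data ts.arma;
    z1 = 0; a1 = 0;                /* starting values for the lagged series and shock */
    do t = -50 to 200;
        a = rannor(32565);         /* standard normal white noise */
        z = z1*0.5 + a + a1*0.3;
        if t > 0 then output;      /* discard the burn-in period (t <= 0) */
        a1 = a; z1 = z;            /* update the lagged values */
    end;
    keep z t;
run;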
The result:
[Figure: Simulated AR(2) Time Series]
[Output annotations: different values of k; 3 different tests; 3 deterministic trends]
[Output annotations: estimated parameters; mean; intercept; variance of the white noise; standard deviation of the white noise; P values of significance]
A normality test:
Sample autocorrelation function (ACF) of the residuals and Sample partial ACF of the residuals:
Ljung-Box test:
[Output annotations: test statistic; degrees of freedom; P-values]
The results:
The airline passenger data record the number of passengers traveling by air each month from
January 1949 to December 1960.
The series is given as Series G in Box and Jenkins (1976) and has been used in the time series
analysis literature as a standard example of a non-stationary seasonal time series.
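A possible starting point for analyzing this series is sketched below. It assumes the copy of the data shipped with SAS as sashelp.air (variables date and air). The log transformation is an assumption commonly made for this series, not something stated above, and the names air_log and logair are hypothetical.

```sas
/* Log-transform the series (an assumption; omit this step to use raw counts) */
data air_log;
    set sashelp.air;
    logair = log(air);
run;

/* Difference at lag 1 and at the seasonal lag 12, then inspect the ACF */
proc arima data=air_log;
    identify var=logair(1,12) nlag=36;
run;
quit;
```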
The sample ACF of the series after both regular (lag-1) differencing and seasonal differencing:
Ljung-Box test:
[Output annotations: compare the estimated coefficients; compare the model criteria]
Normality tests:
The result: