SAS Aid
Contents
• Explorer – Navigate and check libraries and datasets
SAS Language: DATA Step and PROC Step
• The LIBNAME statement creates a permanent library at a user-defined location
• A permanent library stores datasets permanently, i.e., they are not deleted when the SAS session is closed
• Syntax –
libname Win_R "D:\C_Windows Research Projects\Project_1";
• A valid library name must start with a letter or an underscore, may contain only letters, digits, and underscores, and cannot be longer than 8 characters
• When the SAS session is closed, the library reference is deleted, but the SAS datasets in that location are retained
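The difference between a permanent and a temporary library can be sketched as follows. This is a minimal illustration, not part of the original deck: the data set names demo and demo_temp are hypothetical, and the folder in the LIBNAME statement is assumed to exist already.

```sas
/* Assign the libref; the folder must already exist (path taken from the syntax above). */
libname Win_R "D:\C_Windows Research Projects\Project_1";

/* Hypothetical data set name: a two-level name writes to the permanent library. */
data Win_R.demo;
    x = 1;
run;

/* A one-level name goes to the temporary WORK library and is deleted at session end. */
data demo_temp;
    set Win_R.demo;
run;

/* Clearing the libref removes the reference only; the data set file stays on disk. */
libname Win_R clear;
```

Re-running the LIBNAME statement in a later session makes Win_R.demo available again.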
In the syntax, brackets denote optional specifications, and vertical bars denote a choice of one of the specifications separated by the vertical bars.
TEST requests asymptotic tests for measures of association and agreement.
WEIGHT identifies a variable with values that weight each observation.
Sample Code
The asterisk between TPV and Segment tells PROC FREQ that you want a two-way table with TPV
forming the rows of the table and Segment forming the columns.
tables (A B) * (C D);
[This request generates four tables: A by C, A by D, B by C, and B by D]
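A two-way table request of the kind described above might look like the sketch below. The data set work.freq_demo and its values are made up for illustration; only the TABLES syntax follows the text.

```sas
/* Hypothetical toy data: a TPV band and a customer segment for six respondents. */
data work.freq_demo;
    input TPV $ Segment $;
    datalines;
High A
High B
Low A
Low A
High A
Low B
;
run;

/* Two-way table: TPV forms the rows, Segment forms the columns. */
proc freq data=work.freq_demo;
    tables TPV * Segment;
run;
```

Replacing the TABLES statement with a grouped request such as tables (A B) * (C D); crosses every variable in the first group with every variable in the second.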
Standardization: rescales variables to a common scale (for example, mean 0 and standard deviation 1)
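One common way to standardize is PROC STANDARD. The sketch below assumes a mean-0, standard-deviation-1 rescaling is wanted; the data set names work.raw and work.raw_std and the values are hypothetical.

```sas
/* Hypothetical toy data with two numeric variables on different scales. */
data work.raw;
    input Sales Price;
    datalines;
120 10
150 12
90 8
200 15
;
run;

/* Rescale each VAR variable to mean 0 and standard deviation 1. */
proc standard data=work.raw mean=0 std=1 out=work.raw_std;
    var Sales Price;
run;

/* Check the result: means should be 0 and standard deviations 1. */
proc means data=work.raw_std mean std;
    var Sales Price;
run;
```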
Sample Code
PROC REG data = itp.It_pro_wv17;
Model Sales = Price GDP IIP ATL / details VIF;
ods output parameterestimates=est;
RUN;
QUIT;
Further Reading : http://www.sfu.ca/sasdoc/sashtml/stat/chap55/index.htm
Stepwise Regression
Syntax notations are similar to PROC REG, with a few changes as given in the example below.
Sample Code
PROC REG data=tpv.bam_jan10_v12_v2;
Model TPV= Q74_1_5 Q74_1_6 Q74_1_21 Q74_1_15 Q74_1_3 Q74_1_20 Q74_1_7 Q21 Q29_3
/ details stb vif
selection=stepwise slentry=0.05 slstay=0.05;
RUN;
Where
• Stepwise regression analysis is requested by specifying the SELECTION=STEPWISE option in the MODEL statement
• The option SLENTRY=0.05 specifies that a variable has to be significant at the 0.05 level before it can be entered into the model
• The option SLSTAY=0.05 specifies that a variable in the model has to be significant at the 0.05 level for it to remain in the model
• The DETAILS option requests detailed results for the variable selection process
In the syntax, brackets denote optional specifications, and vertical bars denote a choice of one of the specifications separated by the vertical bars.
OUTROC names the output data set that holds the data for the ROC curve.
OUTPUT creates an output data set and names the variables to contain predicted values, residuals, and other diagnostic statistics.
Sample Code
PROC LOGISTIC DATA=itp.It_pro_wv17_recommend descending;
model Dependent Variable = Independent variables
/ selection=stepwise stb slentry=0.055 slstay=0.055 lackfit outroc=roc;
ods output ParameterEstimates=pest;
output out=test p=predval;
RUN;
Further Reading : http://www.sfu.ca/sasdoc/sashtml/stat/chap39/index.htm
Factor Analysis
Let's understand PROC FACTOR directly with an example:
PROC FACTOR data= tpv.Psatrefresh_us
METHOD=Principal PRIORS=ONE ROTATE = Varimax NFACTORS=5
OUT=tpv.Psatrefresh_us_1
SCREE MSA CORR SCORE RES REORDER
MINEIGEN=1;
VAR Q21_1 Q21_2 Q21_3 Q21_4 Q21_5;
RUN;
Option            Description
METHOD=name       specifies the method for extracting factors
PRIORS=name       specifies a method for computing prior communality estimates; PRIORS=ONE sets all prior communalities to 1.0
ROTATE=name       specifies the rotation method; ROTATE=VARIMAX specifies orthogonal varimax rotation
NFACTORS=n        specifies the maximum number of factors to be extracted
OUT=SAS-data-set  creates a data set containing all the data from the DATA= data set plus variables Factor1, Factor2, … containing factor scores
SCREE             displays a scree plot of the eigenvalues
MSA               produces the partial correlations between each pair of variables, controlling for all other variables
CORR              displays the correlation matrix or partial correlation matrix
SCORE             displays the factor scoring coefficients
RES               displays the residual correlation matrix and the associated partial correlation matrix
REORDER           causes the rows (variables) of various factor matrices to be reordered on the output. Variables with their highest absolute loading on the first factor are displayed first, from largest to smallest loading, followed similarly by those for the second factor, and so on
MINEIGEN=p        specifies the smallest eigenvalue for which a factor is retained. If you specify two or more of the MINEIGEN=, NFACTORS=, and PROPORTION= options, the number of factors retained is the minimum number satisfying any of the criteria
Option                Description
METHOD=name           determines the clustering method used by the procedure
WARD                  one of the many clustering methods available (METHOD=WARD); requests Ward's minimum-variance method (error sum of squares, trace W). Distance data are squared unless you specify the NOSQUARE option
OUTTREE=SAS-data-set  creates an output data set that can be used by the TREE procedure to draw a tree diagram. You must give the data set a two-level name to save it
PSEUDO                displays pseudo F and t² statistics. This option is effective only when the data are coordinates or when METHOD=AVERAGE, METHOD=CENTROID, or METHOD=WARD is specified
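The options in the table above could be combined as in the following sketch. It is an illustration under stated assumptions rather than the deck's own example: the data set work.points, its variables, and the output name work.tree_out are all hypothetical.

```sas
/* Hypothetical toy data: two coordinates for six observations. */
data work.points;
    input id $ x y;
    datalines;
a 1.0 1.1
b 1.2 0.9
c 5.0 5.2
d 5.1 4.8
e 9.0 1.0
f 9.2 1.3
;
run;

/* Ward's method with pseudo F and t-squared statistics.
   The cluster history is written to work.tree_out for later use by PROC TREE. */
proc cluster data=work.points method=ward outtree=work.tree_out pseudo;
    var x y;
    id id;
run;
```

Because work.tree_out lives in the WORK library, it disappears at the end of the session; a permanent libref in place of work would keep it.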
To get a dendrogram, use PROC TREE; sample code is given below