Professional Documents
Culture Documents
SAS1
SAS1
Linear Regression is used to identify the relationship between a dependent variable and one or more
independent variables. A model of the relationship is proposed, and estimates of the parameter
values are used to develop an estimated regression equation.
Various tests are then used to determine if the model is satisfactory. If it is then, the estimated
regression equation can be used to predict the value of the dependent variable given values for the
independent variables. In SAS the procedure PROC REG is used to find the linear regression model
between two variables.
Syntax
variable_1 and variable_2 are the variable names of the dataset used in finding the
correlation.
Example
The below example shows the process to find the correlation between the two variables horsepower
and weight of a car by using PROC REG. In the result we see the intercept values which can be used
to form the regression equation.
PROC SQL;
FROM
SASHELP.CARS
RUN;
run;
Syntax
VAR variable;
CLASS Variable;
Variable_1 and Variable_2 are the variable names of the dataset used in t test.
Example
Below we see one sample t test in which find the t test estimation for the variable horsepower with
95 percent confidence limits.
PROC SQL;
FROM
SASHELP.CARS
RUN;
var horsepower;
run;
The paired T Test is carried out to test if two dependent variables are statistically different from each
other or not.
Example
As length and weight of a car will be dependent on each other we apply the paired T test as shown
below.
paired weight*length;
run;
This t-test is designed to compare means of same variable between two groups.
Example
In our case we compare the mean of the variable horsepower between the two different makes of
the cars("Audi" and "BMW").
proc ttest data = cars1 sides = 2 alpha = 0.05 h0 = 0;
class make;
var horsepower;
run;
ANOVA stands for Analysis of Variance. In SAS it is done using PROC ANOVA. It performs analysis of
data from a wide variety of experimental designs. In this process, a continuous response variable,
known as a dependent variable, is measured under experimental conditions identified by
classification variables, known as independent variables. The variation in the response is assumed to
be due to effects in the classification, with random error accounting for the remaining variation.
Syntax
CLASS Variable;
MEANS ;
MODEL defines the model to be fit using certain variables from the dataset.
Variable_1 and Variable_2 are the variable names of the dataset used in analysis.
Applying ANOVA
Example
Lets consider the dataset SASHELP.CARS. Here we study the dependence between the variables car
type and their horsepower. As the car type is a variable with categorical values, we take it as class
variable and use both these variables in the MODEL.
CLASS type;
RUN;