New in SAS 9.2
STATISTICAL PROCEDURES
SUN LI SENIOR STATISTICIAN CENTRE FOR ACADEMIC COMPUTING LSUN@SMU.EDU.SG
OUTLINE
Enhanced procedures to discuss:
PROC MEANS
PROC UNIVARIATE
PROC FREQ
SAS/STAT
SAS Stat Studio software
SAS Power and Sample Size (PSS application)
PROC GENMOD: Bayesian capability
PROC GLIMMIX: Generalized linear mixed model
USEFUL LINKS
BASE SAS
1. PROC MEANS
The PRT statistic is now an alias for the PROBT statistic.
The MODE statistic can now be used with PROC MEANS.
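A minimal sketch of both changes, assuming the sample data set sashelp.class that ships with SAS; the choice of the variable height is illustrative only and does not come from the slides:

```sas
/* sashelp.class is a SAS sample data set; height is an illustrative choice */
proc means data=sashelp.class n mean mode t prt;
   var height;   /* MODE is requested directly; PRT is the alias for PROBT */
run;
```

Here T and PRT report the Student's t statistic and its two-sided p-value for the null hypothesis that the mean is zero.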
2. PROC UNIVARIATE
CDFPLOT: plots the observed cumulative distribution function (cdf) of a variable and enables you to superimpose a fitted theoretical distribution on the graph.
PPPLOT: creates a P-P plot, which compares the empirical cdf (ecdf) of a variable with a specified theoretical cumulative distribution function.
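The two statements might be used as in the following sketch. It assumes the sample data set sashelp.class and a normal reference distribution; both are illustrative choices, not taken from the slides:

```sas
/* Illustrative only: data set, variable, and the normal reference are assumptions */
proc univariate data=sashelp.class;
   var height;
   cdfplot height / normal;   /* observed cdf with a fitted normal cdf superimposed */
   ppplot  height / normal;   /* ecdf compared with the normal cdf */
run;
```

When no parameters are given for the normal distribution, the procedure uses the sample estimates of the mean and standard deviation.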
3. PROC FREQ
Testing for specified proportions
Distribution plot and other plots
Binomial proportion tests and confidence intervals
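As a rough illustration of these features, the sketch below tests specified proportions, requests a frequency plot, and runs a binomial proportion test. The data set sashelp.class, the variable sex, and the null proportion 0.5 are assumptions made for the example:

```sas
/* Illustrative only: data set, variable, and null proportions are assumptions */
ods graphics on;
proc freq data=sashelp.class;
   /* chi-square test of the specified proportions, plus a frequency plot */
   tables sex / chisq testp=(0.5 0.5) plots=freqplot;
   /* binomial test and confidence limits for the first level of sex */
   tables sex / binomial(p=0.5);
run;
ods graphics off;
```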
SAS/STAT
New procedures: GLIMMIX, GLMSELECT, and QUANTREG
Bayesian capabilities are introduced in the procedures GENMOD, LIFEREG, and PHREG
Experimental procedures: HPMIXED, MCMC, SEQDESIGN, SEQTEST, TCALIS
1. SAS Stat Studio 3.1
New software for data exploration and analysis
Start > All Programs > Stat Studio 3.1
Demo basic steps with hands-on:
Graph & polynomial regression with data set hurricanes.sas7bdat
Transform data & model fitting with data set baseball.sas7bdat
SAS help and documentation: SAS products -> SAS Stat Studio 3.1 -> SAS Stat Studio 3.1 User's Guide
2. SAS Power and Sample Size 3.1
PSS application
Start > All Programs > SAS > SAS Power and Sample Size > SAS Power and Sample Size 3.1
Demo basic steps with hands-on: Power analysis for One Sample T Test
SAS help and documentation: SAS products -> SAS/STAT -> SAS/STAT User's Guide -> The Power and Sample Size Application
3. PROC GENMOD
Bayesian Analysis of a Linear Regression Model:
Here is a study of 54 patients undergoing a certain kind of liver operation in a surgical unit. The data set Surg contains survival time and certain covariates for each patient.
BAYES: requests a Bayesian analysis of the regression model via Gibbs sampling.
SEED= option: specifies the random number seed. It is used to maintain reproducibility.
OUTPOST= option: saves the posterior samples in the SAS data set PostSurg for further processing.
By default, a uniform prior distribution is assumed on the regression coefficients. A noninformative gamma prior is used for the scale parameter sigma.
ODS Graphics is enabled in the SAS statements to display the diagnostic plots.
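The statements described on this slide might be combined as in the following sketch. It assumes the data set Surg is already loaded. The variable names y, logx1, x2, x3, and x4 are hypothetical stand-ins for the survival time and covariates, which the slides do not list. In PROC GENMOD, the BAYES statement option that saves the posterior samples is OUTPOST=.

```sas
/* Sketch only: variable names are hypothetical, Surg must already exist */
ods graphics on;                       /* enables the diagnostic plots */
proc genmod data=Surg;
   model y = logx1 x2 x3 x4 / dist=normal;
   bayes seed=1 outpost=PostSurg;      /* Gibbs sampling, fixed seed, saved draws */
run;
ods graphics off;
```

No prior is specified, so the default priors described above apply.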
There is a 1.00 probability of a positive relationship between the logarithm of a blood clotting score and survival time, adjusted for the other covariates.
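A probability of this kind can be estimated as the share of posterior draws in which the coefficient is positive. The sketch below assumes that the posterior samples are stored in PostSurg and that the column holding the coefficient of the log blood clotting score is named logx1; that column name is hypothetical.

```sas
/* Sketch only: the column name logx1 is an assumption */
data Prob;
   set PostSurg;
   Indicator = (logx1 > 0);            /* 1 when the sampled coefficient is positive */
   label Indicator = 'log(Blood Clotting Score) coefficient > 0';
run;

proc means data=Prob(keep=Indicator) n mean;   /* the mean is the estimated probability */
run;
```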
4. PROC GLIMMIX
Comparison between GLIMMIX and other procedures
1) The response can have a nonnormal distribution (MIXED assumes a normally distributed response).
2) GLIMMIX incorporates random effects in the model and so allows for conditional and marginal inference (GENMOD allows only for marginal inference).
3) The class of generalized linear mixed models is a special case of the nonlinear mixed models; hence some of the models you can fit with NLMIXED can also be fit with GLIMMIX.
(The details can be found in the SAS help documentation.)
Logistic Regressions with Random Intercepts
Researchers investigated the performance of banks in a multicenter study. They randomly selected 15 centers. One of the study goals was to compare the occurrence of loan defaults for the banks. In each center two types of banks were selected and coded as "A" and "B". Under each type, a few banks were selected for the study.
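To model the probability of default in the two types of banks, A and B, you need to account for the fixed group effect and the random selection of centers. We assume a linear model that relates group and center effects to the logit of the probabilities:
\[ \log\frac{\pi_{iA}}{1-\pi_{iA}} = \beta_0 + \beta_A + \gamma_i, \qquad \log\frac{\pi_{iB}}{1-\pi_{iB}} = \beta_0 + \beta_B + \gamma_i, \qquad i = 1, \dots, 15 \]
Here $\pi_{iA}$ and $\pi_{iB}$ are the default probabilities for type A and type B banks in center $i$, and $\gamma_i$ is the random effect of center $i$.
$\beta_A - \beta_B$ measures the difference in the logits. $\beta_0$ is the overall intercept in the model.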
CLASS: specifies the classification variables.
MODEL: specifies the response variable as a sample proportion by using the events/n syntax. Because of the events/n syntax, the procedure uses the binomial distribution with the logit link by default.
SOLUTION: requests a list of solutions for the fixed-effects parameter estimates.
RANDOM: specifies that the linear predictor contains an intercept term that randomly varies at the level of the center effect. In other words, a random intercept is drawn separately and independently for each center in the study.
LSMEANS: requests the least squares means of the group effect on the logit scale.
ILINK: adds estimates, standard errors, and confidence limits on the mean (probability) scale.
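The statements described above might fit together as in the following sketch. The data set banks is simulated here only so that the example runs on its own. Its name, the variable names center, group, defaults, and n, and every number in the DATA step are assumptions, not the study data.

```sas
/* Simulated placeholder data: all names and values are assumptions */
data banks;
   call streaminit(2009);
   length group $1;
   do center = 1 to 15;
      u = rand('normal', 0, 0.5);              /* random center effect */
      do group = 'A', 'B';
         n   = 20;                             /* banks of this type in the center */
         eta = -1 + 0.5*(group = 'A') + u;     /* linear predictor on the logit scale */
         defaults = rand('binomial', 1/(1 + exp(-eta)), n);
         output;
      end;
   end;
   keep center group defaults n;
run;

proc glimmix data=banks;
   class center group;
   model defaults/n = group / solution;        /* events/n: binomial with logit link */
   random intercept / subject=center;          /* one random intercept per center */
   lsmeans group / ilink;                      /* also report on the probability scale */
run;
```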
The details can be found in the SAS help documentation for the individual procedures.
SAS/ETS
PROC COUNTREG:
Poisson regression
Negative binomial regression with linear (NEGBIN1) and quadratic (NEGBIN2) variance functions (Cameron and Trivedi 1986)
New: Zero-inflated Poisson (ZIP) model (Lambert 1992)
New: Zero-inflated negative binomial (ZINB) model
ZIP and ZINB Models for Data Exhibiting Extra Zeros
An often encountered characteristic of count data is that the number of zeros in the sample exceeds the number of zeros predicted by either the Poisson or negative binomial model.
Zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models explicitly model the production of zero counts to account for excess zeros.
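As a worked illustration of how a zero-inflated model produces extra zeros, the ZIP distribution in its usual form mixes a point mass at zero with a Poisson count. The symbols $\varphi$ (probability of a structural zero) and $\mu$ (Poisson mean) are notation chosen here, not taken from the slides.

```latex
\Pr(y = 0) = \varphi + (1-\varphi)\,e^{-\mu},
\qquad
\Pr(y = k) = (1-\varphi)\,\frac{e^{-\mu}\mu^{k}}{k!}, \quad k = 1, 2, \dots
```

With $\varphi = 0$ this reduces to the ordinary Poisson model; any $\varphi > 0$ raises the probability of a zero count above the Poisson value $e^{-\mu}$.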
This study examines how factors such as gender (fem), marital status (mar), number of young children (kid5), prestige of the graduate program (phd), and number of articles published by the scientist's mentor (ment) affect the number of articles (art) published by the scientist.
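A ZIP model for these data might be specified as in the sketch below. The data set name articles is hypothetical, and using the same covariates in the zero-inflation part is an illustrative choice rather than something stated on the slides.

```sas
/* Sketch only: the data set name articles is an assumption */
proc countreg data=articles;
   model art = fem mar kid5 phd ment / dist=zip;   /* count part of the ZIP model */
   zeromodel art ~ fem mar kid5 phd ment;          /* model for the excess zeros */
run;
```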
END!