Complete Sas
SAS supports data entry, retrieval, and management; report writing and graphics; statistical and mathematical analysis; business planning, forecasting, and decision support; operations research and project management; quality improvement; and applications development.
SAS Training
Log: where SAS displays messages, including any errors that may be in a program.
Log Window
Output Window
SAS expects your data to be in a special form, called a SAS data set. A SAS data set is a table of variables and observations: the rows are the observations and the columns are the variables.
[Figure: a sample SAS data set with the variables ID, NAME, HT, and WT as columns and one observation per row; a missing WT value is shown as a period.]
ID, HT, and WT are numeric variables; NAME is a character variable. A missing character value is represented by a blank. A missing numeric value is represented by a period (.).
Data Step
SAS statements that read data, create new data sets or variables, modify data sets, and perform calculations.
Procedures
SAS statements that can perform statistical analyses, create & print reports & graphs
Every SAS program is constructed using the DATA step and/or procedures, e.g.

DATA distance;                  /* DATA step */
   miles = 23;
   kilometer = 1.61 * miles;
RUN;
PROC PRINT DATA = distance;     /* PROC step */
RUN;
Any combination of DATA steps and procedures may be used. End each step with a RUN statement throughout the program.
DATA step                    PROC step
Begins with DATA             Begins with PROC
Reads and modifies data      Performs a specific analysis or function
Creates a SAS data set       Produces results or a report

Note: The table is not meant to imply that PROC steps can never create SAS data sets (some do), or that DATA steps can never create reports (they can). But it is much easier to write SAS programs if one understands the basic functions of DATA and PROC steps.
DATA steps execute line by line and observation by observation. SAS takes the first observation and runs it all the way through the DATA step (line by line) before looping back to pick up the second observation. SAS sees one observation at a time.
[Figure: the DATA step (Line 1 to Line 5) runs once per observation, writing Observation 1, Observation 2, and Observation 3 to the output data set.]
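The observation-by-observation loop can be watched in the log. The following is only a minimal sketch: the data set name DEMO and the variables X and Y are made up for illustration. The automatic variable _N_ counts DATA step iterations.

```sas
/* Hypothetical data set DEMO with illustrative variables X and Y */
data demo;
   input x;
   y = x * 2;
   /* PUT writes to the log once per iteration, i.e. once per observation */
   put _n_= x= y=;
   datalines;
1
2
3
;
run;
```

The log should show one line per observation, each written after all statements of the step have run for that observation.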
Every SAS statement ends with a semicolon, e.g. DATA test;
Names must be 8 characters or fewer in length under the SAS Version 6 rules followed here, e.g. distance is valid and distance_a is invalid. (Later SAS releases allow variable and data set names of up to 32 characters.)
Names must start with a letter or an underscore (_), e.g. distance and _abc are valid.
Names can contain only letters, numerals, and the underscore (_). No %$!*&#@ please.
INPUT
Describes the arrangement of values in the input data record and assigns input values to the corresponding SAS variables
INPUT <specification(s)><@|@@>;
INPUT (SPECIFICATIONS)
variable : names a variable that is assigned input values.
$ : indicates that the variable value is stored as a character value rather than as a numeric value.
pointer-control : moves the input pointer to a specified line or column in the input buffer.
informat. : specifies an informat to use to read the variable value.
INPUT (DATALINES)
Data newdata;
   Input name $ age;
   DATALINES;
Mary 24
Suzan 32
;
Run;
INPUT TYPES
INPUT, Column : Reads input values from specified columns and assigns them to the corresponding SAS variables.
Syntax: INPUT variable <$> start-column <-end-column> <.decimals> <@ | @@>;
Example
data scores;
   input name $ 1-18 score1 25-27 score2 30-32 score3 35-37;
   datalines;
Joseph                  11   32   76
Mitchel                 13   29   82
Sue Ellen               14   27   74
;
Run;
INPUT TYPES
INPUT, Formatted : Reads input values with specified informats and assigns them to the corresponding SAS variables.
Syntax:
INPUT <pointer-control> variable informat. <@ | @@>;
Example
data sales;
   infile file-specification;
   input item $10. +5 jan comma5. +5 feb comma5. +5 mar comma5.;
run;

It can read these input data records:

----+----1----+----2----+----3----+----4
trucks         1,382     2,789     3,556
vans           1,265     2,543     3,987
sedans         2,391     3,011     3,658
INPUT TYPES
INPUT, List : Scans the input data record for input values and assigns them to the corresponding SAS variables.
Syntax: INPUT <pointer-control> variable <$> <&> <@ | @@>;
Example
data scores;
   input name $ score1 score2 score3 team $;
   datalines;
Joe 11 32 76 red
Mitchel 13 29 82 blue
Susan 14 27 74 green
;
Merge Statement
The MERGE statement is flexible and has a variety of uses in SAS programming:
One-to-one merge
Match merge
One-to-one matching
To combine variables from several data sets where there is a one-to-one correspondence between the observations in each of the data sets, list the data sets to be joined in a MERGE statement.
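A one-to-one merge might look like the sketch below. The data sets NAMES, AGES, and COMBINED and their variables are hypothetical, chosen only to illustrate the idea.

```sas
/* Two hypothetical data sets with the same number of observations */
data names;
   input name $;
   datalines;
Mary
Suzan
;
run;

data ages;
   input age;
   datalines;
24
32
;
run;

/* No BY statement: observation 1 is joined to observation 1, and so on */
data combined;
   merge names ages;
run;

proc print data=combined;
run;
```

Because no BY statement is used, the pairing depends entirely on the order of the observations in each data set.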
Match Merge
When there is not an exact one-to-one correspondence between data sets to be merged, the variables to use to identify matching observations can be specified in a BY statement. The data sets being merged must be sorted by the variables specified in the BY statement.
* If statement keeps only those records that are in both data sets;
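A match merge that keeps only records present in both data sets could be sketched as follows. The data sets STAFF, PAYROLL, and BOTH, the key ID, and the flag names IN_STAFF and IN_PAY are assumptions made for this illustration.

```sas
/* Hypothetical input data sets sharing the key variable ID */
data staff;
   input id name $;
   datalines;
1 Mary
2 Suzan
3 Joe
;
run;

data payroll;
   input id salary;
   datalines;
2 32000
3 28000
4 41000
;
run;

/* Both inputs must be sorted by the BY variable */
proc sort data=staff;
   by id;
run;
proc sort data=payroll;
   by id;
run;

data both;
   /* IN= creates temporary flags showing which data set contributed */
   merge staff(in=in_staff) payroll(in=in_pay);
   by id;
   /* Subsetting IF keeps only those records that are in both data sets */
   if in_staff and in_pay;
run;
```

With this sample data only the observations with ID 2 and 3 are written to BOTH.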
SAS Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value. Most functions use arguments supplied by the user, but a few obtain their arguments from the operating environment. In base SAS software, you can use SAS functions in DATA step programming statements, in a WHERE expression, in macro language statements, in PROC REPORT, and in Structured Query Language (SQL).
Syntax of Functions
function-name (argument-1<, ...argument-n>)
function-name (OF variable-list)
where function-name names the function, and argument can be a variable name, constant, or any SAS expression, including another function. The number and kind of arguments allowed are described with individual functions. Multiple arguments are separated by commas.
Numeric Functions
Details : The ROUND function returns a value rounded to the nearest round-off unit. If round-off-unit is not provided, a default value of 1 is used and argument is rounded to the nearest integer.
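As a small sketch of the round-off unit, the following step writes three rounded values to the log. The variable names A, B, and C are illustrative only.

```sas
data _null_;
   a = round(223.456);        /* no round-off unit: nearest integer, 223 */
   b = round(223.456, 0.01);  /* nearest hundredth, 223.46 */
   c = round(223.456, 100);   /* nearest hundred, 200 */
   put a= b= c=;
run;
```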
MEAN(argument,argument, . . .)
Arguments: argument is numeric. At least one argument is required. The argument list may consist of a variable list, which is preceded by OF.
Examples:
SAS Statement            Result
x1=mean(2,.,.,6);        4
x2=mean(1,2,3,2);        2
x3=mean(of x1-x2);       3
Character Functions
UPCASE(argument)
Arguments: argument specifies any SAS character expression.
Details: The UPCASE function copies a character argument, converts all lowercase letters to uppercase letters, and returns the altered value as a result.
LOWCASE(argument)
Arguments: argument specifies any SAS character expression.
Details: The LOWCASE function copies a character argument, converts all uppercase letters to lowercase letters, and returns the altered value as a result.
TRIM: Removes trailing blanks from character expressions and returns one blank if the expression is missing.
Syntax: TRIM(argument)
Arguments: argument specifies any SAS character expression.
Details: TRIM copies a character argument, removes all trailing blanks, and returns the trimmed argument as a result. If the argument is blank, TRIM returns one blank. TRIM is useful for concatenating because concatenation does not remove trailing blanks. Assigning the results of TRIM to a variable does not affect the length of the receiving variable. If the trimmed value is shorter than the length of the receiving variable, SAS pads the value with new blanks as it assigns it to the variable.
LENGTH(argument) Arguments argument specifies any SAS expression. Details : The LENGTH function returns an integer that represents the position of the rightmost nonblank character in the argument. If the value of the argument is missing, LENGTH returns a value of 1. If the argument is a numeric variable (either initialized or uninitialized), LENGTH returns a value of 12 and prints a note in the SAS log that the numeric values have been converted to character values.
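The character functions described above can be tried together in one step. This is a sketch only; the variable names and the sample value 'Smith' are invented for illustration.

```sas
data _null_;
   length name $ 10;               /* NAME is stored padded to 10 characters */
   name = 'Smith';
   upper  = upcase(name);          /* SMITH */
   lower  = lowcase(name);         /* smith */
   len    = length(name);          /* 5: trailing blanks are not counted */
   joined = trim(name) || ', J.';  /* Smith, J. */
   padded = name || ', J.';        /* trailing blanks of NAME are kept */
   put upper= lower= len= joined= padded=;
run;
```

Comparing JOINED with PADDED in the log shows why TRIM is useful before concatenation.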
SAS Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon. A SAS statement either requests SAS to perform an operation or gives information to the system. There are two kinds of SAS statements:
those used in DATA step programming
those that are global in scope and can be used anywhere in a SAS program.
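Syntax DROP variable-list;
Arguments variable-list specifies the names of the variables to omit from the output data set.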
Details : The DROP statement applies to all the SAS data sets that are created within the same DATA step and can appear anywhere in the step. The variables in the DROP statement are available for processing in the DATA step. If no DROP or KEEP statement appears, all data sets that are created in the DATA step contain all variables. Do not use both DROP and KEEP statements within the same DATA step.
These examples show the correct syntax for listing variables with the DROP statement:
drop time shift batchnum;
drop grade1-grade20;
In this example, the variables PURCHASE and REPAIR are used in processing but are not written to the output data set INVENTRY:
data inventry;
   drop purchase repair;
   infile file-specification;
   input unit part purchase repair;
   totcost=sum(purchase,repair);
run;
These examples show the correct syntax for listing variables in the KEEP statement:
keep name address city state zip phone;
keep rep1-rep5;
This example uses the KEEP statement to include only the variables NAME and AVG in the output data set. The variables SCORE1 through SCORE20, from which AVG is calculated, are not written to the data set AVERAGE.
data average;
   keep name avg;
   infile file-specification;
   input name $ score1-score20;
   avg=mean(of score1-score20);
run;
Specifying Labels
Here are several LABEL statements:
label compound='Type of Drug';
label date="Today's Date ";
label n='Mark''s Experiment Number';
label score1="Grade on April 1 Test"
      score2="Grade on May 1 Test";
Removing a Label
This example removes an existing label:
data rtest;
   set rtest;
   label x=' ';
run;
RENAME : Specifies new names for variables in output SAS data sets
Syntax: RENAME old-name-1=new-name-1 . . . <old-name-n=new-name-n>;
Arguments:
old-name specifies the name of a variable or variable list as it appears in the input data set, or in the current DATA step for newly created variables.
new-name specifies the name or list to use in the output data set.
Details: The RENAME statement allows you to change the names of one or more variables, variables in a list, or a combination of variables and variable lists. The new variable names are written to the output data set only. Use the old variable names in programming statements for the current DATA step. RENAME applies to all output data sets.
These examples show the correct syntax for renaming variables using the RENAME statement:
rename street=address;
rename time1=temp1 time2=temp2 time3=temp3;
rename name=Firstname score1-score3=Newscore1-Newscore3;
This example uses the old name of the variable in program statements. The variable Olddept is named Newdept in the output data set, and the variable Oldaccount is named Newaccount.
rename Olddept=Newdept Oldaccount=Newaccount;
if Oldaccount>5000;
keep Olddept Oldaccount items volume;
WHERE : Selects observations from SAS data sets that meet a particular condition
Syntax: WHERE where-expression-1 < logical-operator where-expression-n>;
Arguments:
where-expression is an arithmetic or logical expression that generally consists of a sequence of operands and operators.
logical-operator can be AND, AND NOT, OR, or OR NOT.
Details: Using the WHERE statement may improve the efficiency of your SAS programs because SAS is not required to read all observations from the input data set. The WHERE statement cannot be executed conditionally; that is, you cannot use it as part of an IF-THEN statement. WHERE statements can contain multiple WHERE expressions that are joined by logical operators.
Basic WHERE Statement Usage
This DATA step produces a SAS data set that contains only observations from data set CUSTOMER in which the value for NAME begins with Mac and the value for CITY is Charleston or Atlanta:
data testmacs;
   set customer;
   where substr(name,1,3)='Mac' and (city='Charleston' or city='Atlanta');
run;
Using Operators Available Only in the WHERE Statement
Using BETWEEN-AND:
where empnum between 500 and 1000;
Using CONTAINS:
where company ? 'bay';
where company contains 'bay';
IF-THEN/ELSE : Executes a SAS statement for observations that meet specific conditions
Syntax: IF expression THEN statement; <ELSE statement;>
Arguments: expression is any SAS expression and is a required argument. statement can be any executable SAS statement or DO group.
if status='OK' and type=3 then count+1;
if age ne agecheck then delete;
if x=0 then
   if y ne 0 then put 'X ZERO, Y NONZERO';
   else put 'X ZERO, Y ZERO';
else put 'X NONZERO';
Arguments DESCENDING indicates that the data sets are sorted in descending order by the variable that is specified. DESCENDING means largest to smallest numerically, or reverse alphabetical for character variables.
Observations are in ascending order of SALESREP and, within each SALESREP value, in descending order of the values of JANSALES:
by salesrep descending jansales;
BY-Processing with Nonsorted Data
Observations are ordered by the name of the month in which the expenses were accrued:
by month notsorted;
RETAIN : Causes a variable that is created by an INPUT or assignment statement to retain its value from one iteration of the DATA step to the next
Syntax RETAIN <element-list(s) <initial-value(s) | (initial-value-1) | (initial-value-list-1) > < . . . element-list-n <initial-value-n | (initial-value-n ) | (initial-value-list-n)>>>;
Without Arguments If you do not specify an argument, the RETAIN statement causes the values of all variables that are created with INPUT or assignment statements to be retained from one iteration of the DATA step to the next.
RETAIN Statement Examples:
This RETAIN statement retains the values of variables MONTH1 through MONTH5 from one iteration of the DATA step to the next:
retain month1-month5;
This RETAIN statement retains the values of nine variables and sets their initial values:
retain month1-month5 1 year 0 a b c 'XYZ';
The values of MONTH1 through MONTH5 are set initially to 1; YEAR is set to 0; variables A, B, and C are each set to the character value XYZ.
This RETAIN statement assigns the initial value 1 to the variable MONTH1 only:
retain month1-month5 (1);
Variables MONTH2 through MONTH5 are set to missing initially.
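A common use of RETAIN is a running total. The step below is a minimal sketch; the data set RUNNING and the variables AMOUNT and TOTAL are hypothetical.

```sas
data running;
   /* TOTAL starts at 0 and is not reset to missing at each iteration */
   retain total 0;
   input amount;
   total = total + amount;
   datalines;
10
20
30
;
run;

proc print data=running;
run;
```

With these three records TOTAL takes the values 10, 30, and 60. Without the RETAIN statement, TOTAL would be reset to missing at the start of each iteration and the sum would be missing.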
Syntax: OUTPUT <data-set-name(s)>;
Without Arguments: Using OUTPUT without arguments causes the current observation to be written to all data sets that are named in the DATA statement.
Arguments: data-set-name specifies the name of a data set to which SAS writes the observation.
Details: The OUTPUT statement tells SAS to write the current observation to a SAS data set immediately, not at the end of the DATA step. If no data set name is specified in the OUTPUT statement, the observation is written to the data set or data sets that are listed in the DATA statement.
/* writes the current observation */
/* to a SAS data set */
output;
/* writes the current observation */
/* when a condition is true */
/* writes an observation to data */
/* set MARKUP when the PHONE */
/* value is missing */
if phone=. then output markup;
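Naming a data set in the OUTPUT statement makes it possible to split one input into several data sets. The following is a sketch under assumed names: the data sets ADULTS and MINORS and the age cutoff of 18 are illustrative only.

```sas
/* Both output data sets must be named in the DATA statement */
data adults minors;
   input name $ age;
   if age >= 18 then output adults;
   else output minors;
   datalines;
Mary 24
Tom 12
Suzan 32
;
run;
```

Here ADULTS receives two observations and MINORS receives one.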
PUT : Writes lines to the SAS log, to the SAS procedure output file, or to an external file
Without Arguments: The PUT statement without arguments is called a null PUT statement. The null PUT statement
writes the current output line to the current file, even if the current output line is blank
releases an output line that is being held by a previous PUT statement with a trailing @.
Global Statements
LIBNAME : Associates or disassociates a SAS data library with a libref (a shortcut name); clears one or all librefs; lists the characteristics of a SAS data library; concatenates SAS data libraries; implicitly concatenates SAS catalogs.
Syntax 1. LIBNAME libref <engine> 'SAS-data-library' < options > <engine/host-options>;
LIBNAME Statement Examples:
Assigning and Using a Libref
This example assigns the libref SALES to an aggregate storage location that is specified in quotation marks as a physical pathname. The DATA step creates SALES.QUARTER1 and stores it in that location. The PROC PRINT step references it by its two-level name, SALES.QUARTER1.
libname sales 'SAS-data-library';
data sales.quarter1;
   infile 'your-input-file';
   input salesrep $20. +6 jansales febsales marsales;
run;
proc print data=sales.quarter1;
run;
Syntax: TITLE <n> <'text' | "text">;
Without Arguments: Using TITLE without arguments cancels all existing titles.
Arguments:
n specifies the relative line that contains the title line.
'text' | "text" specifies text that is enclosed in single or double quotation marks.
Details: A TITLE statement takes effect when the step or RUN group with which it is associated executes. Once you specify a title for a line, it is used for all subsequent output until you cancel the title or define another title for that line. A TITLE statement for a given line cancels the previous TITLE statement for that line and for all lines with larger n numbers.
TITLE Statement Examples:
This statement suppresses a title on line n and all lines after it:
titlen;
title 'First Draft';
title2 "Year's End Report";
title2 'Year''s End Report';
title 'Quarterly Sales for #byval(site)';
title 'Annual Costs for #byvar2';
title 'Data Group #byline';
FOOTNOTE : Prints up to ten lines of text at the bottom of the procedure or DATA step output
Syntax: FOOTNOTE<n> <'text' | "text">;
Without Arguments: Using FOOTNOTE without arguments cancels all existing footnotes.
Arguments:
n specifies the relative line to be occupied by the footnote.
'text' | "text" specifies the text of the footnote in single or double quotation marks.
SAS Options
Definition of Options
Data set options specify actions that apply only to the SAS data set with which they appear. They let you perform such operations as
renaming variables
selecting only the first or last n observations for processing
dropping variables from processing or from the output data set
specifying a password for a SAS data set.
Syntax of Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces. (option-1=value-1<. . . option-n=value-n>) These examples show data set options in SAS statements:
data scores(keep=team game1 game2 game3);
proc print data=new(drop=year);
set old(rename=(date=Start_Date));
DROP : Excludes variables from processing or from output SAS data sets.
Syntax DROP=variable(s) Syntax Description variable(s) lists one or more variable names. You can list the variables in any form that SAS allows.
Details : If the option is associated with an input data set, the variables are not available for processing. If the DROP= data set option is associated with an output data set, SAS does not write the variables to the output data set, but they are available for processing.
Excluding Variables from Input In this example, the variables SALARY and GENDER are not included in processing and they are not written to either output data set:
data plan1 plan2;
   set payroll(drop=salary gender);
   if hired<'01jan98'd then output plan1;
   else output plan2;
run;
You cannot use SALARY or GENDER in any logic in the DATA step because DROP= prevents the SET statement from reading them from PAYROLL.
KEEP : Specifies variables for processing or for writing to output SAS data sets
Syntax KEEP=variable(s) Syntax Description variable(s) lists one or more variable names. You can list the variables in any form that SAS allows.
Details : If the KEEP= data set option is associated with an input data set, only those variables that are listed after the KEEP= data set option are available for processing. If the KEEP= data set option is associated with an output data set, only the variables listed after the option are written to the output data set, but all variables are available for processing.
Details : If you use the RENAME= data set option when you create a data set, the new variable name is included in the output data set. If you use RENAME= on an input data set, the new name is used in DATA step programming statements.
Renaming a Variable at Time of Output This example uses RENAME= in the DATA statement to show that the variable is renamed at the time it is written to the output data set. The variable keeps its original name, X, during the DATA step processing:
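The slide's code is not preserved, so the following is only a sketch of the idea. The data sets ONE and TWO, the variables Y and Z, and the new name KEYS are assumptions.

```sas
/* Hypothetical input data set ONE with variables X and Y */
data one;
   input x y;
   datalines;
1 2
3 4
;
run;

/* X is renamed to KEYS only when TWO is written */
data two(rename=(x=keys));
   set one;
   z = x + y;   /* inside the step the original name X is still used */
run;
```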
Details : The FIRSTOBS= data set option is valid when an existing SAS data set is read. You cannot use this option when a WHERE statement or WHERE= data set option is specified in the same DATA or PROC step.
This PROC step prints the data set STUDY beginning with observation 20:
This SET statement uses both FIRSTOBS= and OBS= to read only observations 5 through 10 from the data set STUDY. Data set NEW contains six observations.
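The two cases described above might be written as follows. The slide's own code is not preserved; the step that builds STUDY, with 30 observations and a variable SUBJECT, is a made-up stand-in so that the sketch can run on its own.

```sas
/* Stand-in for the STUDY data set: 30 observations (assumption) */
data study;
   do subject = 1 to 30;
      output;
   end;
run;

/* Prints STUDY beginning with observation 20 */
proc print data=study(firstobs=20);
run;

/* Reads only observations 5 through 10, so NEW has six observations */
data new;
   set study(firstobs=5 obs=10);
run;
```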
Syntax OBS=n|MAX Syntax Description n specifies a positive integer that is less than or equal to the number of observations in the data set or zero. MAX represents the total number of observations in the data set. Details : This option specifies the number of the last observation to process, not how many observations should be processed. It is valid only when an existing SAS data set is read. Use OBS=0 to create an empty data set that has the structure, but not the attributes, of another data set.
In this example, the OBS= data set option in the SET statement reads in the first ten observations from data set OLD:
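A sketch of such a step is shown here. The contents of OLD (25 observations with a counter variable I) are invented so that the example is self-contained.

```sas
/* Stand-in for data set OLD (assumption: 25 observations) */
data old;
   do i = 1 to 25;
      output;
   end;
run;

/* OBS=10 names the last observation to read, so NEW gets observations 1-10 */
data new;
   set old(obs=10);
run;
```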
Selecting Observations from an Input Data Set This example uses the WHERE= data set option to subset the SALES data set as it is read into another data set:
Selecting Observations from an Output Data Set This example uses the WHERE= data set option to subset the SALES output data set:
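Both placements of WHERE= are sketched below. The variables PRODUCT and AMOUNT, the value 'whizmo', and the output names WHIZMO_IN and WHIZMO_OUT are hypothetical.

```sas
/* Hypothetical SALES data set */
data sales;
   input product $ amount;
   datalines;
whizmo 120
gadget 75
whizmo 60
;
run;

/* WHERE= on the input data set: only matching observations are read */
data whizmo_in;
   set sales(where=(product='whizmo'));
run;

/* WHERE= on the output data set: all observations are read and processed,
   but only matching ones are written */
data whizmo_out(where=(product='whizmo'));
   set sales;
run;
```

Both steps produce the same two observations here. The difference matters when the DATA step computes values from observations that are later filtered out.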
SAS Format
Definition of Format
A format is an instruction that SAS uses to write data values. You use formats to control the written appearance of data values, or, in some cases, to group data values together for analysis. For example, the WORDS22. format, which converts numeric values to their equivalent in words, writes the numeric value 692 as six hundred ninety-two
Syntax of Format
SAS formats have the following form: <$>format<w>.<d>
where
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
DOLLARw.d: Writes numeric values with dollar signs, commas, and decimal points
Syntax DOLLARw.d Syntax Description w specifies the width of the output field. d optionally specifies the number of digits to the right of the decimal point in the numeric value. Details:The DOLLARw.d format writes numeric values with a leading dollar sign, with a comma that separates every three digits, and a period that separates the decimal fraction
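A short sketch of the format in use; the variable NETPAY and its value are illustrative.

```sas
data _null_;
   netpay = 1254.71;
   /* Width 10 with 2 decimal places: writes $1,254.71 to the log */
   put netpay dollar10.2;
run;
```

The width w has to leave room for the dollar sign, the commas, and the decimal point as well as the digits.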
SAS Informat
Definition of Informat
An informat is an instruction that SAS uses to read data values into a variable. Unless you explicitly define a variable first, SAS uses the informat to determine whether the variable is numeric or character. SAS also uses the informat to determine the length of character variables.
Syntax of Informat
SAS informats have the following form: <$>informat<w>.<d>
where
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width, which for most informats is the number of columns in the input data.
d specifies an optional decimal scaling factor in the numeric informats. SAS divides the input data by 10 to the power of d when the input value contains no decimal point.
Syntax Description w specifies the width of the input field. Details:The $CHARw. informat does not trim leading and trailing blanks or convert a single period in the input data field to a blank before storing values. If you use $CHARw. in an INFORMAT or ATTRIB statement within a DATA step to read list input, then by default SAS interprets any blank embedded within data as a field delimiter, including leading blanks.
Syntax $w. Syntax Description w specifies the width of the input field. You must specify w because SAS does not supply a default value. Details: The $w. informat trims leading blanks and left aligns the values before storing the text. In addition, if a field contains only blanks and a single period, $w. converts the period to a blank because it interprets the period as a missing value. The $w. informat treats two or more periods in a field as character data.
SAS Procedures
Proc Print
The PRINT procedure prints the observations in a SAS data set, using all or some of the variables.
Syntax
Options: Obs=n
Hands-On Exercise
Create a List Report from the Cake Dataset displaying the first 10 Observations Expected Output:
Solution
proc print data = cake (obs=10);
   var lastname flavor pscore;
run;
Proc Format
The FORMAT procedure enables you to define your own informats and formats for variables.
Syntax
Example
Proc Format;
   value $gender 'm'='Male'
                 'f'='Female';
Run;
Proc Print data=customer.demo;
   format sex $gender.;
run;
Output
Cust_id   Name          Address         Sex
2335      Jimmy         Birmingham,UK   Male
5889      Chen, Len     Birmingham,UK   Female
3878      Davis, Brad   Plymouth,UK     Male
4553      Maria         Miami USA
Hands-On Exercise
Display the values of the variable SEX in Class dataset as 1 and 2 , instead of the default values.
Solution
proc format;
   value $sex 'M' = '1'
              'F' = '2';
run;
proc print data = train.class;
   format sex $sex.;
run;
Proc Sort
The SORT procedure sorts observations in a SAS data set by one or more character or numeric variables, either replacing the original data set or creating a new, sorted data set.
Syntax
PROC SORT <option(s)> <collating-sequence-option>;
   BY <DESCENDING> variable-1 <...<DESCENDING> variable-n>;
Option(s): NODUPKEY
Example
Account
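[Table: the ACCOUNT data set, with the variables Company, Debt, and Town. The rows shown include Paul's Pizza (debt 83.00), World Wide Electronics (119.95), and Strickland Industries (657.22); the towns include Apex and Garner.]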
proc sort data=account out=bytown;
   by town company;
run;
proc print data=bytown;
   var company town debt;
   title 'Customers with Past-Due Accounts';
   title2 'Listed Alphabetically within Town';
run;
Output
Customers with Past-Due Accounts
Listed Alphabetically within Town
Obs    Company    Town    Debt
[Output rows: the observations of BYTOWN ordered by town and, within each town, by company; the companies listed include World Wide Electronics, Strickland Industries, Ice Cream Delight, and Paul's Pizza.]
Duplicate observations
proc sort data=account out=towns nodupkey;
   by town;
run;
proc print data=towns;
   var town company debt;
   title 'Towns of Customers with Past-Due Accounts';
run;
Output
Towns of Customers with Past-Due Accounts
Obs    Town    Company                    Debt
 1             Paul's Pizza
 2             World Wide Electronics
 3             Ice Cream Delight
 4             Strickland Industries
Hands-On Exercise
Sort the DUPOBS data set to create data set A. Sort the data by descending TOURTYPE, descending VENDOR, and ascending LANDCOST.
Expected Output
Solution
proc sort data = dupobs out = a;
   by descending tourtype descending vendor landcost;
run;
Proc Append
The APPEND procedure adds the observations from one SAS data set to the end of another SAS data set.
Remember: Both data sets should contain the same variables. When appending data sets with different variables, use the FORCE option.
Syntax
Example

Class 1
name      age   height
Andrew    15    67
Robert    15    78
Philip    14    70

Class 2
name      age   height
Andrea    16    60
Linda     13    55
Stephen   17    72
Example
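The code for this slide is not preserved. A sketch of the append, assuming the two tables are stored as data sets named CLASS1 and CLASS2 (hypothetical names) and using the values from the output listing, could be:

```sas
/* Base data set (values taken from the output listing) */
data class1;
   input name $ age height;
   datalines;
Andrew 15 67
Robert 15 78
Philip 14 70
;
run;

/* Data set whose observations are added to the end of CLASS1 */
data class2;
   input name $ age height;
   datalines;
Andrea 16 60
Linda 13 55
Stephen 17 72
;
run;

proc append base=class1 data=class2;
run;

proc print data=class1;
run;
```

After the step CLASS1 holds all six observations; CLASS2 is left unchanged.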
Output
name      age   height
Andrew    15    67
Robert    15    78
Philip    14    70
Andrea    16    60
Linda     13    55
Stephen   17    72
Hands On - Exercise
Append data set DEPT2 to DEPT1, so that DEPT1 is replaced by the appended data set.
Expected Output
Solution
proc append base = dept1 data=dept2; run;
Proc Contents
The CONTENTS procedure shows the contents of a SAS data set and prints the directory of the SAS data library.
Hands-On Exercise
Display the Dataset structure of DUPOBS dataset
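One possible solution is sketched below. It assumes that the course data set DUPOBS already exists in the WORK library, as in the earlier PROC SORT exercise.

```sas
/* Lists the variables of DUPOBS with their type, length, and other attributes */
proc contents data=dupobs;
run;
```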
Proc Transpose
The TRANSPOSE procedure creates an output data set by restructuring the values in a SAS data set, transposing selected variables into observations.
Original data
X    Y    Z
12   19   14
21   15   19
33   27   82
14   32   99
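As a sketch, the original data can be transposed so that X, Y, and Z become observations. The data set names ORIGINAL and TRANSPOSED are made up for illustration.

```sas
data original;
   input x y z;
   datalines;
12 19 14
21 15 19
33 27 82
14 32 99
;
run;

/* Each variable listed in VAR becomes one observation in the output */
proc transpose data=original out=transposed;
   var x y z;
run;

proc print data=transposed;
run;
```

The output data set has one observation per transposed variable, identified by the variable _NAME_, and one column (COL1 to COL4) per original observation.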
Syntax
PROC TRANSPOSE <DATA=input-data-set> <LABEL=label> <LET> <NAME=name> <OUT=output-data-set> <PREFIX=prefix>;
   BY <DESCENDING> variable-1 <...<DESCENDING> variable-n> <NOTSORTED>;
   COPY variable(s);
   ID variable;
   IDLABEL variable;
   VAR variable(s);
To do this                                                            Use this statement
Transpose each BY group                                               BY
Copy variables directly without transposing them                      COPY
Specify a variable whose values name the transposed variables         ID
Create labels for the transposed variables                            IDLABEL
List the variables to transpose                                       VAR
Hands-On Exercise
Transpose the dataset grp_tran in the Train library to get the output as shown below
Solution
proc sort data = grp_tran;
   by grp;
run;
proc transpose data = grp_tran out = result (drop= _name_);
   by grp;
run;
proc print data = result;
run;
Proc SQL
The SQL procedure implements Structured Query Language (SQL) for the SAS System. SQL is a standardized, widely used language that retrieves and updates data in tables and views based on those tables
Syntax
PROC SQL <option(s)>;
   ALTER TABLE table-name
      <constraint-clause> <,constraint-clause>...>;
      <ADD column-definition <,column-definition>...>
      <MODIFY column-definition <,column-definition>...>
      <DROP column <,column>...>;
Syntax
SELECT <DISTINCT> object-item <,object-item>...
   FROM from-list
   <WHERE sql-expression>
   <GROUP BY group-by-item <,group-by-item>...>
   <HAVING sql-expression>
   <ORDER BY order-by-item <,order-by-item>...>;
Example
Proc Sql;
   select Lname, Fname, City, State, IdNumber, Salary, Jobcode
      from staff, payroll
      where idnumber=idnum;
Quit;
SAS Training
172
Hands-On Exercise
Create Table for Females from the Class dataset using Proc Sql. Expected Output
Solution
proc sql;
    create table females as
        select * from class
        where sex = 'F';
    select * from females;
quit;
Proc GPLOT
The GPLOT procedure plots the values of two or more variables on a set of coordinate axes (X and Y). The coordinates of each point on the plot correspond to two variable values in an observation of the input data set.
Syntax
PROC GPLOT <DATA=input-data-set> <option(s)>;
PLOT plot-request(s) </ option(s)>;
PLOT2 plot-request(s) </ option(s)>;
Example
goptions reset = all;
proc gplot data = dupobs;
    plot landcost*country;
run;
Example
goptions reset = all;
symbol1 color = green value = triangle;
symbol2 color = blue value = circle;
symbol3 color = red value = square;
proc gplot data = dupobs;
    plot landcost*country = vendor;
run;
PROC MEANS
The MEANS procedure provides data summarization tools to compute descriptive statistics for variables across all observations and within groups of observations.
Syntax
PROC MEANS <option(s)>;
BY <DESCENDING> variable-1 <... <DESCENDING> variable-n> <NOTSORTED>;
CLASS variable(s) </ option(s)>;
OUTPUT <OUT=SAS-data-set> <output-statistic-specification(s)>;
proc sort data = temp;
    by cat;
run;
proc means data = temp;
    by cat;
    output out = temp1;
run;
proc means data = sales;
    class prod;
    output out = temp n = Cost_n sal_n mean = Cost_m Sale_m;
run;
PROC Summary
The SUMMARY procedure provides data summarization tools that compute descriptive statistics for variables across all observations or within groups of observations.
Syntax
PROC SUMMARY <option(s)> <statistic-keyword(s)>;
BY <DESCENDING> variable-1 <...<DESCENDING> variable-n> <NOTSORTED>;
CLASS variable(s) </ option(s)>;
FREQ variable;
OUTPUT <OUT=SAS-data-set> <output-statistic-specification(s)> <id-group-specification(s)> <maximum-id-specification(s)> <minimum-id-specification(s)> </ option(s)>;
VAR variable(s) </ WEIGHT=weight-variable>;
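A minimal sketch of the syntax in use, assuming the sample data set SASHELP.CLASS that ships with SAS is available. The output data set name class_stats and the statistic variable names are assumptions chosen for illustration.

```sas
/* Mean and count of height and weight within each value of SEX */
proc summary data=sashelp.class nway;
    class sex;
    var height weight;
    output out=class_stats
           mean=mean_height mean_weight
           n=n_height n_weight;
run;

/* PROC SUMMARY produces no printed output by default, so print the result */
proc print data=class_stats noobs;
run;
```

The NWAY option keeps only the rows for the full CLASS combination. Without it, the output data set also contains an overall summary row with _TYPE_ equal to 0.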
PROC REPORT
Overview:
The REPORT procedure combines features of the PRINT, MEANS, and TABULATE procedures with features of the DATA step in a single report-writing tool that can produce a variety of reports
proc report data = rep nowd headline headskip;
    column product sales;
    define product / 'Product Name' order;
    define sales / 'Sales Occurred' format=8.2;
run;
COLUMN statement: each column specification takes one of these forms:
report-item(s)
report-item-1, report-item-2 <. . . , report-item-n>
('header-1' <. . . 'header-n'> report-item(s))
report-item=name
In the DEFINE statement, decide how each variable is used. The usages are DISPLAY, ORDER, ACROSS, GROUP, ANALYSIS, and COMPUTED.
DISPLAY: does not affect the order of the rows in the report
ORDER: orders the rows by the variable's values (ascending, descending, or formatted)
ACROSS: creates a column for each value of an across variable
GROUP: consolidates observations that share a value into groups
ANALYSIS: used to calculate statistics
COMPUTED: variables defined for the report that are not in the input data set
WIDTH=
DESCENDING
SUMMARIZE
PAGE: Start a new page after the last break line of a break located at the beginning of the report
Example
Input Data set
proc report data = rep nowd headline headskip split='*';
    column ('This is sales report' product N sales sales,min discount newprice);
    define product / 'Product Name' group;
    define sales / 'My*Sales' format=8.2;
    define min / 'Min';
    define newprice / 'Discount Price' computed;
    compute newprice;
        newprice = _c3_ - _c5_;
    endcomp;
    break after product / skip;
run;
SAS Date
A SAS date is a value that represents the number of days between January 1, 1960, and a specified date. SAS can perform calculations on dates ranging from A.D. 1582 to A.D. 19,900. Dates before January 1, 1960, are negative numbers; dates after are positive numbers.
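A small sketch of the day-count idea, using date literals. The variable names are assumptions for illustration.

```sas
data _null_;
    epoch  = '01JAN1960'd;   /* day zero */
    before = '31DEC1959'd;   /* one day earlier, so negative */
    after  = '16MAR2002'd;   /* a later date, so positive */
    put epoch= before= after=;
run;
```

The log shows epoch=0 before=-1 after=15415.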
How SAS Converts Calendar Dates to SAS Date Values :
Working with SAS Dates :
The SAS System converts date values back and forth between calendar dates with SAS language elements called formats and informats. Formats present a value, recognized by SAS, such as a date value, as a calendar date in a variety of lengths and notations. Informats read notations or a value, such as a calendar date, which may be in a variety of lengths, and then convert the data to a SAS date.
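The two directions can be sketched in one DATA step: an informat turns calendar text into a SAS date value, and a format turns the value back into calendar text. The variable name d and the sample date are assumptions.

```sas
data _null_;
    /* Informat: read the text 16/03/2002 as a SAS date value */
    d = input('16/03/2002', ddmmyy10.);
    put d=;               /* the stored number of days */
    /* Formats: write the same value in two calendar notations */
    put d= date9.;
    put d= mmddyy10.;
run;
```

The log shows d=15415, then d=16MAR2002, then d=03/16/2002.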
Example: Reading, Writing, and Calculating Date Values
options nodate pageno=1 linesize=80 pagesize=60;
data meeting;
    input region $ mtg : mmddyy8.;
    sendmail = mtg - 45;
    datalines;
N 11-24-99
S 12-28-99
E 12-03-99
W 10-04-99
;
proc print data=meeting;
    format mtg sendmail date9.;
    title 'When To Send Announcements';
run;
Date formats
DATEw. format: Writes date values in the form ddmmmyy or ddmmmyyyy.
Syntax: DATEw.
Examples
The example table uses the input value of 15415, which is the SAS date value that corresponds to March 16, 2002.
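The example table itself did not survive extraction. As a stand-in, a minimal sketch that writes the value 15415 with three widths of the DATEw. format (the variable name day is an assumption):

```sas
data _null_;
    day = 15415;
    put day date5.;   /* 16MAR */
    put day date7.;   /* 16MAR02 */
    put day date9.;   /* 16MAR2002 */
run;
```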
DDMMYYw. format: Writes date values in the form ddmmyy or ddmmyyyy.
Syntax: DDMMYYw.
Examples : The example table uses the input value of 15415, which is the SAS date value that corresponds to March 16, 2002.
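The example table itself did not survive extraction. As a stand-in, a minimal sketch that writes the value 15415 with three widths of the DDMMYYw. format (the variable name day is an assumption):

```sas
data _null_;
    day = 15415;
    put day ddmmyy6.;    /* 160302 */
    put day ddmmyy8.;    /* 16/03/02 */
    put day ddmmyy10.;   /* 16/03/2002 */
run;
```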
Date informats
DATEw. informat: Reads date values in the form ddmmmyy or ddmmmyyyy.
Syntax: DATEw.
Example : input calendar_date date11.;
DDMMYYw. informat: Reads date values in the form ddmmyy or ddmmyyyy.
Syntax: DDMMYYw.
Example : input calendar_date ddmmyy10.;
Date functions
DATE function: Returns the current date as a SAS date value.
Syntax: DATE()
Details: The DATE function produces the current date in the form of a SAS date value, which is the number of days since January 1, 1960.
Example:
tday = date();
put tday ddmmyy8.;
TODAY function: Returns the current date as a SAS date value.
Syntax: TODAY()
Details: TODAY is identical to the DATE function. It produces the current date in the form of a SAS date value, which is the number of days since January 1, 1960.
YEAR function: Returns the year from a SAS date value.
Syntax: YEAR(date)
Details: The YEAR function produces a four-digit numeric value that represents the year.
Example :
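The slide's own example is missing from the extracted text. A minimal sketch, with the sample date and variable names chosen as assumptions:

```sas
data _null_;
    d = '16MAR2002'd;
    y = year(d);
    put y=;   /* y=2002 */
run;
```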
MONTH function: Returns the month from a SAS date value.
Syntax: MONTH(date)
Details: The MONTH function returns a numeric value that represents the month from a SAS date value. Numeric values can range from 1 through 12.
Example :
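The slide's own example is missing from the extracted text. A minimal sketch, with the sample date and variable names chosen as assumptions:

```sas
data _null_;
    d = '16MAR2002'd;
    m = month(d);
    put m=;   /* m=3 */
run;
```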
DAY function: Returns the day of the month from a SAS date value.
Syntax: DAY(date)
Details: The DAY function produces an integer from 1 to 31 that represents the day of the month.
Example :
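The slide's own example is missing from the extracted text. A minimal sketch, with the sample date and variable names chosen as assumptions:

```sas
data _null_;
    d = '16MAR2002'd;
    dom = day(d);
    put dom=;   /* dom=16 */
run;
```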
QTR function: Returns the quarter of the year from a SAS date value.
Syntax: QTR(date)
Details: The QTR function returns a value of 1, 2, 3, or 4 from a SAS date value to indicate the quarter of the year in which a date value falls.
Example :
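The slide's own example is missing from the extracted text. A minimal sketch, with the sample date and variable names chosen as assumptions:

```sas
data _null_;
    d = '16MAR2002'd;
    q = qtr(d);
    put q=;   /* q=1, because March falls in the first quarter */
run;
```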
WEEKDAY function: Returns the day of the week from a SAS date value.
Syntax: WEEKDAY(date)
Details: The WEEKDAY function produces an integer that represents the day of the week, where 1=Sunday, 2=Monday, . . . , 7=Saturday.
Example :
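The slide's own example is missing from the extracted text. A minimal sketch, with the sample date and variable names chosen as assumptions:

```sas
data _null_;
    d = '16MAR2002'd;    /* a Saturday */
    wd = weekday(d);
    put wd=;             /* wd=7 */
run;
```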
MDY function: Returns a SAS date value from month, day, and year values.
Syntax: MDY(month,day,year)
Example :
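The slide's own example is missing from the extracted text. A minimal sketch, with the sample values and variable name chosen as assumptions:

```sas
data _null_;
    d = mdy(3, 16, 2002);
    put d=;          /* d=15415 */
    put d= date9.;   /* d=16MAR2002 */
run;
```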
INTRODUCTION TO MACROS
The macro facility is a tool for extending and customizing the SAS System and for reducing the amount of text you must enter to do common tasks
Macro variables are an efficient way of replacing text strings in SAS code.
The simplest way to define a macro variable is to use the %LET statement to assign the macro variable a name and a value.
Eg: %let macname = Test;
Resolve a macro variable's value using the & symbol.
Eg:
%let macname = Test;
title "This is a &macname";
The macro processor resolves the reference to the macro variable MACNAME, and the statement becomes
title "This is a Test";
Macro variable references resolve inside double quotation marks but not inside single quotation marks.
Generating SAS Code Using Macros
Macros let you run the same SAS code many times without rewriting it; the macro definition is compiled once and can then be invoked repeatedly.
A macro definition is placed between a %MACRO statement and a %MEND (macro end) statement, as follows:
%MACRO macro-name;
    SAS code
%MEND macro-name;
Execute the macro using %macro-name
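A minimal sketch of a macro definition and call. The slide that showed the original macro is missing, so the macro name addhello and the one-row input data set temp are assumptions for illustration only.

```sas
/* Hypothetical input data set so the example is self-contained */
data temp;
    y = 1;
run;

/* Definition: the text between %MACRO and %MEND is stored, not run */
%macro addhello;
    data temp1;
        set temp;
        x = "Hello";
    run;
%mend addhello;

/* Invocation: the stored DATA step is generated and executed */
%addhello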
Execution of this macro produces the following program:
data temp1;
    set temp;
    x = "Hello";
run;
Hands On Exercise
Create a macro datcrt that creates a data set cake1 from train.cake
Solution
%macro datcrt;
    data cake1;
        set train.cake;
    run;
%mend datcrt;
%datcrt;
A macro variable defined in parentheses in a %MACRO statement is a macro parameter.
Macro parameters allow you to pass information into a macro.
Hands On Exercise
Create a Macro datprnt that prints the data set cake and dept1 when passed as a macro parameter
Solution
%macro datprnt (dat= );
    proc print data = &dat;
    run;
%mend datprnt;
%datprnt (dat=cake);
%datprnt (dat=dept1);
Conditionally Generating SAS Code
Use %IF-%THEN-%ELSE macro statements to conditionally generate SAS code with a macro.
Eg:
%macro test (info= , mydata = );
    %if &info = print %then %do;
        proc print data = &mydata;
        run;
    %end;
    %else %if &info = report %then %do;
        proc report data = &mydata;
        run;
    %end;
%mend test;
%test(info=print, mydata=data1); /* Calling the macro */
Result of the macro execution:
proc print data = data1;
run;
Macro Variables
Macro variables are tools that enable you to dynamically modify the text in a SAS program through symbolic substitution. You can assign large or small amounts of text to macro variables, and after that, you can use that text by simply referencing the variable that contains it.
Macro variables defined by macro programmers are called user-defined macro variables.
Eg: %let name = Henry Woodbridge is Male;
Those defined by the SAS System are called automatic macro variables.
Eg: SYSDATE, SYSDAY, SYSPROCESSID, etc.
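A short sketch contrasting the two kinds of macro variable. The %PUT statement writes text to the SAS log; the wording of the messages is an assumption.

```sas
/* User-defined macro variable */
%let name = Henry Woodbridge is Male;
%put User-defined value: &name;

/* Automatic macro variables supplied by SAS */
%put Session started on &sysdate, which was a &sysday;

/* List every user-defined macro variable currently in the symbol tables */
%put _user_;
```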
User-defined macro variables can be created with any of the following:
%LET statement
iterative %DO statement
%GLOBAL statement
%INPUT statement
INTO clause of the SELECT statement in SQL
%LOCAL statement
%MACRO statement
SYMPUT routine
%WINDOW statement
title1 "Contents of Data Set &dsn";
data temp;
    set &dsn;
    if age >= 20;
run;
%let name=sales;
data new&name;
    set save.&name;
    more SAS statements
run;
The SAS System sees it as
DATA NEWSALES;
    SET SAVE.SALES;
    more SAS statements
RUN;
Eg:
DATA=PERSNL&YR.EMPLOYES, where &YR contains two characters for a year
DATA=&MONTH&YR
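A short sketch of how these combined references resolve. The period after a macro variable name acts as a delimiter that marks the end of the name and is consumed during resolution. The values assigned here are assumptions for illustration.

```sas
%let yr    = 99;
%let month = JAN;
%let lib   = save;
%let name  = sales;

%put PERSNL&yr.EMPLOYES;   /* PERSNL99EMPLOYES: the period ends the name YR */
%put &month&yr;            /* JAN99: two references side by side */
%put new&name;             /* newsales: text before a reference needs no delimiter */
%put &lib..&name;          /* save.sales: first period is the delimiter, second is kept */
```

Without the period, PERSNL&yrEMPLOYES would be read as a reference to a macro variable named YREMPLOYES.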
Hands On Exercise
Create a data set newcake from cake by passing cake as a macro parameter
Solution
%macro app(dat=);
    data new&dat;
        set &dat;
    run;
%mend app;
%app(dat=cake);
Hands On Exercise
Create a data set cakenew from cake by passing cake as a macro parameter
Solution
%macro app(dat=);
    data &dat.new;
        set &dat;
    run;
%mend app;
%app(dat=cake);
MLOGIC(TEST): Beginning execution.
MLOGIC(TEST): Parameter X has value 1
SYMBOLGEN: Macro variable X resolves to 1
MLOGIC(TEST): %IF condition &x = 1 is TRUE
MLOGIC(TEST): %PUT "Value of x is 1"
"Value of x is 1"
MPRINT(TEST): proc print data = train.names;
MPRINT(TEST): run;
NOTE: There were 4 observations read from the data set TRAIN.NAMES.
NOTE: PROCEDURE PRINT used:
      real time 0.01 seconds
      cpu time 0.00 seconds
MLOGIC(TEST): Ending execution.
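The macro that produced this log is not in the extracted text. The following is a plausible reconstruction inferred from the log lines, not the original code; it assumes the data set train.names exists. The system options MLOGIC, SYMBOLGEN, and MPRINT are what write the three kinds of trace lines.

```sas
/* Turn on macro debugging output in the log */
options mlogic symbolgen mprint;

/* Reconstructed from the log: a positional parameter X and one %IF test */
%macro test(x);
    %if &x = 1 %then %do;
        %put "Value of x is 1";
        proc print data = train.names;
        run;
    %end;
%mend test;

%test(1)

/* Turn the debugging options off again */
options nomlogic nosymbolgen nomprint;
```

MLOGIC traces macro execution and %IF decisions, SYMBOLGEN shows what each macro variable reference resolves to, and MPRINT shows the SAS statements the macro generates.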
Every macro variable has a scope. A macro variable's scope determines how it is assigned values and how the macro processor resolves references to it.
Two types of scope exist for macro variables: global and local.
Global macro variables exist for the duration of the SAS session and can be referenced anywhere in the program, either inside or outside a macro.
Local macro variables exist only during the execution of the macro in which the variables are created and have no meaning outside the defining macro.
Eg:
%macro A;
    %let Loc1 = xyz;
    %macro B;
        %let Loc2 = abc;
    %mend B;
%mend A;
LOC1 is local to both A and B. However, LOC2 is local only to B.
Macro variables are stored in symbol tables, which list each macro variable's name and its value. There is a global symbol table, which stores all global macro variables. Local macro variables are stored in a local symbol table that is created at the beginning of the execution of a macro.
%let county=Clark;
%macro concat;
    data _null_;
        length longname $20;
        longname="&county"||" County";
        put longname;
    run;
%mend concat;
%concat
Calling the macro CONCAT produces the following statements:
data _null_;
    length longname $20;
    longname="Clark"||" County";
    put longname;
run;
The PUT statement writes the following to the SAS log:
Clark County
The new macro variable definition simply updates the existing global one.
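The example this sentence referred to is missing from the extracted text. A minimal sketch of the behavior, with the macro name relocate and the new value as assumptions:

```sas
%let county = Clark;          /* global macro variable */

%macro relocate;
    /* No local COUNTY exists, so this %LET updates the global one */
    %let county = Washoe;
%mend relocate;

%relocate
%put county is now &county;   /* county is now Washoe */
```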
Local macro variables are defined within an individual macro. Each macro you invoke creates its own local symbol table. Local macro variables exist only as long as a particular macro executes; when the macro stops executing, all local macro variables for that macro cease to exist.
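A minimal sketch of a local variable shadowing a global one, assuming a hypothetical macro named show. The %LOCAL statement forces CITY into the macro's own symbol table.

```sas
%let city = Reno;             /* global macro variable */

%macro show;
    %local city;              /* separate CITY in the local symbol table */
    %let city = Elko;
    %put inside macro: &city;
%mend show;

%show
%put outside macro: &city;
```

The log shows Elko inside the macro and Reno outside it, because the local CITY disappears when SHOW stops executing.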
CALL SYMPUT
Syntax: CALL SYMPUT(macro-variable, value);
Eg:
1) call symput('new','testing');
Assigns the character string testing to the macro variable NEW.
2)
data team1;
    input position : $8. player : $12.;
    call symput(position, player);
    cards;
shortstp Ann
pitcher Tom
frstbase Bill
;
This DATA step creates the three macro variables SHORTSTP, PITCHER, and FRSTBASE and assigns them the values Ann, Tom, and Bill, respectively.
3)
data team2;
    input position : $12. player : $12.;
    call symput('POS'||left(_n_), position);
    cards;
shortstp Ann
pitcher Tom
frstbase Bill
;
This form is useful for creating a series of macro variables. For example, the CALL SYMPUT statement builds a series of macro variable names by combining the character string POS and the left-aligned value of _N_, and assigns values to the macro variables POS1, POS2, and POS3.
Hands On Exercise
Create macro variables ssn1, ssn2, ..., ssnN from the ssn variable in the dept1 data set using CALL SYMPUT
Solution
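The slide's solution is missing from the extracted text. One possible solution is sketched below, following the POS1, POS2 pattern shown earlier. It assumes ssn is a character variable; the small dept1 data set here is a hypothetical stand-in so the example runs on its own, and should be skipped if dept1 already exists.

```sas
/* Hypothetical stand-in for dept1 (skip this step if you already have it) */
data dept1;
    input ssn : $11. name : $12.;
    datalines;
111-11-1111 Ann
222-22-2222 Tom
333-33-3333 Bill
;
run;

/* One macro variable per observation: ssn1, ssn2, ... */
data _null_;
    set dept1;
    call symput('ssn' || left(_n_), trim(ssn));
run;

/* Check the first three values in the log */
%put &ssn1 &ssn2 &ssn3;
```

Using _N_ as the suffix ties each macro variable to the observation number, so the series is as long as the data set.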
SYMGET
Syntax: SYMGET(argument)
Eg:
1) x=symget('g');
Assigns the value of the macro variable G to the DATA step variable X.
2) length key $ 8; input code $; key=symget(code);
Assigns to the DATA step variable KEY the value of the macro variable whose name is stored in the DATA step variable CODE.
3) score=symget('s'||left(_n_));
Assigns to SCORE the value of the macro variable whose name is formed from the letter s and the number of the current iteration (_N_).
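A runnable sketch of the first form, with the value of G chosen as an assumption. SYMGET retrieves the macro variable's value while the DATA step is executing.

```sas
%let g = hello;

data _null_;
    length x $ 8;
    x = symget('g');   /* look up macro variable G at execution time */
    put x=;            /* x=hello */
run;
```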
Macro Quoting
Macro quoting functions tell the macro processor to interpret special characters and mnemonics as text rather than as part of the macro language. If you did not use a macro quoting function to mask the special characters, the macro processor or the rest of the SAS System might give the character a meaning you did not intend.
Eg:
Is %sign a call to the macro SIGN or a phrase "percent sign"?
Is OR the mnemonic Boolean operator or the abbreviation for Oregon?
Is the quote in O'Malley an unbalanced single quotation mark or just part of the name?
Is Boys&Girls a reference to the macro variable &GIRLS or a group of children?
The following macro quoting functions are most commonly used:
%STR and %NRSTR
%BQUOTE and %NRBQUOTE
%SUPERQ
Eg:
%let print=proc print; run;; /* ERROR */
To avoid the ambiguity and correctly assign the value of PRINT, you must mask the semicolons with the macro quoting function %STR, as follows:
%let print=%str(proc print; run;);
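A short sketch applying %STR and %NRSTR to three of the ambiguous cases mentioned earlier. The macro variable names heading and surname are assumptions.

```sas
/* %STR masks the semicolons so the whole text is stored in PRINT */
%let print = %str(proc print; run;);
%put &print;

/* %NRSTR also masks & and %, so &Girls is not treated as a reference */
%let heading = %nrstr(Boys&Girls);
%put &heading;

/* Inside %STR, an unmatched quotation mark is marked with a preceding % */
%let surname = %str(O%'Malley);
%put &surname;
```

The three %PUT statements write proc print; run; then Boys&Girls then O'Malley to the log.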