
SAS Programming by Example [11]

Example 1 Formatting Values in a Questionnaire
Features: PROC FORMAT, Character Formats, Numeric Formats, Numeric Ranges, PUT Function, OTHER= Option

PROC FORMAT;
   VALUE GENDER 1 = "Male"
                2 = "Female"
                . = "Missing"
                OTHER = "Miscoded";
   VALUE $RACE "C" = "Caucasian"
               "A" = "African American"
               "H" = "Hispanic"
               "N" = "Native American"
               OTHER = "Other"
               " " = "Missing";
   VALUE $LIKERT "1" = "Str dis"
                 "2" = "Disagree"
                 "3" = "No opinion"
                 "4" = "Agree"
                 "5" = "Str agree"
                 OTHER = " ";
   VALUE AGEGROUP LOW-<20 = "< 20"
                  20-<40  = "20 to <40"
                  40-<60  = "40 to <60"
                  60-HIGH = "60+";
DATA QUESTION;
   INPUT ID $ 1-2 GENDER 4 RACE $ 6 AGE 8-9 SATISFY $ 11 TIME $ 13;
   FORMAT GENDER GENDER. RACE $RACE. SATISFY TIME $LIKERT.;
   AGEGROUP = PUT(AGE, AGEGROUP.);
DATALINES;
01 1 C 45 4 2
02 2 A 34 5 4
03 1 C 67 3 4
04   N 18 5 5
05 9 H 47 4 2
06 1 X 55 3 3
07 2   56 2 2
08     20 1 1
RUN;
PROC PRINT DATA=QUESTION NOOBS;
   TITLE "Data listing with formatted values";
RUN;
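The numeric ranges in AGEGROUP can be checked in isolation. The following is a minimal sketch, not part of the original example: it repeats the AGEGROUP definition from above and writes a few boundary ages to the log. The variable name GROUP and the choice of test ages are assumptions made for illustration.

```sas
/* Same AGEGROUP definition as in Example 1 */
PROC FORMAT;
   VALUE AGEGROUP LOW-<20 = "< 20"
                  20-<40  = "20 to <40"
                  40-<60  = "40 to <60"
                  60-HIGH = "60+";
RUN;

/* Write each test age and its formatted group to the SAS log. */
/* GROUP is a hypothetical variable name used only here.       */
DATA _NULL_;
   DO AGE = 19.9, 20, 39.9, 40, 60;
      GROUP = PUT(AGE, AGEGROUP.);
      PUT AGE= GROUP=;
   END;
RUN;
```

Because `-<` excludes the upper endpoint of a range, an age of exactly 20 should land in "20 to <40" rather than "< 20", and an age of exactly 60 should land in "60+".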

Note: $LIKERT. is assigned to more than one variable (SATISFY and TIME). Notice the period (.) after each format name in the FORMAT statement. The PUT function is used to create a character variable (AGEGROUP).

ID  GENDER    RACE              AGE  SATISFY     TIME        AGEGROUP
01  Male      Caucasian          45  Agree       Disagree    40 to <60
02  Female    African American   34  Str agree   Agree       20 to <40
03  Male      Caucasian          67  No opinion  Agree       60+
04  Missing   Native American    18  Str agree   Str agree   < 20
05  Miscoded  Hispanic           47  Agree       Disagree    40 to <60
06  Male      Other              55  No opinion  No opinion  40 to <60
07  Female    Missing            56  Disagree    Disagree    40 to <60
08  Missing   Missing            20  Str dis     Str dis     20 to <40

Example 2 Encountering a Subtle Problem with Missing Values, Formats, and PROC FREQ
Features: PROC FORMAT, PROC FREQ

PROC FORMAT;
   VALUE BADFMT 1="ONE"
                2="TWO"
                OTHER="MISCODED";
RUN;
DATA TEST;
   INPUT X Y;
DATALINES;
1 1
2 2
5 5
3 .
;
PROC FREQ DATA=TEST;
   TABLES X Y;
   FORMAT X Y BADFMT.;
RUN;

                                 Cumulative   Cumulative
X          Frequency   Percent    Frequency      Percent
ONE            1         25.0         1            25.0
TWO            1         25.0         2            50.0
MISCODED       2         50.0         4           100.0

                                 Cumulative   Cumulative
Y          Frequency   Percent    Frequency      Percent
ONE            1         50.0         1            50.0
TWO            1         50.0         2           100.0

Frequency Missing = 2
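To see where the problem in Example 2 comes from, it may help to look at the formatted values directly. The sketch below is an illustration rather than part of the original example: it repeats the BADFMT definition and writes the formatted value of each Y value to the log. The variable name FORMATTED is an assumption made for this sketch.

```sas
/* Same BADFMT definition as in Example 2: no separate entry for missing */
PROC FORMAT;
   VALUE BADFMT 1="ONE"
                2="TWO"
                OTHER="MISCODED";
RUN;

/* Show how each raw Y value is labeled by BADFMT.   */
/* FORMATTED is a hypothetical variable name.        */
DATA _NULL_;
   DO Y = 1, 2, 5, .;
      FORMATTED = PUT(Y, BADFMT.);
      PUT Y= FORMATTED=;
   END;
RUN;
```

Since BADFMT has no separate entry for a missing value, both 5 and the missing value (.) fall under OTHER and receive the same label, MISCODED. PROC FREQ groups observations by formatted value, which is consistent with the two observations reported as "Frequency Missing = 2" for Y.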

Example 3 Resolving the Subtle Problem
Features: PROC FORMAT, Separate Missing Assignment

PROC FORMAT;
   VALUE GOODFMT . = 'MISSING'
                 1 = 'ONE'
                 2 = 'TWO'
                 OTHER = 'MISCODED';
RUN;

Example 4 Checking for Invalid Values: A DATA Step Approach (Setting Invalid Values to Missing)
Features: IN Operator, Character Missing Values

DATA SCREEN4;
   INPUT ID 1-3 GENDER $ 4 ...;
   IF GENDER NOT IN ('M','F') THEN GENDER=' ';

Example 5 Checking for Invalid Values: A DATA Step Approach (Separating Invalid Values from Missing)
Features: IN Operator, Character Missing Values

DATA SCREEN5;
   INPUT ID 1-3 GENDER $ 4 ...;
   IF GENDER NOT IN ('M','F',' ') THEN GENDER='X';

Example 6 Using a User-Created Informat to Filter Input Data (Setting Invalid Values to Missing)
Features: PROC FORMAT, INVALUE Statement, Informats, IN Operator, _SAME_ Value

PROC FORMAT;
   INVALUE $GENDER 'M', 'F' = _SAME_
                   OTHER = ' ';
RUN;
DATA SCREEN6;
   INPUT @1 ID 3. @4 GENDER $GENDER1.;

Example 7 Using a User-Created Informat to Filter Input Data (Separating Invalid Values from Missing)

Features: PROC FORMAT, INVALUE Statement, Informats, IN Operator, _SAME_ Value

PROC FORMAT;
   INVALUE $GENDER 'M', 'F', ' ' = _SAME_
                   OTHER = 'X';
RUN;
DATA SCREEN7;
   INPUT @1 ID 3. @4 GENDER $GENDER1.;

Example 8 Checking Ranges for Numeric Variables
Features: PROC FORMAT, Informats, _SAME_ Value

PROC FORMAT;
   INVALUE SBPFMT 40 - 300 = _SAME_
                  OTHER = .;
   INVALUE DBPFMT 10 - 150 = _SAME_
                  OTHER = .;
RUN;
DATA FORMAT8;
   INPUT @1 ID $3. @4 SBP SBPFMT3. @7 DBP DBPFMT3.;
DATALINES;
001160090
002310220
003020008
004   080
005150070
;
PROC PRINT DATA=FORMAT8;
RUN;

Example 9 Using Different Missing Values to Keep Track of High and Low Values
Features: PROC FORMAT, INVALUE and VALUE Statements, _SAME_ Value, HIGH, LOW, Alternate Missing Values (.H and .L), MISSING Option Used with a TABLES Statement

PROC FORMAT;
   INVALUE SBPFMT LOW-<40  = .L
                  40-300   = _SAME_
                  301-HIGH = .H;
   INVALUE DBPFMT LOW-<10  = .L
                  10-150   = _SAME_
                  151-HIGH = .H;
   VALUE CHECK .H = "High"
               .L = "Low"
               .  = "Missing"
               OTHER = "Valid";

RUN;
DATA FORMAT9;
   INPUT @1 ID $3. @4 SBP SBPFMT3. @7 DBP DBPFMT3.;
DATALINES;
001160090
002310220
003020008
004   080
005150070
;
PROC PRINT DATA=FORMAT9 NOOBS;
   TITLE "Listing from Example 9";
RUN;

Listing from Example 9

ID   SBP  DBP
001  160   90
002    H    H
003    L    L
004    .   80
005  150   70

PROC FREQ DATA=FORMAT9;
   FORMAT SBP DBP CHECK.;
   TABLES SBP DBP / MISSING NOCUM;
RUN;

Listing from Example 9

SBP      Frequency  Percent
High         1        20.0
Low          1        20.0
Missing      1        20.0
Valid        2        40.0

DBP      Frequency  Percent
High         1        20.0
Low          1        20.0
Valid        3        60.0

PROC MEANS DATA=FORMAT9 N MEAN MAXDEC=1;
   VAR SBP DBP;
RUN;

Listing from Example 9

Variable   N     Mean
------------------------
SBP        2    155.0
DBP        3     80.0

------------------------

Example 10 Creating and Using an Enhanced Numeric Informat
Features: PROC FORMAT, INVALUE Statement, Enhanced Numeric Informat

PROC FORMAT;
   INVALUE TEMPER 70-110 = _SAME_
                  "N"    = 98.6
                  OTHER  = .;
RUN;
DATA TEST;
   INPUT TEMP : TEMPER. @@;
DATALINES;
99.7 N 97.9 N N 112.5
;
PROC PRINT DATA=TEST NOOBS;
   TITLE "Temperature Listing";
RUN;

Temperature Listing

TEMP
99.7
98.6
97.9
98.6
98.6
   .

Example 11 Using a SAS Data Set to Create a Character Format
Features: PROC FORMAT, CNTLIN and FMTLIB Options, RETAIN Statement, Creating a Character Format

PROC FORMAT;
   VALUE $ICDFMT "072" = "Mumps"
                 "410" = "Heart Attack"
                 "487" = "Influenza"
                 "493" = "Asthma"
                 "700" = "Corns";
RUN;
DATA CODES;
   INPUT @1 ICD9 $3. @5 DESCRIPT $12.;
DATALINES;
072 MUMPS
410 HEART ATTACK
487 INFLUENZA
493 ASTHMA
700 CORNS
;
DATA CONTROL;
   RETAIN FMTNAME "$ICDFMT" TYPE "C";
   SET CODES (RENAME=(ICD9=START DESCRIPT=LABEL));
RUN;
PROC FORMAT CNTLIN=CONTROL;
RUN;
DATA EXAMPLE;
   INPUT ICD9 $ @@;
   FORMAT ICD9 $ICDFMT.;
DATALINES;
072 493 700 410 072 700
;
PROC PRINT NOOBS DATA=EXAMPLE;
   TITLE "Using a Control Data Set";
   VAR ICD9;
RUN;

Note: Use a RETAIN statement to assign values to the variables FMTNAME and TYPE, since they are the same on every observation. You could use assignment statements instead, but RETAIN is more efficient. After the control data set is created, you merely have to tell PROC FORMAT to use it to create a format. This is done with the CNTLIN option.

Note: One very useful option with PROC FORMAT is the FMTLIB option, which gives you a descriptive listing of your format.

PROC FORMAT CNTLIN=CONTROL FMTLIB;
RUN;

Example 12 Using a SAS Data Set to Create a Numeric Format
Features: PROC FORMAT, CNTLIN and FMTLIB Options, RETAIN Statement, Creating a Numeric Format

DATA COUNTRY;
   RETAIN FMTNAME "COUNTRY" TYPE "N";
   INPUT START 1-2 LABEL $ 3-15;
DATALINES;
01UNITED STATES
02FRANCE
03ENGLAND
04SPAIN
05GERMANY
;
PROC FORMAT CNTLIN=COUNTRY FMTLIB;
RUN;
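Once a numeric format has been built from a control data set, it can be applied like any other format. The sketch below is an illustration under stated assumptions, not part of the original example: it builds a small two-row control data set in the same style as Example 12 and then applies the resulting format with the PUT function. The names DEMOCTRL, CTRYDEMO, CODE, and NAME are hypothetical and chosen so that the sketch does not overwrite the COUNTRY format.

```sas
/* Hypothetical two-row control data set in the style of Example 12. */
/* DEMOCTRL and CTRYDEMO are illustrative names only.                */
DATA DEMOCTRL;
   RETAIN FMTNAME "CTRYDEMO" TYPE "N";
   INPUT START 1-2 LABEL $ 3-15;
DATALINES;
01UNITED STATES
02FRANCE
;
/* Build the numeric format CTRYDEMO. from the control data set */
PROC FORMAT CNTLIN=DEMOCTRL;
RUN;

/* Apply the format with the PUT function and write results to the log. */
/* 9 has no entry in the control data set, so no label is applied to it. */
DATA _NULL_;
   DO CODE = 1, 2, 9;
      NAME = PUT(CODE, CTRYDEMO.);
      PUT CODE= NAME=;
   END;
RUN;
```

If codes outside the control data set should receive a label of their own, one option is to add an explicit entry for them, in the same spirit as the OTHER= assignments in the earlier examples.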
