
What SAS statements would you code to read an external raw data file to a DATA step?

Use these SAS statements:

- FILENAME: specifies the location of the file.
- INFILE: identifies the external file to read with an INPUT statement.
- INPUT: names the variables and describes how their values are laid out in each record.

How do you read in the variables that you need?

Use the INPUT statement with column and line pointer controls, informats, and length specifiers.

Are you familiar with special input delimiters? How are they used?

DLM= (DELIMITER=) and DSD are the special INFILE options for delimited input.

DELIMITER= delimiter(s) specifies an alternate delimiter (other than a blank) to be used for LIST input.

DSD (delimiter-sensitive data) specifies that when data values are enclosed in quotation marks, delimiters within the value are treated as character data. The DSD option changes how SAS treats delimiters when you use LIST input and sets the default delimiter to a comma. When you specify DSD, SAS treats two consecutive delimiters as a missing value and removes quotation marks from character values.

http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146932.htm#a000177189

If reading a variable length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value?

Use the MISSOVER or TRUNCOVER option on the INFILE statement.

MISSOVER prevents an INPUT statement from reading a new input data record if it does not find values in the current input line for all the variables in the statement. When an INPUT statement reaches the end of the current input data record, variables without any values assigned are set to missing.

TRUNCOVER overrides the default behavior of the INPUT statement when an input data record is shorter than the INPUT statement expects. By default, the INPUT statement automatically reads the next input data record. TRUNCOVER enables you to read variable-length records when some records are shorter than the INPUT statement expects. Variables without any values assigned are set to missing.

http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146932.htm#a000177189

What is the difference between an informat and a format? Name three informats or formats.

An informat tells SAS how to read a value; a format tells SAS how to write or display it.

INFORMAT statement: associates informats with variables. It is typically used with an INPUT statement or a PROC SQL CREATE TABLE statement to read raw data from an external file, or data that is not in a SAS format.

http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000178244.htm

Informat examples: COMMAw.d, DATEw., DOLLARw.d, $VARYINGw.

FORMAT statement: associates formats with variables. It is typically used in a DATA step FORMAT statement, a PROC SQL SELECT, or a procedure's FORMAT statement to write SAS data to a file, report, and so on. Formats can look like informats; they differ in the statement in which they are used.
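A minimal sketch of how the INFILE options, informats, and formats discussed above fit together. The data set name work.payments, the variable names, and the sample records are made up for illustration and are not from the original. DSD handles the quoted comma and the empty field, TRUNCOVER keeps SAS from moving to the next record when the last field is absent, the informats on INPUT control how values are read, and the FORMAT statement controls how they are displayed:

```sas
/* Illustrative data only: comma-delimited records, one quoted name
   containing a comma, one empty Amount, one record with no PayDate */
data work.payments;
   infile datalines dsd truncover;
   /* informats: how each raw field is read */
   input Name :$20. Amount :comma10. PayDate :date9.;
   /* formats: how the stored values are displayed */
   format Amount dollar10.2 PayDate mmddyy10.;
datalines;
"Smith, J","1,250.50",01JAN2024
Lee,,15FEB2024
Patel,300
;
run;

proc print data=work.payments;
run;
```

In this sketch the second record should show a missing Amount (two consecutive delimiters) and the third a missing PayDate (short record), rather than values pulled from a following line.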

Format examples: DATEw., WORDDATEw., MMDDYYw.

http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000178212.htm

Name and describe three SAS functions that you have used, if any?

The most common functions that would be used are:

- Conversion functions: Input / Put / int / ceil / floor
- Character functions: Scan / substr / index / Left / trim / compress / cat / catx / upcase / lowcase
- Arithmetic functions: Sum / abs
- Attribute info functions: Attrn / length
- Dataset functions: open / close / exist
- Directory functions: dexist / dopen / dclose / dcreate / dinfo
- File functions: fexist / fopen / filename / fileref
- SQL functions: coalesce / count / sum / mean
- Date functions: date / today / datdif / datepart / datetime / intck / mdy
- Array functions: dim

http://sastechies.com/SASfunctions.php

How would you code the criteria to restrict the output to be produced?

Since it is not clear what the interviewer is referring to, several options apply:

- Global statement: options obs=;
- Dataset option: obs=
- Proc SQL: NOPRINT option for reporting; inobs= and outobs= for SQL select

- Proc datasets: NOLIST option

What is the purpose of the trailing @ and the @@? How would you use them?

Line-hold specifiers keep the pointer on the current input record when

- a data record is read by more than one INPUT statement (trailing @)
- one input line has values for more than one observation (double trailing @)
- a record needs to be reread on the next iteration of the DATA step (double trailing @).

Use a single trailing @ to allow the next INPUT statement to read from the same record. Use a double trailing @ to hold a record for the next INPUT statement across iterations of the DATA step.

Normally, each INPUT statement in a DATA step reads a new data record into the input buffer. When you use a trailing @, the following occurs:

- The pointer position does not change.
- No new record is read into the input buffer.
- The next INPUT statement for the same iteration of the DATA step continues to read the same record rather than a new one.

SAS releases a record held by a trailing @ when

- a null INPUT statement executes:
     input;
- an INPUT statement without a trailing @ executes
- the next iteration of the DATA step begins.

Normally, when you use a double trailing @ (@@), the INPUT statement for the next iteration of the DATA step continues to read the same record. SAS releases the record that is held by a double trailing @

- immediately if the pointer moves past the end of the input record
- immediately if a null INPUT statement executes:
     input;
- when the next iteration of the DATA step begins, if an INPUT statement with a single trailing @ executes later in the DATA step:
     input @;

A record held by the double trailing at sign (@@) is not released until

- the input pointer moves past the end of the record. Then the input pointer moves down to the next record.

     >----+----10---+----20
     102 92 78 103 84 23 36 75

- an INPUT statement without a line-hold specifier executes.

     input ID $4. @@;
        .
        .
     input Department 5.;

The single trailing @

- enables the next INPUT statement to read from the same record
- releases the current record when a subsequent INPUT statement executes without a line-hold specifier.

Unlike the @@, the single @ also releases a record when control returns to the top of the DATA step for the next iteration.

   data perm.sales97;
      infile data97 missover;
      input ID $4. @;
      do Quarter=1 to 4;
         input Sales : comma. @;
         output;
      end;
   run;

Raw Data File Data97

   >----+----10---+----20---+----30---+----40
   0734 1,323.34 2,472.85 3,276.65 5,345.52
   0943 1,908.34 2,560.38
   1009 2,934.12 3,308.41 4,176.18 7,581.81
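The double trailing @ can be sketched in the same way. This example assumes the eight values from the ruler illustration are scores spread over two input lines; the data set name work.scores, the variable name Score, and the split into two lines are assumptions made for illustration:

```sas
data work.scores;
   /* @@ holds the current record across DATA step iterations, so each
      iteration reads the next value from the same line; when the pointer
      moves past the end of the line, SAS moves to the next record */
   input Score @@;
datalines;
102 92 78 103
84 23 36 75
;
run;

proc print data=work.scores;
run;
```

Each value becomes its own observation, so the two input lines yield eight observations.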

Reading a hierarchical file (H = household header record, P = person detail record) with a trailing @:

   data perm.people (drop=type);
      infile census;
      retain Address;
      input type $1. @;
      if type='H' then input @3 Address $15.;
      if type='P';
      input @3 Name $10. @13 Age 3. @15 Gender $1.;
   run;

   data perm.residnts;
      infile census end=last;   /* END= flags the final record */
      retain Address;
      input type $1. @;
      if type='H' then do;
         if _n_ > 1 then output;
         Total=0;
         input Address $ 3-17;
      end;
      else if type='P' then total+1;
      if last then output;      /* write the last household, which has no following H record */
   run;

Raw Data File Census

   >----+----10---+----20
   H 321 S. MAIN ST
   P MARY E 21 F
   P WILLIAM M 23 M
   P SUSAN K 3 F
   H 324 S. MAIN ST
   P THOMAS H 79 M
   P WALTER S 46 M
   P ALICE A 42 F
   P MARYANN A 20 F
   P JOHN S 16 M
   H 325A S. MAIN ST
   P JAMES L 34 M
   P LIZA A 31 F
   H 325B S. MAIN ST
   P MARGO K 27 F
   P WILLIAM R 27 M
   P ROBERT W 1 M

Under what circumstances would you code a SELECT construct instead of IF statements?

The SELECT statement begins a SELECT group. SELECT groups contain WHEN statements that identify SAS statements that are executed when a particular condition is true. Use at least one WHEN statement in a SELECT group. An optional OTHERWISE statement specifies a statement to be executed if no WHEN condition is met. An END statement ends a SELECT group.

Null statements that are used in WHEN statements cause SAS to recognize a condition as true without taking further action. Null statements that are used in OTHERWISE statements prevent SAS from issuing an error message when all WHEN conditions are false.

Using SELECT-WHEN improves processing efficiency and readability in programs that need to check a series of conditions for the same variable. Use IF-THEN/ELSE statements for programs with few statements.

A subsetting IF statement (an IF without a THEN clause) is a different construct: it continues processing only those records that meet the condition specified in the IF clause, so using it by mistake can silently drop records.

http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000201966.htm
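As a hedged illustration of the SELECT group described above, the following sketch checks several conditions on one variable. The data set work.graded, the variables Name, Code, and Level, and the code values are hypothetical:

```sas
data work.graded;
   input Name $ Code;
   length Level $ 12;
   select (Code);
      when (1)    Level = 'Beginner';
      when (2, 3) Level = 'Intermediate';
      when (4)    ;                      /* null statement: recognized, no action */
      otherwise   Level = 'Unknown';     /* runs when no WHEN condition is met */
   end;
datalines;
Ann 1
Bob 3
Cy 4
Dee 9
;
run;
```

Without the OTHERWISE statement, the record with Code 9 would cause an error because no WHEN condition matches it.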

What statement do you code to tell SAS that it is to write to an external file?

FILENAME / FILE / PUT

The FILENAME statement is an optional statement that specifies the location of the external file.

The PUT statement writes the variable values to the external file.

The FILE statement specifies the current output file for PUT statements in the DATA step. When multiple FILE statements are present, the PUT statement builds and writes output lines to the file that was specified in the most recent FILE statement. If no FILE statement was specified, the PUT statement writes to the SAS log. The specified output file must be an external file, not a SAS data library, and it must be a valid access type.

If reading an external file to produce an external file, what is the shortcut to write that record without coding every single variable on the record?

Use _INFILE_ in the PUT statement:

   filename some 'c:\cool.dat';
   filename cool1 'c:\cool1.dat';

   data _null_;
      infile some;
      input;           /* null INPUT loads the record into the input buffer */
      file cool1;
      put _infile_;    /* writes the whole record unchanged */
   run;
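When only some variables should be written, the FILENAME / FILE / PUT combination above can be sketched as follows. The output path and the fileref rpt are hypothetical; sashelp.class is the sample data set shipped with SAS:

```sas
/* hypothetical output location; adjust for your system */
filename rpt 'c:\class_report.txt';

data _null_;
   set sashelp.class;
   file rpt;                         /* PUT now writes to rpt, not the log */
   if _n_ = 1 then put 'Name' @12 'Age' @18 'Height';
   put Name $10. @12 Age 3. @18 Height 6.1;
run;
```

Removing the FILE statement sends the same lines to the SAS log, which is a convenient way to check the layout first.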

If you're not wanting any SAS output from a data step, how would you code the data statement to prevent SAS from producing a set?

   data _null_;

_NULL_ specifies that SAS does not create a data set when it executes the DATA step. DATA _NULL_ is mainly used for:

- Creating quick macro variables with the CALL SYMPUT routine, e.g.

   data _null_;
      set somedata;
      call symput('macvar', dsnvariable);
   run;

- Creating a custom report, e.g.

The second DATA step in this program produces a custom report and uses the _NULL_ keyword to execute the DATA step without creating a SAS data set:

   data sales;
      input dept : $10. jan feb mar;
      datalines;
   shoes 4344 3555 2666
   housewares 3777 4888 7999
   appliances 53111 7122 41333
   ;

   data _null_;
      set sales;
      qtr1tot=jan+feb+mar;
      put 'Total Quarterly Sales: ' qtr1tot dollar12.;
   run;

What is the one statement to set the criteria of data that can be coded in any step?

The WHERE statement sets the criteria for any data set in a DATA step or a PROC step.

Have you ever linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself.

SAS code can be linked using the GO TO or the LINK statement.

GOTO - http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000201949.htm
LINK - http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000201972.htm

The difference between the LINK statement and the GO TO statement is in the action of a subsequent RETURN statement. A RETURN statement after a LINK statement returns execution to the statement that follows LINK. A RETURN statement after a GO TO statement returns execution to the beginning of the DATA step, unless a LINK statement precedes GO TO, in which case execution continues with the first statement after LINK. In addition, a LINK statement is usually used with an explicit RETURN statement, whereas a GO TO statement is often used without a RETURN statement.

When your program executes a group of statements at several points in the program, using the LINK statement simplifies coding and makes program logic easier to follow. If your program executes a group of statements at only one point in the program, using DO-group logic rather than LINK-RETURN logic is simpler.

GO TO e.g.

   data info;
      input x;
      if 1<=x<=5 then go to add;
      put x=;
      add: sumx+x;
      datalines;
   7
   6
   323
   ;

LINK e.g.

   data hydro;
      input type $ depth station $;
      /* link to label calcu: */
      if type ='aluv' then link calcu;
      date=today();
      /* return to top of step */
      return;
      calcu: if station='site_1' then elevatn=6650-depth;
      else if station='site_2' then elevatn=5500-depth;
      /* return to date=today(); */
      return;
      datalines;
   aluv 523 site_1
   uppa 234 site_2
   aluv 666 site_2
   ...more data lines...
   ;

How would you include common or reuse code to be processed along with your statements?

- Using SAS macros.
- Using a %INCLUDE statement.

When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc?

The INDEX function, which searches a character expression for a string of characters.

SAS Statements:

   a='ABC.DEF (X=Y)';
   b='X=Y';
   x=index(a,b);
   put x;

Results: 10

For learning purposes: the INDEXC function searches for the first occurrence of any individual character that is present within the character string, whereas the INDEX function searches for the first occurrence of the character string as a pattern.

   b='have a good day';
   x=indexc(b,'pleasant','very');
   put x;

The INDEXW function searches for strings that are words, whereas the INDEX function searches for patterns as separate words or as parts of other words. INDEXC searches for any characters that are present in the excerpts.

   s='asdf adog dog';
   p='dog ';
   x=indexw(s,p);
   put x;
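For comparison, the three calls above can be run together in one DATA _NULL_ step. This is a sketch that only renames the result variables (x_index, x_indexc, and x_indexw are illustrative names); the values noted in the comments follow from counting character positions in each string:

```sas
data _null_;
   a = 'ABC.DEF (X=Y)';
   x_index  = index(a, 'X=Y');                 /* pattern starts at position 10 */

   b = 'have a good day';
   x_indexc = indexc(b, 'pleasant', 'very');   /* 'a' at position 2 is the first match */

   s = 'asdf adog dog';
   x_indexw = indexw(s, 'dog');                /* whole word 'dog' starts at position 11 */

   put x_index= x_indexc= x_indexw=;
run;
```

INDEX would also report a hit inside "adog" (position 7) for the last string, which is the practical difference between INDEX and INDEXW.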

If you have a data set that contains 100 variables, but you need only five of those, what is the code to force SAS to use only those variables?

Use the KEEP= data set option (on the DATA statement or the SET statement) or the KEEP statement in a DATA step. KEEP= on the SET statement limits the variables that are read in; KEEP= on the DATA statement and the KEEP statement limit the variables that are written out.

e.g.

   data fewdata;
      set fulldata (keep=var1 var2 var3 var4 var5);
   run;

Code a PROC SORT on a data set containing State, District and County as the primary variables, along with several numeric variables.

   proc sort data=Dist_County;
      by state district county;
   run;

How would you delete duplicate observations?

Use the NODUPRECS option in PROC SORT.

   data cricket;
      input id country $9. score;
      cards;
   1 australia 342
   2 somerset 343
   1 australia 342
   2 somerset 341
   ;
   run;

   proc sort data=cricket noduprecs;
      by id;
   run;

In this example observations 1 and 3 are duplicate records, so observation 1 is retained and observation 3 is deleted.

How would you delete observations with duplicate keys?

Use the NODUPKEY option in PROC SORT.

   proc sort data=cricket nodupkey;
      by id;
   run;

In the above example, observations 1 and 3 and observations 2 and 4 have duplicate key (variable id) values, i.e. 1 and 2 respectively, so observations 3 and 4 get deleted.

How would you code a merge that will keep only the observations that have matches from both sets?

   data mergeddata;
      merge one(in=A) two(in=B);
      by ID;
      if A and B;
   run;

How would you code a merge that will write the matches of both to one data set, the non-matches from the left-most data set to a second data set, and the non-matches from the right-most data set to a third?

   data one two three;
      merge DSN1 (in=A) DSN2 (in=B);
      by ID;
      if A and B then output one;
      if A and not B then output two;
      if not A and B then output three;
   run;
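The three-way split above can be tried on a small self-contained example. The data set names, variables, and values here are invented for illustration; both inputs are already in ID order, which MERGE with BY requires:

```sas
data work.left;
   input ID Name $;
datalines;
1 Ann
2 Bob
3 Cy
;
run;

data work.right;
   input ID Score;
datalines;
2 80
3 90
4 70
;
run;

/* IN= variables are 1 when the data set contributed to the current BY group */
data work.both work.left_only work.right_only;
   merge work.left (in=inL) work.right (in=inR);
   by ID;
   if inL and inR then output work.both;        /* IDs 2 and 3 */
   else if inL    then output work.left_only;   /* ID 1 */
   else                output work.right_only;  /* ID 4 */
run;
```

If the inputs are not already sorted by the BY variable, run PROC SORT on each one first.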

What is the Program Data Vector (PDV)? What are its functions?

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

Along with data set variables and computed variables, the PDV contains two automatic variables, _N_ and _ERROR_. The _N_ variable counts the number of times the DATA step begins to iterate. The _ERROR_ variable signals the occurrence of an error caused by the data during execution. The value of _ERROR_ is either 0 (indicating no errors exist), or 1 (indicating that one or more errors have occurred). SAS does not write these variables to the output data set.
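One way to look at the PDV, offered here as a sketch rather than an official technique, is to write its contents to the log with PUT _ALL_ before and after the INPUT statement. The data set name work.demo and the variables x, y, and z are illustrative:

```sas
data work.demo;
   put 'Before INPUT: ' _all_;   /* x, y, z are missing at the top of each iteration */
   input x y;
   z = x + y;
   put 'After  INPUT: ' _all_;   /* values read and computed for this iteration */
datalines;
1 2
3 4
;
run;
```

The log lines include _N_ and _ERROR_ alongside x, y, and z, while the output data set work.demo contains only x, y, and z.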

Does SAS 'Translate' (compile) or does it 'Interpret'? Explain. At compile time when a SAS data set is read, what items are created?

SAS compiles the code sent to the compiler. When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. In this phase, SAS identifies the type and length of each new variable, and determines whether a type conversion is necessary for each subsequent reference to a variable. During the compile phase, SAS creates the following three items:

input buffer
   is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement. Note that this buffer is created only when the DATA step reads raw data. (When the DATA step reads a SAS data set, SAS reads the data directly into the program data vector.)

program data vector (PDV)
   is a logical area in memory where SAS builds a data set, one observation at a time, as described in the previous answer. It holds the data set variables, the computed variables, and the automatic variables _N_ and _ERROR_, which SAS does not write to the output data set.

descriptor information
   is information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes. It contains, for example, the name of the data set and its member type, the date and time that the data set was created, and the number, names and data types (character or numeric) of the variables.

The Execution Phase

By default, a simple DATA step iterates once for each observation that is being created. The flow of action in the Execution Phase of a simple DATA step is described as follows:

1. The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1.
2. SAS sets the newly created program variables to missing in the program data vector (PDV).
3. SAS reads a data record from a raw data file into the input buffer, or it reads an observation from a SAS data set directly into the program data vector. You can use an INPUT, MERGE, SET, MODIFY, or UPDATE statement to read a record.
4. SAS executes any subsequent programming statements for the current record.
5. At the end of the statements, an output, return, and reset occur automatically. SAS writes an observation to the SAS data set, the system automatically returns to the top of the DATA step, and the values of variables created by INPUT and assignment statements are reset to missing in the program data vector. Note that variables that you read with a SET, MERGE, MODIFY, or UPDATE statement are not reset to missing here.
6. SAS counts another iteration, reads the next record or observation, and executes the subsequent programming statements for the current observation.
7. The DATA step terminates when SAS encounters the end-of-file in a SAS data set or a raw data file.
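The automatic reset at the end of each iteration can be observed with a short sketch. The data set work.running and the variables Amount, Plain, and Total are hypothetical; the contrast is between an ordinary assignment, whose target is reset to missing each iteration, and a sum statement, whose target is retained:

```sas
data work.running;
   input Amount;
   Plain = Plain + Amount;   /* Plain is reset to missing each iteration,
                                so missing + Amount stays missing */
   Total + Amount;           /* sum statement: Total is retained and starts at 0,
                                giving 10, 30, 60 */
datalines;
10
20
30
;
run;

proc print data=work.running;
run;
```

The log also carries a note that missing values were generated, which is the usual sign that a variable was expected to carry over between iterations but did not.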

When variables are set to missing, character variables are assigned a blank and numeric variables a period (.).

Name statements that are recognized at compile time only?

drop, keep, rename, label, format, informat, attrib, where, by, retain, length, array

Name statements that are execution only.

INFILE, INPUT, OUTPUT, CALL routines

Identify statements whose placement in the DATA step is critical.

DATA, INPUT, RUN, CARDS, INFILE, WHERE, LABEL, SELECT, INFORMAT, FORMAT

Name statements that function at both compile and execution time.

options, title, footnote

In the flow of DATA step processing, what is the first action in a typical DATA Step?

The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1.

What is _n_?

The _N_ variable counts the number of times the DATA step begins to iterate. It is one of the automatic DATA step (not PROC) variables that SAS provides in the PDV, the other one being _ERROR_. Note that _N_ does not necessarily equal the observation number in a data set.

How do I convert a numeric variable to a character variable?

In practice, the data type of a variable cannot be changed within one DATA step, but the data values can be converted. Create a new variable of type character, assign it the values of the numeric variable with the PUT function, drop the numeric variable, and rename the character variable to the numeric variable's name. Note: if you try to reuse the name directly, you receive a warning saying that the variable has already been defined as numeric.

http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000199354.htm#a000226452

How do I convert a character variable to a numeric variable?

In practice, the data type of a variable cannot be changed within one DATA step, but the data values can be converted. Create a new variable of type numeric, assign it the values of the character variable with the INPUT function, drop the character variable, and rename the numeric variable to the character variable's name. Note: if you try to reuse the name directly, you receive a warning saying that the variable has already been defined as character.

http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000180357.htm
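Both conversions can be sketched in one step. The data sets work.before and work.after and the variables id and amount_txt are hypothetical. This variant renames the original variable on the way in with the RENAME= data set option, rather than renaming the new variable afterwards, so the original name is free to be reused with the other type:

```sas
/* Hypothetical input: id is numeric, amount_txt is character */
data work.before;
   id = 1234;
   amount_txt = '1,250.50';
run;

data work.after (drop=id_num amount_txt);
   /* rename the numeric id as it is read so the name id can be reused */
   set work.before (rename=(id=id_num));
   id     = strip(put(id_num, 8.));        /* numeric -> character via PUT */
   amount = input(amount_txt, comma10.);   /* character -> numeric via INPUT */
run;

proc contents data=work.after;
run;
```

PROC CONTENTS should list id as character and amount as numeric. The format given to PUT and the informat given to INPUT must match the type and layout of the source values.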

Find more @ http://sastechies.blogspot.com/
