Efficient Simulations in SAS
Royal Statistical Society, Wiley are collaborating with JSTOR to digitize, preserve and extend access to
Journal of the Royal Statistical Society. Series D (The Statistician)
This content downloaded from 141.211.4.224 on Fri, 10 Mar 2017 04:52:29 UTC
All use subject to http://about.jstor.org/terms
The Statistician (2003)
52, Part 1, pp. 83-86
Ilya Novikov
Gertner Institute for Epidemiology and Health Policy Research, Tel Hashomer, Israel
Summary. SAS software is often used for statistical simulations. The paper demonstrates a
simple and effective approach for performing simulations in SAS. The method that is usually
used generates data sets and performs the calculations sequentially in time, using the macro
%DO loop to execute each cycle. The more efficient approach is to generate one large data set
that includes all the individual data sets from each cycle as subsets and to perform the
calculations on each subset in one pass, using the BY statement. The paper presents an example
of both methods of simulation, with their SAS code. Both programs give the same numerical
results, but the BY approach is about 80 times faster than the macro %DO approach.
Address for correspondence: Ilya Novikov, Biostatistical Unit, Gertner Institute for Epidemiology
and Health Policy Research, Tel Hashomer, 52621, Israel.
E-mail: ilian@gertner.health.gov.il
An example is the estimation of the coverage level for a normal-based confidence interval
(sample mean plus or minus 1.96 standard errors) for the expectation of a variable with a non-
normal distribution. Suppose that Y is a variable that with probability q equals 0, and with
probability p (= 1 - q) has a log-normal distribution with parameters m and V. Such problems
appear often in medical applications (Rahme et al., 2000).
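As a sketch of the quantities involved, and assuming that m and V denote the mean and variance of log Y on the non-zero part (the text says only "parameters m and V"), the expectation being estimated and the coverage level can be written as follows. The symbols \mu, \bar{Y}, s and n (sample mean, sample standard deviation and sample size) are introduced here for illustration only.

```latex
\[
  \mu = E(Y) = q \cdot 0 + p \, E(Y \mid Y > 0) = p \exp\!\left(m + \tfrac{V}{2}\right),
\]
\[
  \text{coverage} = \Pr\!\left( \bar{Y} - 1.96\,\frac{s}{\sqrt{n}} \le \mu < \bar{Y} + 1.96\,\frac{s}{\sqrt{n}} \right).
\]
```

For normally distributed data the coverage would be close to the nominal 0.95; the simulation estimates how far it falls from that value for the zero-inflated log-normal mixture.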
Appendix A shows two SAS programs that estimate this coverage level, which give exactly the
same numerical results, but, for 5000 samples with 200 subjects per sample, the usual approach
takes 7 min 2.37 s, whereas the recommended approach takes 5.00 s. The programs were run
in SAS 8.12 on an IBM personal computer with a Pentium III 733 MHz processor and 128
Mbytes of random access memory, operating under Windows 98.
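The BY-group idea can be sketched as below. This is not the program from Appendix A: the data set and variable names (sim, stats, cover, covered), the use of the RAND function with CALL STREAMINIT (which assumes a SAS release that provides them), and the parameter values are assumptions made for illustration, and m and V are taken to be the mean and variance of log Y.

```sas
/* Illustrative sketch only; names and parameter values are assumptions. */
%let n    = 200;    /* subjects per sample                          */
%let nsam = 5000;   /* number of simulated samples                  */
%let m    = 1.5;    /* assumed mean of log(Y) on the non-zero part  */
%let v    = 1;      /* assumed variance of log(Y), non-zero part    */
%let p    = 0.7;    /* assumed probability that Y is non-zero       */

/* Step 1: generate all samples at once, indexed by SAMPLE. */
data sim;
   call streaminit(4635209);
   do sample = 1 to &nsam;
      do subject = 1 to &n;
         if rand('uniform') < &p then
            y = exp(&m + sqrt(&v) * rand('normal'));
         else
            y = 0;
         output;
      end;
   end;
run;

/* Step 2: one procedure call computes mean and standard error per sample. */
proc means data=sim noprint;
   by sample;
   var y;
   output out=stats mean=my stderr=sey;
run;

/* Step 3: flag the samples whose interval contains the true mean. */
data cover;
   set stats;
   tm = &p * exp(&m + &v / 2);
   covered = (my - 1.96*sey <= tm) and (my + 1.96*sey > tm);
run;

/* Step 4: the mean of the 0/1 flag is the estimated coverage level. */
proc means data=cover mean;
   var covered;
run;
```

Because the data step writes the observations already ordered by sample, no sort is needed before the BY statement, and each procedure is invoked once rather than once per simulated sample.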
Acknowledgements
Thanks are due to Laurence Freedman (Gertner Institute, Israel) and Phil Gibbs (SAS Institute,
Cary, USA) who encouraged me to write this note.
  proc append base=c force; * ==== adding the last results to summary data set c;
%end; * ===== end of the main loop =====;
data d; file print; * ===== summarizing calculations ===;
  set c end=eof;
  retain coverage 0 nsam;
  * === add 1/nsam for every sample whose interval contains the true mean ===;
  coverage=coverage+((my-1.96*sey<=tm)&(my+1.96*sey>tm))/&nsam;
  if eof then do;
    put 'coverage=' coverage 8.5;
    time=time(); * === fixing the time of the end of the macro approach ==;
    put 'END MACRO PROCESSING:' time=time16.6;
  end;
run;
%mend cover;
%cover(200,5000,1.5,1,0.05,0.7,4635209,3762973);
A.1. SAS output
START MACRO PROCESSING: time=12:45:40.340000
coverage=0.92700
END MACRO PROCESSING: time=12:52:42.710000
START BY PROCESSING: time=12:52:42.770000
coverage=0.92700
END BY PROCESSING: time=12:52:47.770000
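The elapsed times can be checked directly from the timestamps in this output. The arithmetic below is a worked illustration; the symbols t_macro and t_BY are introduced here and do not appear in the paper.

```latex
\begin{align*}
  t_{\text{macro}} &= 12{:}52{:}42.71 - 12{:}45{:}40.34 = 7\ \text{min}\ 2.37\ \text{s} = 422.37\ \text{s},\\
  t_{\text{BY}}    &= 12{:}52{:}47.77 - 12{:}52{:}42.77 = 5.00\ \text{s},\\
  \frac{t_{\text{macro}}}{t_{\text{BY}}} &= \frac{422.37}{5.00} \approx 84.5 .
\end{align*}
```

The ratio agrees with the 80-fold speed-up quoted in the summary, which is evidently a rounded figure.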
References
Ambrosius, W. T. and Hui, S. L. (2000) A quality control measure for longitudinal studies with c
outcomes. Statist. Med., 19, 1339-1362.
Rahme, E., Joseph, L., Kong, S. X., Watson, D. J. and LeLourier, J. (2000) Gastrointestinal health car
use and costs associated with non-steroidal anti-inflammatory drugs versus acetaminophen. Arth. Rh
917-924.
SAS Institute (1999) SAS Language Reference, Version 8. Cary: SAS Institute.