
Biostatistics 100A: Laboratory Two Spring 2021

Computer Exercise and Competency Assessment

I. Computer Exercise

OBJECTIVE. To develop analytical skills involving distributions and measures of
central tendency and spread. A data set is to be entered into STATA and manipulated.

DATA. 1) Create a data file for summarizing and graphing the following sample of 31
hemoglobin levels (g/cc) for children with acyanotic and cyanotic congenital heart
disease:

Acyanotic (n=19): 13.1, 14.0, 13.0, 14.2, 11.0, 12.2, 13.1, 11.6, 14.2,
20.5, 13.4, 13.5, 11.6, 12.1, 13.5, 13.0, 14.1, 14.7,
12.8

Cyanotic (n=12): 13.0, 16.8, 17.6, 14.8, 13.9, 14.6, 13.0, 17.5, 14.8,
15.1, 19.3, 5.4

ANALYSIS. Turn in the following:

1. Printout with summary statistics by disease subgroup for hemoglobin levels. Find
medians, means, ranges, interquartile ranges and standard deviations for
hemoglobin levels by disease types.

2. Boxplots and histograms by disease subgroups for the variable hemoglobin. Does
the distribution of hemoglobin appear to differ between acyanotic and cyanotic
children with congenital heart disease?

3. Based on summary measures and plots, your opinion as to the appropriateness of
comparing values of central tendency in hemoglobin levels between acyanotic and
cyanotic children.

4. Printout with values of the geometric mean by disease subgroup for hemoglobin
levels. Verify the relationship that the arithmetic mean ≥ geometric mean within
each disease subgroup. Among the three measures of central tendency (arithmetic
mean, geometric mean, and median), which best represents the central location of
the data for children with acyanotic congenital heart disease? With cyanotic heart
disease? Why?
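
For reference, the relationship in item 4 can be written out as follows. The notation is an assumption made here, not taken from the handout: x_1, ..., x_n are the positive hemoglobin values in one disease subgroup, \bar{x} is their arithmetic mean, and G is their geometric mean.

```latex
\[
  \bar{x} \;=\; \frac{1}{n}\sum_{i=1}^{n} x_i
  \;\;\ge\;\;
  G \;=\; \Bigl(\prod_{i=1}^{n} x_i\Bigr)^{1/n}
  \;=\; \exp\!\Bigl(\frac{1}{n}\sum_{i=1}^{n} \ln x_i\Bigr),
  \qquad x_i > 0,
\]
% Equality holds only when every x_i takes the same value.
```

The last form is the one used in the STATA procedure: average the natural logarithms, then exponentiate the result.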


STATA and PROGRAMMING PROCEDURES


· In STATA (the dot prompt), type the following commands:
COMMAND PURPOSE

1. log using lab2.log begins recording commands and results.


2. input dise hemo inputs data set. At prompt, enter a child’s disease
group then, separated by a blank, his/her hemoglobin
level. Code acyanotic children as “1” and cyanotic
children as “2”. For example,
dise hemo
1. 1 13.1
2. 1 14.0
...
19. 1 12.8
20. 2 13.0
...
31. 2 5.4
32. end
Note the key word “end” that instructs STATA to stop
input-mode. Check and correct data entry with
STATA’s list and replace commands. Type help
list or help replace for more information.
3. sort dise sorts data by disease type.
4. by dise: sum hemo, detail produces summary statistics by disease subgroups.
5. graph7 hemo, box by(dise) ylab total requests boxplots of hemoglobin levels for each
disease type.
6. Using mouse, place cursor over graph in window and right click mouse / Click Copy
copies graph onto clipboard. Next, open Microsoft
Word (or another word processor) and use paste to
insert clipboard graph into Word document. Save
document as lab2graphs.doc.
7. graph7 hemo, by(dise) freq xlab requests histograms of hemoglobin levels for each
disease type.
8. Using mouse, place cursor over graph in window and right click mouse / Click Copy
copies graph onto clipboard. Next, open Microsoft
Word (or another word processor) and use paste to
insert clipboard graph into Word document. Save
document as lab2graphs.doc.
9. gen lhemo=ln(hemo) generates a new variable equal to the natural
logarithm of hemoglobin levels.
10. by dise: sum lhemo produces summary statistics by disease group on log
hemoglobin levels.
11. display exp( put your value of acyanotic log mean here ) , exp( cyanotic log mean here )
displays the exponential values of the logarithm
means (i.e. geometric means)
12. sort dise hemo sorts data by 1) disease type and 2) hemoglobin
values.


13. log off stops sending commands and results to the ASCII
file 'lab2.log'.
14. exit, clear exits STATA.
15. In Windows, save lab2.log and lab2graphs.doc copy to a removable memory device or email files for
analysis.
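
As an optional cross-check on the STATA log, the same quantities can be computed outside STATA. The following is a minimal Python sketch, not part of the assigned procedure; the names hemo, summarize, and summaries are hypothetical and chosen only for illustration. It assumes Python 3.8 or later. Quartile conventions differ between packages, so the interquartile range may not match the output of STATA's sum, detail exactly.

```python
import math
import statistics

# Hemoglobin levels (g/cc) copied from the DATA section of this handout.
hemo = {
    "acyanotic": [13.1, 14.0, 13.0, 14.2, 11.0, 12.2, 13.1, 11.6, 14.2,
                  20.5, 13.4, 13.5, 11.6, 12.1, 13.5, 13.0, 14.1, 14.7,
                  12.8],
    "cyanotic": [13.0, 16.8, 17.6, 14.8, 13.9, 14.6, 13.0, 17.5, 14.8,
                 15.1, 19.3, 5.4],
}


def summarize(values):
    """Return descriptive statistics for one disease subgroup."""
    # Python's default ("exclusive") quartile rule; STATA may differ slightly.
    q1, _, q3 = statistics.quantiles(values, n=4)
    # Geometric mean: exponentiate the mean of the natural logarithms,
    # mirroring steps 9 through 11 of the procedure.
    log_mean = statistics.fmean([math.log(v) for v in values])
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
        "geometric_mean": math.exp(log_mean),
    }


summaries = {group: summarize(values) for group, values in hemo.items()}

for group, stats in summaries.items():
    print(group)
    for name, value in stats.items():
        print(f"  {name:>15}: {value:.3f}")
```

Comparing the printed mean and geometric_mean within each subgroup gives a quick check of the inequality in analysis item 4.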


II. Competency Assessment

Distinguish among the different measurement scales and the implications for
statistical descriptive methods to be used based on these distinctions.

To successfully complete this assignment, answer the following review question from
Triola (2006):

The clinical trial of a polio vaccine involved 200,000 children treated with the Salk
vaccine and 200,000 other children given a placebo. Only 33 of the children in the
Salk treatment group later developed polio, but 115 of the children in the placebo
group later developed polio.
a. Are the given values discrete or continuous?
b. Identify the level of measurement (nominal, ordinal, interval, ratio) for the
numbers 33 and 115.
c. If 50 children from the treatment group are randomly selected and their
average (mean) age is calculated, is the result a statistic or a parameter?
d. Identify the nature (observational/experimental) and timing of the study
(retrospective/prospective). What impression do you have of the relationship
between vaccine and polio (causative/associative)?

Due date: Both the computer exercise and the competency assessment are due one
week after your lab section.
