Short Notes in Medical Statistics
For Medical Examinations
Table of contents
Part 1: Study Design
- Observational and Experimental Studies
- Case Series
- Cross-sectional Studies
- Case-control studies
- Cohort Studies
- Ecological Studies
- Interventional studies: Randomized controlled trials (RCTs)
- Meta-analysis
- Bias
- Confounding effect
- Reliability and Validity
Part 2: Diagnostic Tests
- Incidence and prevalence
- Sensitivity and specificity
- Positive and Negative Predictive Values
- Likelihood ratios
- Receiver Operating Characteristic (ROC curve)
Part 3: Risk Estimation
- Relative Risk and Odds Ratio
- Attributable Risk (AR)
- Absolute Risk Reduction (ARR)
- Relative Risk Reduction (RRR)
- Number Needed to Treat, and Number Needed to Harm
Part 4: Descriptive statistics
- Types of data variables
- Descriptive statistics for categorical variables
- Descriptive statistics for numerical variables
- The normal distribution
- Skewed distribution
Part 5: Hypothesis Testing and statistical tests
- Null and alternative hypotheses
- Type I and type II error
- Power, statistical significance and sample size
- P-value
- Clinical Significance and Statistical Significance
- Confidence interval
- Statistical tests
- Correlation
- Regression
- Survival analysis
Dr. Mohamed Elsherif, MBBCh, MPH

www.stats4drs.com , contact@stats4drs.com
WhatsApp: +2 01029418284 or +965 99130772
The book “Applied Medical statistics for beginners” is available here:
https://leanpub.com/Applied_Medical_Statistics
Part 1: Study Design
Observational and Experimental Studies
Generally, research studies are classified as either observational or interventional
(experimental) studies.
Observational (Non-experimental) studies
The researchers just observe, measure, or collect data without intervening with
the study objects.
- Case series
- Cross-sectional studies
- Case-control studies
- Cohort studies
Experimental studies
The researchers apply an intervention that may be a new drug, a surgical
technique, an educational program, or any other intervention.
- Randomized controlled trials
The most important types of observational studies are:
Case Series
• This type of study includes a small number of patients with a specific disease or
condition.
• It is not a true study and does not aim to draw general conclusions.
• It is just a description of the characteristics and outcomes of this group.
Cross-sectional Studies
• In a cross-sectional study, a sample is chosen and data on each individual is
collected at one point in time. It is usually looked at as a snapshot.
• It can be used for descriptive purposes (as prevalence), or analytical purposes
(association between exposure and outcome).
• Cross-sectional studies can be used to estimate the prevalence of a disease in
the population (current cases) but not the incidence (new cases).
• Surveys are common examples of cross-sectional studies.
Example
• Research Question: Is there an association between obesity and depression among
Cairo University students?
• A questionnaire is administered to assess obesity (based on body mass index) and
depression (based on a specific scale), PHQ-9 scale for example.
Advantages
• Quick, easy, and inexpensive (No waiting for the occurrence of outcome).
• No loss to follow up (there is no follow up).
• It helps in hypothesis generation (identifying possible risk factors).
• Several exposures and/or outcomes can be studied.
• Surveys with validated questions allow comparison between different studies
using the same tool.
Disadvantages
• Cannot determine causality (No time sequence between exposure and
outcome, and we cannot establish cause and effect).
• Not suitable for rare outcomes or diseases.
• High refusal or non-response can cause bias.
• Cannot provide measures of incidence or relative risk.
Case-control studies
• In case-control studies, a group of patients (cases) is compared to a similar
group of individuals who are free of the disease (controls) regarding the past
exposure to a suspected agent or risk factor.
• We start with cases and controls, then we look for the past exposure
(retrospective).
Example
• Research Question: Is there an association between obesity and bipolar disorder
among Cairo University students?
A Case-Control study design:
• Cases: Students from Cairo University who are diagnosed with bipolar disorder.
• Controls: Students from Cairo University who don’t have bipolar disorder.
• Exposure: Self-report of obesity during high school.
Advantages
• Useful with rare diseases or outcomes.
• Inexpensive and efficient (may be the only feasible option).
• Establishes measures of association (Odds ratios).
• Useful for generating hypotheses (multiple risk factors can be explored).
Disadvantages
• Selection bias (when choosing inappropriate controls).
• Recall bias (the study is retrospective, and participants may not report their
exposure accurately, especially the controls).
• Cannot estimate incidence or prevalence.

Cohort Studies
• Cohort studies evaluate a possible association between exposure and
outcome by following two groups of individuals (exposed and unexposed) over
a period of time (often years) to see whether they develop the disease or
outcome of interest.
• The rates of disease incidence among the exposed and unexposed groups are
determined and compared.
• Subjects should not have the outcome
variable (should be disease-free) on entry to
the study and should have the potential to
develop the outcome.
• Cohort studies may be prospective or
retrospective, but both types define the cohorts based on the exposure status,
not the outcome.
• In retrospective cohort studies: exposure and outcome have already occurred
at the time of the study (current time). Pre-existing data, such as medical
records or employee files, can be used to assess exposure status in the past.
Example
To study if obesity is a risk factor for depression among Cairo University students:
• Selection of a sample from the population (Cairo University students).
• Measuring the exposure (obesity) variable in the sample by assessing BMI.
• Ensuring that the outcome is not present (none of the participants have
depression).
• Follow up the different exposure groups (obese and non-obese) for a specific
period of time.
• Measure the occurrence of the outcome variable (depression) in each group
during the study period.
Advantages
• Valuable in studying rare exposures.
• Stronger evidence for causation as compared to cross-sectional and case-
control studies.
• Measures the incidence of a disease or an outcome.
• Can study multiple outcomes for a single exposure.
• Relative risk (RR) can be used as a measure of association (it is more
straightforward and easier to understand than the odds ratio).
Disadvantages
• Expensive and a large number of subjects is usually needed.
• Often needs a long follow-up period.
• Loss to follow-up can affect the validity of results.
• Retrospective cohort studies need complete and accurate records.
Ecological Studies
• They are observational studies in which the outcome of interest is the rate of an
outcome within a population. The unit of analysis is a population rather than an
individual and associations are studied across different populations.
• Ecological studies may be conducted from published statistics regarding rates
of an outcome derived from other studies.
Interventional studies: Randomized controlled trials (RCTs)
• The most important type of interventional study is the randomized
controlled trial (RCT).
• Randomized controlled trials (RCTs) are considered the gold standard of
individual research studies.
• In an RCT, subjects are randomly allocated to different treatment options.
• The comparison is done against an active agent or with an inert substance
(placebo).
Example
• Randomized controlled trial for efficacy of weight reduction in controlling
depression in obese patients.
[Figure: the population is sampled; the sample is randomized to Treatment A (new
treatment) or Treatment B (standard of care or placebo); each group is followed from
the present into the future for outcome +ve or outcome -ve.]
• A sample of the population is selected (obese patients having depression).
• Randomized to weight reduction plus standard treatment (intervention group)
versus standard treatment for depression (control group).
• Outcome: change in depression score after 2, 4 and 6 months of treatment.
Blinding (Masking)
• It refers to withholding information about the treatment allocation from patients
and/or treating physicians/investigators.
• The goal is to reduce bias due to the subjectivity in reporting (by patient), and
evaluation (by physician).
Types of blinding
• Open label (no blinding): the patient and physicians know which
treatment/intervention the patient is receiving.
• Single blinded: the patient does not know what drug he/she is receiving.
• Double-blinded: both the patient and the investigator do not know which
patient is receiving which treatment.
Randomization
• It refers to the random assignment of patients to the different treatment groups
in the study.
• It removes bias due to subjectivity of assignment of patients.
• It produces balanced groups, i.e., both the measured and the unmeasured (unknown)
characteristics of the participants at the time of randomization will be, on
average, evenly balanced between the intervention and control group.
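The random assignment described above can be sketched in a few lines of Python. This is
only a minimal illustration of simple randomization: the function name `randomize`, the
participant IDs, and the fixed seed are assumptions made for the example, and real trials
usually rely on dedicated allocation procedures (for example block or stratified
randomization).

```python
import random


def randomize(participant_ids, seed=None):
    """Randomly split participants into two groups of (nearly) equal size.

    Hypothetical helper for illustration; not a validated allocation system.
    """
    rng = random.Random(seed)          # seeded generator so the split is reproducible
    shuffled = list(participant_ids)   # copy, so the caller's sequence is untouched
    rng.shuffle(shuffled)              # random order removes investigator choice
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


# 20 hypothetical participant IDs
groups = randomize(range(1, 21), seed=42)
print(len(groups["intervention"]), len(groups["control"]))
```

Because allocation depends only on the random shuffle, neither the investigator nor the
patient can influence which group a participant ends up in.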
Advantages of RCTs
• It is the gold standard as it provides stronger evidence than observational studies.
• Randomization controls for unmeasured confounding variables.
• Can provide comparison between the new drug and the current standard
treatment (or a placebo).
Disadvantages of RCTs
• Non-compliance with treatments or treatment crossover can occur.
• Loss to follow up can occur.
• Difficult to study rare events and a long follow up period might be needed.
• Can be very costly.
• Sometimes, an RCT cannot be conducted for practical or ethical reasons.
• There are some concerns with external validity of results (generalizability), as
participants eligible for trials may not be representative of all patients with the
condition of interest.
Common designs of RCTs
• Parallel design where patients are randomized to one of the two groups of
treatments, A and B, and each patient receives only one type of treatment.
• Crossover design where participants switch groups in the study. For example,
patients that started with treatment A switch to getting treatment B, while those
that started with treatment B get treatment A after the halfway point. In this study
design participants act as their own controls. The main disadvantage is the carryover
effect, which may affect the direct intervention effect (if the effect of the drug in
the first period affects that in the second period).
[Figure: crossover design. After randomization of the sample, one group receives
Treatment A and then Treatment B, the other receives Treatment B and then Treatment A,
with a washout period between the two treatment periods and a result recorded after each.]
Intention to treat analysis and per-protocol analysis
• Intention-to-Treat Analysis (ITT): participants are analyzed according to the
group they were originally allocated to during randomization (i.e. using data
from all patients, including those who did not complete the study or changed
treatment). ITT analysis is preferred as it provides an unbiased comparison of
the treatments.
• Per-Protocol Analysis (PP): Only participants who strictly adhered to the study
protocol are analyzed. All others – such as participants who moved, who did not
complete the intervention originally allocated, or who may have taken a
concomitant medication they should not have – are excluded from the
analysis.
Meta-analysis
• This type is a summary study of multiple studies.
• Meta-analysis is a quantitative method for systematically combining results of
previous research to arrive at conclusions about the body of research.
• Meta-analysis comes after a systematic review, which involves the systematic
identification, appraisal and abstraction of data from all identified literature
relating to a well-defined specific research question.
• Statistical methods are used to combine the results of each independent study
to provide a summary statistic of overall results as well as graphical
representation of included studies (forest plot).
Bias
• It is a systematic error leading to an incorrect estimate of the true association
between exposure and outcome.
Selection bias: a systematic error in the recruitment or retention of study participants:
• Berkson’s bias can occur in a case-control study using hospitalized controls, as
they may not be a representative sample of the population.
• Non-response bias occurs when participants differ from non-participants in a
study (volunteers for a study might differ from the population).
• Attrition bias occurs when patients who are lost to follow-up differ in a
systematic way from those who continued the study (they might be older or
sicker than those who continued the study).
Information bias: a systematic error in the way information is collected:
• Recall bias occurs when individuals with disease may be more likely to
incorrectly recall/believe they were exposed to a possible risk factor than those
who are free of disease.
• Observer bias occurs when knowledge of exposure status (e.g. race, gender)
biases the observer towards a diagnosis; this occurs more commonly with
subjective diagnoses like those found in psychiatry.
• Measurement bias occurs when information is recorded in a distorted manner
(e.g. an inaccurate measurement tool).
Publication bias is important when considering systematic reviews and meta-
analysis. This occurs when some studies are less likely to be published (usually those
showing no statistically significant results).
Confounding effect
• A confounder is a variable that is related to both the exposure and outcome
and distorts the estimated effect of an exposure if not accounted for in the
study design/analysis.
• It appears that there is a relationship between
the exposure and outcome based on the
results, but in fact, there is no relationship.
Some factors other than what is being studied
are distorting the results. For example, there
might be initially an association between
coffee drinking and lung cancer if smoking is
ignored. But, if we control for smoking, this association will disappear.
• Minimizing the confounding effect can be done through randomization,
stratification, matching, and regression.
Reliability and Validity
When a new measurement tool (a scale or a questionnaire) is designed, we are
concerned about its validity and reliability.
Validity/Accuracy (of a measurement tool): how well does an instrument (or a scale)
measure what it is intended to measure? (e.g., is a questionnaire designed to assess
anxiety actually assessing anxiety correctly?) Its types include:
• Face validity: Does the scale appear to measure what it sets out to measure?
• Content validity: Does the scale cover all the relevant areas?
Reliability/Precision: how reproducible are the findings if we repeat the measurement
on the same subject?
Its types include:
• Test-retest reliability: if a subject takes an exam/survey twice, how stable are
their responses?
• Between-observers reliability: is there an agreement between different
observers in assessing the same individuals?
Part 2: Diagnostic Tests
Incidence and prevalence
• Incidence is the occurrence of new cases of a specific disease, complication,
injury or health condition in a specific period of time.
• Incidence: is the proportion of a population that is disease-free at the beginning
who develops the disease during a specific period of time (new cases/total).
• It ranges from 0 to 1 (in percentages: 0% to 100%).
Incidence = (number of new cases during a specific period of time) /
(size of population who are disease free at the beginning of this time period)
Over one year, if 10 women are diagnosed with breast cancer, out of the total female
study population of 1000 (who do not have breast cancer at the beginning of the study
period), then the incidence of breast cancer in this population is 10/1000= 0.01, or 1%.
• Prevalence: is the percentage of people in a population who have a disease or
other health condition at one point in time (all cases/total).
Prevalence = (number of all cases at a specific point in time) /
(size of the population at that time point)
If a survey included 1150 university students, of whom 170 reported daily smoking, the
prevalence of smoking is: Prevalence = (170/1150) × 100 = 0.148 × 100 = 14.8%.
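The two worked examples above can be reproduced with a short Python sketch. The function
names are assumptions chosen for readability; the numbers are the ones used in the text.

```python
def incidence(new_cases, disease_free_at_start):
    """New cases during the period / population that was disease-free at its start."""
    return new_cases / disease_free_at_start


def prevalence(all_cases, population):
    """All existing cases at one point in time / population at that time point."""
    return all_cases / population


# Breast cancer example: 10 new cases among 1000 disease-free women in one year
breast_cancer_incidence = incidence(10, 1000)

# Smoking example: 170 daily smokers among 1150 surveyed students
smoking_prevalence = prevalence(170, 1150)

print(f"Incidence:  {breast_cancer_incidence:.1%}")
print(f"Prevalence: {smoking_prevalence:.1%}")
```

The only difference between the two functions is what goes into the numerator and the
denominator: new cases over those initially disease-free, versus all cases over everyone.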
Sensitivity and specificity
• If a group of researchers comes up with a new diagnostic test (e.g., a blood test
that may be cheaper, easier, or less invasive) to diagnose a certain disease (e.g.,
cancer), they will have to run an experiment to see how good this new diagnostic
test is.
• We need to compare this new diagnostic test to the gold standard test that
provides a definitive diagnosis of the condition (for cancer, this may be a
histopathology exam).
• So, we apply this new test and the gold standard test (true diagnosis) to a group
of individuals who might have the disease or not.
• Based on the results of the two tests, we will come up with 4 groups:
A. True positive: Positive for the blood test and positive for the histopathology
B. False positive: Positive for the blood test and negative for the histopathology
C. False negative: Negative for the blood test and positive for the histopathology
D. True negative: Negative for the blood test and negative for the histopathology
Those results are presented in a table as follows:
                                   Based on the gold standard test
                                   Disease present        Disease absent        Total
The new diagnostic  Test positive  True positive (a)      False positive (b)    Total test positive (a+b)
test                Test negative  False negative (c)     True negative (d)     Total test negative (c+d)
                    Total          Total diseased (a+c)   Total normal (b+d)    Total population (a+b+c+d)
We use sensitivity and specificity to describe the accuracy of a diagnostic test.
Example
If 1000 individuals underwent both tests, and the results are summarized as
follows:
                                   Based on the gold standard test
                                   Disease present          Disease absent
The new diagnostic  Test positive  180 True positive (a)    80 False positive (b)
test                Test negative  20 False negative (c)    720 True negative (d)
• Sensitivity
Sensitivity is the percentage of true positives, i.e., the proportion of those who have the
disease who are correctly identified by the test as positive.
In other words: the probability that a test result will be positive when the disease is
present.
Sensitivity = a / (a + c)
= number of true positives (a) / [number of true positives (a) + number of false negatives (c)]
Sensitivity = 180 / (180 + 20) = 0.9 = 90%
This 90% sensitivity means that if we are sure that 100 patients have the disease
(based on the gold standard test), the new diagnostic test will be positive in 90 cases.
90% of people who have the disease will test positive.
• Specificity
Specificity is the percentage of true negatives, i.e., the proportion of those who don’t
have the disease who are correctly identified by the test as negative.
In other words: the probability that a test result will be negative when the disease is
absent.
Specificity = d / (b + d)
= number of true negatives (d) / [number of false positives (b) + number of true negatives (d)]
Specificity = 720 / (80 + 720) = 0.9 = 90%
This 90% specificity means that if we are sure that 100 individuals don’t have the
disease (based on the gold standard test), the new diagnostic test will be negative in
90 cases.
90% of people who do not have the disease will test negative.
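As a quick check of the two calculations above, the following Python sketch computes
sensitivity and specificity from the four cells of the 2×2 table. The function names are
illustrative assumptions; the cell counts (a = 180, b = 80, c = 20, d = 720) come from the
example.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of diseased individuals who test positive: a / (a + c)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of non-diseased individuals who test negative: d / (b + d)."""
    return true_neg / (true_neg + false_pos)


# Cells of the 2x2 table from the worked example
a, b, c, d = 180, 80, 20, 720

sens = sensitivity(a, c)
spec = specificity(d, b)
print(f"Sensitivity = {sens:.0%}, Specificity = {spec:.0%}")
```

Note that each measure uses only one column of the table: sensitivity looks only at the
diseased, specificity only at the non-diseased.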
Higher sensitivity or higher specificity
• To rule out a disease, we want to be sure that a negative result is really negative
(no disease); therefore, few false negatives should occur. High sensitivity
helps rule out the disease if the test is negative. If we use SN for sensitivity, we
use a highly sensitive test to rule OUT (SNOUT).
• To confirm a disease (rule in), we want a positive result to indicate a high
probability that the patient has the disease (a positive test result should really
indicate disease). Therefore, we want few false positives. High specificity helps
rule in the disease if the test is positive. If we use SP for specificity, we use a
highly specific test to rule IN (SPIN).
Positive and Negative Predictive Values
Sensitivity and specificity are characteristics of the test. But the physician and the
patient may have a different question: what is the chance that a person with a positive
test truly has the disease? Two other measures answer this question:
• Positive predictive value is the probability that when having a positive test
result, that individual will truly have that specific disease. It is the proportion of
people with a positive test who have the disease.
Positive Predictive Value (PPV) = a / (a + b)
= number of true positives (a) / [number of true positives (a) + number of false positives (b)]
PPV = 180 / (180 + 80) = 0.69 = 69%
For those who test positive, 69% have the disease.
• Negative predictive value is the probability that when having a negative test
result, that individual will truly be free of the disease. It is the proportion of people
with a negative test who are free of disease.
Negative Predictive Value (NPV) = d / (c + d)
= number of true negatives (d) / [number of false negatives (c) + number of true negatives (d)]
NPV = 720 / (20 + 720) = 0.97 = 97%
For those who test negative, 97% do not have the disease.
For simplicity, remember:
Sensitivity = true positive / diseased
Specificity = true negative / non-diseased
PPV = true positive / testing positive
NPV = true negative / testing negative
Prevalence = all diseased / total population
Sensitivity and specificity are characteristics of the test and are not affected by the
disease prevalence, while PPV and NPV are affected by the disease prevalence.
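To see how the predictive values move with prevalence while sensitivity and specificity
stay fixed, here is a minimal Python sketch. It rebuilds the 2×2 proportions from
sensitivity, specificity, and prevalence; the function name is an assumption, the 20%
prevalence matches the worked example (200 diseased out of 1000), and the 1% prevalence is
a hypothetical comparison value that does not appear in the text.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence                # diseased, test positive
    false_neg = (1 - sensitivity) * prevalence         # diseased, test negative
    true_neg = specificity * (1 - prevalence)          # healthy, test negative
    false_pos = (1 - specificity) * (1 - prevalence)   # healthy, test positive
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same test (90% sensitivity, 90% specificity) at two prevalences
ppv_20, npv_20 = predictive_values(0.9, 0.9, 0.20)  # prevalence of the worked example
ppv_1, npv_1 = predictive_values(0.9, 0.9, 0.01)    # hypothetical rare disease

print(f"Prevalence 20%: PPV = {ppv_20:.0%}, NPV = {npv_20:.0%}")
print(f"Prevalence  1%: PPV = {ppv_1:.0%}, NPV = {npv_1:.1%}")
```

With the same test, lowering the prevalence lowers the PPV and raises the NPV, which is
why a positive screening result in a low-prevalence population often needs confirmation.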
Likelihood ratios
Likelihood ratio is another measure of the performance of a diagnostic test.
• The likelihood ratio of a positive test result: ratio between the probability of a
positive test result in the presence of the disease and the probability of a
positive test result in the absence of the disease.
Positive likelihood ratio: LR+ = Sensitivity / (1 – Specificity)
• The likelihood ratio of a negative test result: ratio between the probability of a
negative test result in the presence of the disease and the probability of a
negative test result in the absence of the disease.
Negative likelihood ratio: LR– = (1 – Sensitivity) / Specificity
High likelihood ratios for positive test results and low likelihood ratios for negative test
results indicate a more useful diagnostic test.
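Applying the two formulas to a test with 90% sensitivity and 90% specificity (the values
from the diagnostic test example) gives the following. This is a small Python sketch; the
function names are illustrative assumptions.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = Sensitivity / (1 - Specificity)."""
    return sensitivity / (1 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - Sensitivity) / Specificity."""
    return (1 - sensitivity) / specificity


lr_pos = positive_likelihood_ratio(0.9, 0.9)
lr_neg = negative_likelihood_ratio(0.9, 0.9)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```

Here a positive result is about 9 times more likely in a diseased than in a non-diseased
person, while a negative result is only about one ninth as likely.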
Receiver Operating Characteristic (ROC curve)
• When developing a diagnostic or screening test, sometimes we are concerned
about choosing the appropriate cut-off value for a numerical measurement
(e.g. the serum level of troponin-T that indicates myocardial infarction).
• The Receiver Operating Characteristic (ROC) curve is used to present the
sensitivity and specificity for all possible cut-off points.
• In the ROC curve: (Sensitivity) is plotted
on Y-axis, while (1-Specificity) is plotted
on X-axis, and each point on the curve
represents a sensitivity/specificity pair
corresponding to a particular possible
cut-off point.
• Sensitivity and specificity are inversely
related; if we change the cut-off for
higher sensitivity, this will reduce the
specificity.
• We can choose the optimal cut-off point depending on the implications of false
positive and false negative results, and the prevalence of the condition.
• When screening for a deadly disease that is curable, it may be desirable to
accept more false positives (lower specificity) in return for fewer false negatives
(higher sensitivity).
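The points of an ROC curve can be computed by trying each observed value as a cut-off.
The sketch below uses plain Python and invented data: the eight measurements, the disease
labels, and the rule "test positive if the value is at or above the cut-off" are all
assumptions made for illustration.

```python
def roc_points(values, diseased):
    """Return (cutoff, sensitivity, specificity) for every observed cut-off.

    Assumption: a measurement at or above the cut-off counts as test positive.
    """
    points = []
    for cutoff in sorted(set(values)):
        tp = fp = tn = fn = 0
        for value, has_disease in zip(values, diseased):
            positive = value >= cutoff
            if has_disease and positive:
                tp += 1
            elif has_disease:
                fn += 1
            elif positive:
                fp += 1
            else:
                tn += 1
        points.append((cutoff, tp / (tp + fn), tn / (tn + fp)))
    return points


# Hypothetical biomarker levels and true disease status (gold standard)
values = [5, 8, 12, 15, 20, 22, 30, 35]
diseased = [False, False, False, True, False, True, True, True]

for cutoff, sens, spec in roc_points(values, diseased):
    # The ROC curve plots sensitivity (Y-axis) against 1 - specificity (X-axis)
    print(f"cut-off {cutoff:>2}: sensitivity = {sens:.2f}, 1 - specificity = {1 - spec:.2f}")
```

Raising the cut-off in this example never increases sensitivity and never decreases
specificity, which is the trade-off described above.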
Part 3: Risk Estimation
Relative Risk and Odds Ratio
Risk Ratio (Relative Risk, RR) and Odds Ratio (OR) are different measures of association.
We need to know the difference between them.
Example
A cohort study followed 800 individuals (400 smokers and 400 non-smokers) for a 5-year
period for the occurrence of coronary heart disease.
The result is presented in the following table:
               Diseased     Not Diseased    Total
Smokers        40 (a)       360 (b)         400 (a+b)
Non-smokers    20 (c)       380 (d)         400 (c+d)
Total          60 (a+c)     740 (b+d)       800 (a+b+c+d)
• Relative risk (RR) is the risk (incidence) of having the disease among the
exposed divided by the risk of having the disease among the non-exposed.
Risk (incidence) is calculated by dividing the number who developed the disease by
the total sample (part/total).
Risk (incidence) = Number who have the disease / Total group
Relative risk (Risk ratio) calculation:
RR = Incidence among exposed / Incidence among non-exposed
RR = [a / (a + b)] / [c / (c + d)]
RR = (40/400) / (20/400) = 2
• Odds ratio (OR) is the odds of having the disease among the exposed divided
by the odds of having the disease among the non-exposed.
The odds are calculated by dividing the number who have the disease by the number
who don’t have the disease (part/part).
Odds = Number who have the event (disease) / Number who do not have the event (disease)
Odds ratio calculation:
OR = Odds of having the disease among exposed / Odds of having the disease among non-exposed
OR = (a / b) / (c / d) = ad / bc
OR = (40/360) / (20/380) = 2.11
Interpretation of RR and OR:
RR <1 if the group represented in the numerator is at lower “risk” of the event.
RR >1 if the group represented in numerator is at greater “risk” of the event.
RR =1 if the group represented in the numerator is at the same “risk” of the event.
OR is interpreted in the same way, but we use the word “odds” instead of “risk”.
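Both measures can be computed directly from the four cells of the cohort table. The
following Python sketch uses the smoking example (a = 40, b = 360, c = 20, d = 380); the
function names are assumptions made for illustration.

```python
def relative_risk(a, b, c, d):
    """Risk among exposed a/(a+b) divided by risk among non-exposed c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds among exposed a/b divided by odds among non-exposed c/d (= ad/bc)."""
    return (a / b) / (c / d)


# Smokers: 40 diseased, 360 not; non-smokers: 20 diseased, 380 not
rr = relative_risk(40, 360, 20, 380)
odds_r = odds_ratio(40, 360, 20, 380)

print(f"RR = {rr:.2f}")      # risk uses the whole group in the denominator
print(f"OR = {odds_r:.2f}")  # odds use only the non-diseased in the denominator
```

The two values are close here because the disease is fairly uncommon in both groups; as
the outcome becomes more frequent, the odds ratio moves further away from the relative
risk.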
Attributable Risk (AR)
• Attributable risk is simply the difference in incidence (risk) between the exposed
group and the non-exposed group. It refers to the increase in risk that can be
attributed to this risk factor.
Attributable risk = Incidence among exposed [a / (a + b)] − Incidence among non-exposed [c / (c + d)]
• If the incidence of a specific disease among smokers is 12%, and the incidence
of the same disease among non-smokers is 5%, the attributable risk is
12% − 5% = 7%. This indicates that smoking is responsible for an absolute increase of
7% (7 percentage points) in the incidence of this disease.
• Note that in calculating the relative risk, we divide the risk in one group by the
risk in the other group, while here in the attributable risk, we calculate the
difference of the risks.
Absolute Risk Reduction (ARR)
• Absolute risk reduction (ARR), or risk difference, is calculated in the same way as
the attributable risk: it is the difference between two risks. We use it when a
treatment causes risk reduction as compared to the control group.
ARR = Incidence (risk) among control group − Incidence (risk) among treatment group
• If the incidence of a specific disease among the non-vaccinated group is 12%, and
the incidence of the same disease among the vaccinated group is 5%, the
absolute risk reduction is 12% − 5% = 7%.
• Vaccination is responsible for an absolute decrease of 7% in the incidence of this
disease (in the population).
To overcome the confusion: ARR or AR = higher risk – lower risk
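Since both measures are a simple difference of two risks, one small helper covers them.
This Python sketch uses the 12% and 5% figures from the examples; the function name
`risk_difference` is an assumption for illustration.

```python
def risk_difference(higher_risk, lower_risk):
    """Difference between two risks (used for both AR and ARR)."""
    return higher_risk - lower_risk


# Attributable risk: smokers (12%) vs non-smokers (5%)
attributable_risk = risk_difference(0.12, 0.05)

# Absolute risk reduction: non-vaccinated (12%) vs vaccinated (5%)
absolute_risk_reduction = risk_difference(0.12, 0.05)

print(f"AR  = {attributable_risk:.0%}")
print(f"ARR = {absolute_risk_reduction:.0%}")
```

The arithmetic is identical; only the interpretation differs (risk added by an exposure
versus risk removed by a treatment).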
Relative Risk Reduction (RRR)
Relative Risk Reduction (RRR) is the amount of risk reduction relative to the baseline
risk. It is the difference in the risk of the event between the control and experimental
groups, relative to the control group.
RRR = [Incidence (risk) among control group − Incidence (risk) among treatment group] /
Incidence (risk) among control group
= ARR / Incidence (risk) among control group
An alternative way of calculating the (RRR) is to use the relative risk (RR):
RRR = (1 - RR)
• If the incidence of a specific disease among the non-vaccinated (control) group
is 12%, and the incidence of the same disease among the vaccinated
(experimental) group is 5%, the relative risk reduction is:
RRR = (12 − 5) / 12 = 7 / 12 = 0.58, or: RRR = 1 − RR = 1 − (5/12) = 1 − 0.42 = 0.58
We can change it into percentage 0.58 = 58%.
Baseline risk decreased by 58% due to vaccination.
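The two routes to the RRR can be checked against each other in Python. This is a minimal
sketch using the vaccination example (control risk 12%, treatment risk 5%); the function
name is an illustrative assumption.

```python
def relative_risk_reduction(control_risk, treatment_risk):
    """(control risk - treatment risk) / control risk, i.e. ARR / baseline risk."""
    return (control_risk - treatment_risk) / control_risk


control_risk, treatment_risk = 0.12, 0.05

rrr = relative_risk_reduction(control_risk, treatment_risk)
rrr_from_rr = 1 - (treatment_risk / control_risk)  # alternative route: 1 - RR

print(f"RRR = {rrr:.0%} (via ARR), {rrr_from_rr:.0%} (via 1 - RR)")
```

Both routes give the same value because (control − treatment) / control is algebraically
equal to 1 − treatment / control.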
Number Needed to Treat, and Number Needed to Harm
• The Number Needed to Treat (NNT) is the number of individuals that need to be
treated for one person to benefit from treatment.
• Number Needed to Harm (NNH) is the number of people that need to be
exposed to a risk factor (or experimental treatment) to lead to one additional
person being harmed.
• Both Number Needed to Treat (NNT) and Number Needed to Harm (NNH) are
calculated as 1 divided by the absolute risk reduction or attributable risk
(whichever is more appropriate).
• If there is risk reduction, we calculate the number needed to treat, and if there
is risk increase, we calculate the number needed to harm.
Number needed to treat = 1 / absolute risk reduction (ARR)
Number needed to harm = 1 / attributable risk (AR)
If the absolute risk reduction (ARR) = 7% = 0.07:
NNT = 1/0.07 = 14.3.
By convention, the NNT is rounded up to the next whole number, so we need to vaccinate
about 15 people to prevent one of them from having the disease.
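The same calculation in Python is shown below. The sketch assumes the risk difference is
given as a proportion (0.07, not 7), and it rounds up to a whole person, which is a common
reporting convention rather than something the formula itself requires. The function name
`number_needed` is hypothetical.

```python
import math


def number_needed(risk_difference):
    """1 / risk difference: NNT when given an ARR, NNH when given an AR."""
    if risk_difference <= 0:
        raise ValueError("risk difference must be a positive proportion")
    return 1 / risk_difference


nnt_exact = number_needed(0.07)      # ARR of 7% from the vaccination example
nnt_rounded = math.ceil(nnt_exact)   # assumption: round up to a whole person

print(f"NNT = {nnt_exact:.1f}, reported as {nnt_rounded}")
```

A smaller absolute risk reduction gives a larger NNT: halving the ARR doubles the number
of people who must be treated for one to benefit.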
Part 4: Descriptive statistics
Types of data variables
• Data variables are either categorical or
numerical.
Categorical variables are variables that consist of
categories. They have no unit of measurement, and
individuals are described as belonging to one of the
categories. They are either nominal or ordinal.
• Nominal variables: categorical variables that have no intrinsic order (can’t be
ordered), as sex (female, male) and blood groups: (A, B, AB, O).
• Ordinal variables: categorical variables that have meaningful order, as BMI
status: (underweight, normal, overweight, obese, extremely obese), and
agreement level: (strongly disagree, disagree, undecided, agree, strongly
agree).
• Categorical variables that consist of only two categories are called binary
(dichotomous, sometimes called binomial) variables, as having a disease (yes, no).
Numerical variables are either measured or counted, presented in numbers, and have
a measurement unit. They are either discrete or continuous.
• Discrete variables: They take only integer numbers (no decimals) such as 0, 5,
22, 106, etc. They usually represent a count of something, as number of kids in a
family.
• Continuous variables: They can take any real numerical value, including
decimals (as 14.55, 48.8, 178.2). They involve measurements such as height and
weight.
Descriptive statistics for categorical variables
Categorical variables such as sex, smoking status, and disease severity are presented
using:
• Frequencies (numbers): which is the number of individuals in each category, as
the number of males and the number of females.
• Relative frequencies (percentages): which is the percentage of individuals in
each category.
Descriptive statistics for numerical variables
Numerical variables are usually described using two numbers: one represents the
center of the data (central tendency), and the other represents the spread of the data
(dispersion).
• Measures of central tendency (mean, median, and mode)
Mean: it is the sum of the observed values divided by the number of observations. It is
affected by extreme values.
Median: it is the point at the center of the data values, where half of the data points
are above, and half are below it. To calculate the median, we first arrange (order) our
data from the smallest value to the largest value. Then, the median is the value in the
middle. It is not affected by extreme values.
Mode: it is the most frequently occurring value in the dataset.
• Measures of dispersion
Measures of dispersion (spread of the data) are used to describe variability in the
data. The commonly used measures of dispersion are range, inter-quartile range,
variance, and standard deviation.
Range: it is the difference between the largest
and smallest values (maximum – minimum).
Range is affected by extreme values.
Inter-quartile range (IQR): it is the difference
between the upper quartile and the lower quartile
= Q3-Q1.
• The first quartile (Q1, lower quartile): is the point where 25% of the data are below
it. It is also called the 25th percentile.
• The third quartile (Q3, upper quartile): is the point where 75% of the data are
below it. It is also called the 75th percentile.
Variance: it is a measure of spread that considers all data points in the calculation. It
represents the average squared distance of the data points from the mean. Variance is measured in
squared units, which are not usually easy to interpret.
Standard deviation: it is a measure of spread that represents the typical distance of
the data values from their mean. It is calculated as the square root of the variance, so
it uses the same measurement unit as the mean.
Note that:
• The higher the value of the variance or
standard deviation, the more the data
points are spread around the mean.
• For numerical variables, use mean with standard deviation if normally
distributed, or median with IQR if not normally distributed.
• The mean and standard deviation are affected by the presence of extreme values.
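The measures of central tendency and dispersion described in this section can be sketched with Python's standard `statistics` module. The data below are hypothetical (hospital length of stay in days, with one extreme value), and quartile values can differ slightly between software packages because several calculation methods exist.

```python
import statistics

# Hypothetical sample: length of stay in days, with one extreme value (30)
stay_days = [2, 3, 3, 4, 5, 6, 30]

mean = statistics.mean(stay_days)        # pulled upward by the extreme value
median = statistics.median(stay_days)    # middle value of the ordered data
mode = statistics.mode(stay_days)        # most frequent value

data_range = max(stay_days) - min(stay_days)
q1, q2, q3 = statistics.quantiles(stay_days, n=4)  # quartiles (default method)
iqr = q3 - q1

variance = statistics.variance(stay_days)  # sample variance, in squared units
sd = statistics.stdev(stay_days)           # square root of the variance

print("mean:", round(mean, 2), "median:", median, "mode:", mode)
print("range:", data_range, "IQR:", iqr)
print("variance:", round(variance, 2), "SD:", round(sd, 2))
```

In this sketch the mean is well above the median because of the single extreme value, which is why the median with IQR is preferred for data that are not normally distributed.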
Standard error of the mean
• If we take a number of samples from a population and calculate the mean of each
sample, those means will form a distribution around the true
population mean.
• The standard deviation of this distribution, i.e. the standard deviation of sample
means, is called the standard error.
• The standard error tells us how accurate the mean of any sample is likely to be
compared to the true population mean.
• It increases with the standard deviation (variability of the data) and decreases as the
sample size increases: SE = SD / √n.
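As a small illustration of how the standard error depends on variability and sample size, the sketch below uses the usual estimate SE = SD / √n. The function name `standard_error` and the numbers are illustrative only.

```python
import math

def standard_error(sd, n):
    """Estimate the standard error of the mean from the SD and the sample size."""
    return sd / math.sqrt(n)

# Same variability (SD = 10), increasing sample size
for n in (25, 100, 400):
    print(n, standard_error(10, n))
```

Quadrupling the sample size halves the standard error, so larger samples give a more precise estimate of the population mean.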
The normal distribution
Normally distributed variables are common in
biological measurements (as height, blood pressure,
IQ, …) and have the following characteristics:
• The mean, median, and mode are almost
equal.
• They are denser in the center and less dense
in the tails (bell shape) and are symmetrical around the mean.
• 50% of values are less than the mean, and 50%
are greater than the mean.
• Approximately 68% of the area of a normal distribution is
within one standard deviation of the mean.
• Approximately 95% of the area of a normal
distribution is within two standard
deviations of the mean.
• Approximately 99.7% of the area of a normal distribution is within three standard
deviations of the mean.
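The 68%, 95% and 99.7% figures above can be verified with the `NormalDist` class from Python's standard library. This is only a numerical check; the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    # Prints approximately 68.3, 95.4 and 99.7
    print(k, round(area_within(k) * 100, 1))
```

The exact multiplier for 95% of the area is about 1.96 standard deviations, which is why 1.96 appears in many confidence interval formulas.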
Skewed distribution
Non-normally distributed data can be skewed, meaning it tends to have a long
tail on one side.
• Positive skew (skewed to the right) is when the long tail is on the right side.
• Negative skew (skewed to the left) is when the long tail is on the left side.
The mean of a skewed variable is pulled toward the tail (as it is affected by
the extreme values), while the median is less affected.
Part 5: Hypothesis Testing and statistical tests
Null and alternative hypotheses
For each research question, we define two types of hypotheses: the null hypothesis
(Ho) and the alternative hypothesis (H1).
Both are mutually exclusive (not overlapping) and only one of them is true.
Ho: Null hypothesis
▪ Is the currently accepted belief / idea / parameter
▪ Nothing is happening / there is no difference / there is no association
▪ The researcher doubts it to be true
H1 / Ha: Alternative hypothesis
▪ Is the researcher’s idea
▪ Something is happening / there is a difference / there is an association
▪ The researcher believes it to be true and wishes to prove it
Example:
Research question: Is there an association between smoking and the risk of cardiovascular diseases?
The null hypothesis (Ho): There is no association between smoking and the risk of cardiovascular diseases.
The alternative hypothesis (H1): There is an association between smoking and the risk of cardiovascular diseases.
We perform the statistical analysis to test our hypotheses and reach a conclusion
regarding the null and alternative hypotheses. The conclusion is either:
• Fail to reject the null hypothesis and conclude that there is not enough evidence
that something is happening (no statistically significant difference / association).
Strictly speaking, this is not proof that the null hypothesis is true.
• Reject the null hypothesis (in favour of the alternative hypothesis) and
conclude that something is happening / there is a difference / there is an
association.
Type I and type II error
While doing medical research, it is possible to reach a false conclusion and
commit a type I error or a type II error.
• If the null hypothesis is true (Drug X is not effective) and we rejected it
(concluded that it is effective), we have committed a type I error (false positive
result).
• If the null hypothesis is false (Drug X is effective) and we failed to reject it
(concluded that it is not effective), we have committed a type II error (false
negative result).
• The probability of a type I error is called alpha (α), and the probability of a type II
error is called beta (β).
• A type I error is usually considered more serious (it might indicate that a drug is
effective while in fact it is not) than a type II error.
• While designing studies, we therefore tend to control type I error more strictly than
type II error.
Power, statistical significance and sample size
Power is the probability of not committing type II error. So, power = 1-β
• The statistical power of a study is the power (or ability) of the study to detect a
difference (or effect) if a difference (or effect) really exists.
• In practice, β is usually set at 0.2. This provides a power value of 0.8 (80%).
• If there is a difference, then the probability of the statistical test to detect it is
80%.
The level of significance (α) is the maximum allowed probability of committing type I
error.
• The smaller the value of α, the lower the risk of committing type I error.
• We choose a level of significance depending on the consequence of
committing type I error.
• Common values for α are 0.05 and 0.01 indicating 5% and 1%, respectively.
Sample size is calculated when designing any study.
• It depends on many factors such as power, level of significance, expected effect
size of the outcome, etc.
• Studies with higher power need a larger sample size, and studies with a lower
probability of type I error (α) also need a larger sample size.
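To illustrate how power and α drive the sample size, here is a minimal sketch for one common situation: comparing the means of two equal-sized groups with a two-sided test, using the normal approximation n = 2(z₁₋α/₂ + z₁₋β)² / d², where d is the expected difference divided by the standard deviation. The formula choice and the function name `sample_size_per_group` are assumptions made for illustration; dedicated software may give slightly larger numbers.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for comparing two means (two-sided test).

    effect_size is the expected difference divided by the standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = z.inv_cdf(power)           # value corresponding to power = 1 - beta
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

print(sample_size_per_group(0.5))              # alpha 0.05, power 80%
print(sample_size_per_group(0.5, power=0.90))  # higher power
print(sample_size_per_group(0.5, alpha=0.01))  # stricter alpha
```

Both raising the power and lowering α increase the required number of participants, and a smaller expected effect size increases it further.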
P-value
• When doing a statistical test using the computer software, we get the p-value
(which tells us if a test is statistically significant or not).
• If the null hypothesis is true, the p-value is the probability of obtaining this result
(or something more extreme). In other words, the p-value is the probability of
seeing the observed difference (in the collected data), or greater, just by
chance if the null hypothesis is true (there is no effect).
• Put simply: the p-value is the probability of seeing a difference at least as large as
the observed one just by chance, assuming there is no real effect.
• P-value lies between 0 and 1.
• We compare the p-value (from the statistical test) to the level of significance
(α), which is usually 0.05, to make a decision.
• If the p-value is greater than the level of significance, then we do not reject the
null hypothesis (we say that the result is not statistically significant and there is no
statistically significant difference).
• If the p-value is less than the level of significance, then we reject the null
hypothesis (we say that the result is statistically significant and there is a
statistically significant difference).
Statistical test, then P-value:
• P-value less than α: reject the null hypothesis
• P-value more than α: fail to reject the null hypothesis
Clinical Significance and Statistical Significance
It is important to consider the clinical significance and not only the statistical
significance.
• If we have a very large sample size, comparing two groups might be statistically
significant even if the difference between them is very small and has no clinical
importance.
• On the other hand, if we have a small sample size, the result might be
statistically not significant (due to low power of the study), even if the difference
is large (clinically important).
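Statistically significant and clinically significant: indicates that the two groups are different.
Statistically significant but not clinically significant: the sample size might be large.
Not statistically significant but clinically significant: the sample size might be very small.
Neither statistically nor clinically significant: indicates that the two groups are not different.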
Confidence interval
• A Confidence Interval (CI) is a range of values within which we are fairly sure
the true population value lies (e.g. the mean). It is bounded by the upper and
lower confidence limits.
• It is frequently reported as a 95% CI (i.e. if the study were repeated many times and a
95% CI calculated each time, about 95 out of 100 of these intervals would contain the
true population value).
• A common interpretation (though not strictly accurate): we are 95% confident
that the true population (mean) lies between …. & ….
• As the sample size increases, the confidence interval becomes narrower (more
precise).
• There is a relationship between the CI and the p-value:

• If the 95% confidence interval for the difference between the two groups contains 0,
this difference is not significant (the p-value is > 0.05).
[Figure: confidence intervals for a difference, labelled significant or not significant]
• If the 95% confidence interval for a ratio such as the relative risk (RR) or odds ratio
(OR) contains 1, the difference is not significant (the p-value is > 0.05).
[Figure: confidence intervals for an odds ratio, labelled significant or not significant]
• It is always better to report the CI with the p-value rather than the p-value alone.
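The confidence interval ideas in this section can be sketched as follows. The sketch assumes a normal-approximation interval for a mean (mean ± 1.96 × standard error for 95%); for small samples, statistical software would normally use the t distribution and give a slightly wider interval. The function names and the blood pressure values are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(values, level=0.95):
    """Normal-approximation CI for a mean: mean ± z × standard error."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    return mean - z * se, mean + z * se

def contains(ci, value):
    """True if the interval contains the given value (0 for a difference, 1 for a ratio)."""
    lower, upper = ci
    return lower <= value <= upper

# Hypothetical systolic blood pressure readings (mmHg)
readings = [120, 125, 130, 135, 140, 128, 132, 126]
print(mean_confidence_interval(readings))

# Hypothetical reported intervals
print(contains((-0.4, 2.1), 0))  # difference CI contains 0: not significant
print(contains((1.2, 3.5), 1))   # odds ratio CI excludes 1: significant
```

Checking whether the interval contains the "no effect" value (0 for a difference, 1 for a ratio) gives the same decision as comparing the p-value with 0.05 when a 95% CI is used.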
Statistical tests
• Statistical tests are used to study the association between variables or the
difference between study groups.
• We use them to get the p-value and reach a conclusion if there is a statistical
significance or not.
• Statistical tests are either parametric or non-parametric tests.
• Parametric tests are used to compare means of the groups while non-
parametric tests are used to compare the medians.
• Parametric tests are used to compare groups where the numerical variables
are normally distributed. While non-parametric tests are used to compare
samples with non-normally distributed numerical data, or with ordinal data.
The most important statistical tests are:
Statistical test | Used for | Parametric or non-parametric | Example
Independent samples t test (Student t test) | Used to compare the means of two independent groups | Parametric | Comparing haemoglobin level between patients in the treatment and control groups
Mann-Whitney test | Used to compare the medians of two independent groups (variable is not normally distributed) | Non-parametric | Comparing the hospital length of stay (not normally distributed) between the treatment and control groups
Paired t-test | Used to compare the means of one group under two conditions or time points (paired data) | Parametric | Comparing the weight of a group of individuals before and after being on a specific diet
Wilcoxon Signed Rank test | Used to compare the values for one group under two conditions or time points (paired data) | Non-parametric | Comparing the pain score of a group of individuals before and after receiving a specific medication
One-way ANOVA | Used to compare the means of more than two independent groups | Parametric | Comparing the birthweight of infants of mothers with different smoking status (never smoked, quit before pregnancy, smoked during pregnancy)
Kruskal-Wallis test | Used to compare the medians of more than two independent groups (variable is not normally distributed) | Non-parametric | Comparing the neonatal intensive care unit (NICU) length of stay for infants of mothers with different smoking status (never smoked, quit before pregnancy, smoked during pregnancy)
Chi-square test | To study if there is a relationship/association between two categorical variables | Non-parametric | Comparing males and females regarding having complications (yes or no): is there an association between sex and having a complication?
Fisher’s exact test | The same as Chi-square test but for small samples
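As one worked example from the list of tests above, the Chi-square test for a 2×2 table can be sketched with the standard library only. The counts are hypothetical, the function name `chi_square_2x2` is illustrative, and no continuity correction is applied, so statistical packages that apply one by default will report a slightly different p-value.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    total = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if there were no association between rows and columns
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the p-value equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data: complications (yes, no) in 100 males and 100 females
chi2, p = chi_square_2x2(30, 70, 15, 85)
print(round(chi2, 2), round(p, 3))  # 6.45 0.011
```

Here the p-value is below 0.05, so the null hypothesis of no association between sex and complications would be rejected for these made-up counts.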
Correlation
• Correlation is a measure of the association between two continuous or ordinal
variables.
• It gives direction of relationship (positive/negative), and strength of relationship
(weak / medium / strong).
• The correlation coefficient (r): shows the strength and direction of the
relationship between the two variables. It ranges between -1 and +1.
• A positive r value = positive correlation, and a negative r value = negative
correlation.
• The closer the r value is to +1 or -1, the stronger the correlation between the two
variables, and the closer the value to 0, the weaker the relationship.
• The coefficient of determination (R², R squared): the proportion of
variance in one variable that can be explained by the other variable. It is the
square of the correlation coefficient (r).
• Types of correlation:
Pearson’s correlation(r): parametric test, used for numerical data that are linearly
associated and at least one of them is normally distributed.
Spearman’s correlation (rho): non-parametric test, used for ordinal data or numerical
data that are not normally distributed.
• Scatterplots help to illustrate the correlation. The following graphs show each
distribution and the corresponding correlation coefficient.
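The correlation measures above can be sketched in plain Python. Pearson's r is computed from its definition, and Spearman's rho is computed here as Pearson's r applied to the ranks of the data (with tied values given their average rank). The function names and the data are illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always increases with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson_r(x, y)
print(round(r, 3), round(r ** 2, 3), spearman_rho(x, y))
```

Because y always increases with x, Spearman's rho is 1, while Pearson's r is slightly below 1 because the relationship is not perfectly linear. Squaring r gives the coefficient of determination.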
Regression
Regression is a statistical tool used mainly to study the association between one or
more predictor variables and an outcome variable. It quantifies this relationship in the
form of a regression equation.
Types of variables in regression:
• The variable that is being affected by other variables is called the outcome
variable, response variable or the dependent variable (Y).
• The variable that is studied for having possible effect on the outcome is called
predictor variable, explanatory variable, or independent variable (X).
Regression can be used for:
• Prediction: we can use one or more variables to estimate the value of the
outcome variable.
• To control for confounders: we can use regression to study the association
between independent variable(s) and the outcome of interest while controlling
for (adjusting) the effect of one or more variables.
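To make the idea of a regression equation concrete, here is a minimal sketch of simple linear regression (one predictor X, one continuous outcome Y) fitted by ordinary least squares. The function name and the age and blood pressure values are hypothetical, and real analyses would normally use statistical software, which also reports confidence intervals and p-values for the coefficients.

```python
def simple_linear_regression(x, y):
    """Fit Y = intercept + slope * X by ordinary least squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                    # change in Y per one-unit increase in X
    intercept = mean_y - slope * mean_x  # predicted Y when X = 0
    return intercept, slope

# Hypothetical data: age (years) and systolic blood pressure (mmHg)
age = [30, 40, 50, 60, 70]
sbp = [118, 124, 131, 135, 142]

intercept, slope = simple_linear_regression(age, sbp)
predicted_at_55 = intercept + slope * 55  # prediction for a 55-year-old
print(round(intercept, 1), round(slope, 2), round(predicted_at_55, 1))
```

The slope is the regression coefficient that would be reported for linear regression, and the fitted equation is what allows prediction of the outcome for a new value of the predictor.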
Types of regression:
Selection of the appropriate regression model depends on type of the outcome
variable we are studying.
• If the outcome variable is continuous (numerical), we use linear regression.
• If the outcome variable is binary, we use logistic regression.
• If the outcome variable is time to event (survival data), we use Cox regression.
Dependent variable / outcome | Regression model | Examples
Continuous | Linear regression | Heart rate, blood pressure, quality of life score
Binary | Binary logistic regression | Yes/No, diseased/not diseased, complication/no complication
Time to event (survival data) | Cox regression | Time to death, time to recurrence, time to second heart attack
Source: www.stats4drs.com, Mohamed Elsherif, 2022
https://www.connectmedical .academ / 12
• Based on the number of predictor variables, the regression model is either
simple regression if there is only one predictor variable or multiple regression if
there is more than one predictor variable. So, we have simple linear regression,
multiple linear regression, simple logistic regression, multiple logistic regression,
etc.
Estimates resulting from simple regression are called crude or unadjusted, while those
resulting from multiple regression are called adjusted (for example, unadjusted and adjusted OR).
When reporting the results of regression, we use the coefficients for linear regression,
while odds ratio (OR) is used for logistic regression, and hazard ratio (HR) is used for
Cox regression.
Survival analysis
• Survival analysis is used when the outcome of interest is the time until an event
occurs (time to event). This event is usually death, as survival after breast
cancer, but can be any other event.
Characteristics of survival studies:
• Individuals do not enter the study at the same time.
• When the study ends, some individuals still haven't had the event yet.
• Other individuals drop out or get lost in the middle of the study, and all we know
about them is the last time they were still 'free' of the event.
Survival analysis terms:
• Time to event: The time from entry into a study until a subject has a particular
event (outcome).
• Censoring (no event): Subjects are said to be censored if they are lost to follow
up or drop out of the study, or if the study ends before they die or have an
outcome of interest.
Displaying survival data:
• A survival curve, usually calculated by the
Kaplan–Meier method, displays the
cumulative probability (the survival
probability) of an individual remaining
free of the event at any time after
baseline.
• The vertical axis shows the probability of
surviving or the proportion of people surviving.
• The horizontal axis represents time.
• The curve moves down at the occurrence of every event.
• The Kaplan-Meier Curve can be used to estimate the probability of survival at
a specific time and can be used to estimate the median survival time which is
the time at which half the patients are expected to be alive.
• Kaplan-Meier Curve can be used to compare the survival in two groups. If the
curve goes down rapidly, the occurrence of the event is at a higher rate in this
group. The statistical test used for this comparison is the log rank test.
• Cox regression is the type of regression used in survival studies, and hazard
ratios (HR) are used to compare risk in different groups.
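The Kaplan-Meier method described above can be sketched in a few lines. At each time where an event occurs, the survival probability is multiplied by (1 − events / number still at risk); censored subjects leave the risk set without moving the curve down. The function names and the follow-up data are hypothetical, and subjects censored at the same time as an event are assumed to be still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time where an event occurs.

    times  : follow-up time for each subject
    events : 1 if the event occurred at that time, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects who share the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk  # the curve steps down at each event
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored subjects leave the risk set
    return curve

def median_survival(curve):
    """First time at which the survival probability falls to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 50%

# Hypothetical follow-up in months; 0 marks a censored subject
months = [2, 3, 3, 5, 8, 9]
status = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(months, status)
for t, s in curve:
    print(t, round(s, 3))
print("median survival:", median_survival(curve))
```

If the curve never falls to 50%, the median survival time cannot be estimated from the data, which is why the helper returns `None` in that case.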
Dr. Mohamed Elsherif, MBBCh, MPH
