Statistical Analysis PDF


Statistics Guide!

Dr. Hamda Qotba, B.Med.Sc, M.D, ABCM
Definition
Statistics is the science of collecting,
organizing, summarising, analysing,
and making inference from data

Descriptive stat. includes:
collecting, organizing, summarising, analysing,
and presenting data

Inferential stat. includes:
making inferences, hypothesis testing,
determining relationship, and making prediction

Dr.H.Qotba 2
Variables

Quantitative
• Discrete
• Continuous

Qualitative
• Ordinal
• Categorical

Parametric vs. non-parametric tests

• Parametric: decision making method
where the distribution of the
sampling statistic is known
• Non-parametric: decision making
method which does not require
knowledge of the distribution of the
sampling statistic
t-Test
• Compares the means of a continuous
variable in two samples in order to
determine whether or not the
difference between the 2 observed
means exceeds the difference that
would be expected by chance

Requirements

• The observations are independent
• Drawn from a normally distributed
population
• Sample size < 30; if it is > 30, the
normal-curve z test can be used instead

Types of t-Test
• One sample t test: tests if a sample mean
for a variable differs significantly from a
given population with a known mean

• Unpaired or independent t test: tests if the
population means estimated by
2 independent samples differ significantly
(group of male and group of female)

• Paired t test: tests if the population means
estimated by dependent samples differ
significantly (mean of pre and post
treatment for the same set of patients)
5-step solution
1. Formulate 𝐻𝑜 and 𝐻𝑎.
2. Set the level of significance 𝛼, then
determine the type of hypothesis test and the
tabular value or p-value.

Critical values (normal curve):
Type / 𝛼      0.01   0.025   0.05
One-tailed    2.33   1.96    1.65
Two-tailed    2.58   2.24    1.96

3. Set the criterion (when to reject 𝐻𝑜).
4. Determine and compute the test statistic.
Present the data. Make your decision.
5. Formulate your conclusion.
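
The tabular values in step 2 are critical values of the standard normal curve. As a rough sketch, they can be reproduced with Python's standard library; the helper name `z_critical` is made up for this illustration and is not part of the original slides. For 𝛼 = 0.05 one-tailed the computed value is about 1.645, which is conventionally written 1.65, and for 𝛼 = 0.025 two-tailed it is about 2.24.

```python
from statistics import NormalDist

def z_critical(alpha, tails):
    """Critical value of the standard normal curve (hypothetical helper).

    tails = 1 for a one-tailed test, 2 for a two-tailed test.
    """
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)

# Print one row per test type, one column per significance level.
for tails, label in ((1, "One-tailed"), (2, "Two-tailed")):
    row = [round(z_critical(alpha, tails), 3) for alpha in (0.01, 0.025, 0.05)]
    print(label, row)
```

With small samples the exact criterion comes from the t distribution with n - 1 degrees of freedom, which gives somewhat larger critical values than the normal curve.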
One – sample t-test
Example:
The scores of 10 BSAIS-2A students in a
Statistics test are given below:
23, 45, 33, 37, 22, 12, 43, 30, 11, 8
The population mean is 27. Find out whether
the mean score of these 10 students differs
from the average of the whole population. Use
significance level at 0.05 two-tailed.

One – sample t-test
a. 𝑯𝟎: There is no significant difference
between the mean score of the 10 students from
BSAIS-2A and the population mean.

𝑯𝒂: There is a significant difference
between the mean score of the 10 students from
BSAIS-2A and the population mean.
b. 0.05 significance level, two-tailed
𝒕𝒕𝒂𝒃 = 1.96 (normal-curve value; the exact
t value for df = 9 is 2.262)
c. Reject 𝑯𝟎 if |𝒕𝒄𝒐𝒎𝒑| > 𝒕𝒕𝒂𝒃.

One – sample t-test
A p-value is used in hypothesis testing to
help you decide whether to reject the null
hypothesis. The p-value is the probability,
assuming the null hypothesis is true, of
obtaining a result at least as extreme as the
one observed. The smaller the p-value, the
stronger the evidence that you should reject
the null hypothesis.
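
To make the definition concrete, here is a minimal sketch of how a two-sided p-value can be obtained from a t statistic and its degrees of freedom. It integrates the Student's t density numerically with Simpson's rule; the function names `t_pdf` and `two_sided_p` are assumptions for illustration, and statistical software would normally do this step for you. The two inputs are t values reported in this guide's worked examples.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value P(|T| >= |t|), assuming the null hypothesis is true.

    Uses Simpson's rule on [0, |t|]; steps must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t| < T < |t|) by symmetry
    return 1 - central_area

p_small_t = two_sided_p(-0.14, 9)  # t close to zero: weak evidence against Ho
p_large_t = two_sided_p(3.81, 14)  # t far from zero: strong evidence against Ho
print(round(p_small_t, 2), round(p_large_t, 3))
```

A t value near zero gives a large p-value, while a t value far from zero gives a small one.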

One – sample t-test
d. Presentation/ Decision
Group             N    Mean   Std. Deviation   Std. Error   t-cal   p-value   Decision
Sample Students   10   26.4   13.35            4.22         -0.14   0.89      Do not reject Ho

From the table, the mean of the ten sample
students is 26.4 with the standard deviation of
13.35. The t-computed value is -0.14 with the
corresponding p-value of 0.89. This means that, if
the null hypothesis were true, a difference at least
this large would occur about 89% of the time. Since
|-0.14| < 1.96, we do not reject the null hypothesis
(if Ho is actually false, this decision would be a
Type II error).
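
The figures in the table can be checked with a short sketch using only Python's standard library. The variable names are illustrative, and `t_tab = 1.96` is simply the tabular value this guide uses.

```python
import math
from statistics import mean, stdev

scores = [23, 45, 33, 37, 22, 12, 43, 30, 11, 8]
population_mean = 27

n = len(scores)
sample_mean = mean(scores)
sample_sd = stdev(scores)               # sample standard deviation (divisor n - 1)
std_error = sample_sd / math.sqrt(n)    # standard error of the mean
t_cal = (sample_mean - population_mean) / std_error

t_tab = 1.96  # tabular value used in this guide
decision = "Reject Ho" if abs(t_cal) > t_tab else "Do not reject Ho"

print(f"mean={sample_mean:.2f} sd={sample_sd:.2f} se={std_error:.2f} t={t_cal:.2f}")
print(decision)
```

The two-tailed criterion compares the absolute value of t with the tabular value, so the sign of t does not affect the decision.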

One – sample t-test
e. There is no significant difference between the
mean score of the BSAIS-2A students and the mean
score of the population.

Therefore, there is no statistical evidence to show
that the sample performs differently from the
population.

Independent Sample Test
Example.
The following are the scores in spelling of
10 male and 10 female BSAIS students. Test
the null hypothesis using 0.05 level of
significance two-tailed.

Male:
14, 18, 17, 16, 4, 14, 12, 10, 9, 17

Female:
12, 9, 11, 5, 10, 3, 7, 2, 6, 13

Independent Sample Test
a. 𝑯𝟎: There is no significant difference
between the performance of male and
female students in spelling.

𝑯𝒂: There is a significant difference
between the performance of male and
female students in spelling.
b. 0.05 significance level, two-tailed
𝒕𝒕𝒂𝒃 = 1.96 (normal-curve value; the exact
t value for df = 18 is 2.101)

c. Reject 𝑯𝟎 if |𝒕𝒗𝒂𝒍𝒖𝒆| > 𝒕𝒕𝒂𝒃.

Independent Sample Test
d. Tabular Presentation/ Decision
Gender   n    Mean    Variance   t-cal   df   p      Decision
Male     10   13.10   19.43      2.88    9    0.01   Reject Ho
Female   10    7.80   14.40              9

From the table, the means of the male
and female students with their variances
are M(13.10: 19.43) and F(7.80: 14.4),
respectively (standard deviations of about
4.41 and 3.79). The t-test calculated value was
2.88 with 9 degrees of freedom in each
group (18 for the pooled test). The p-value is
found to be 0.01, which means that, if the null
hypothesis were true, there would be about a 1%
chance of observing a difference at least this
large. The t-computed value 2.88 is greater than
the tabular value of 1.96 (2.88 > 1.96). This
means we have to reject the Ho (if Ho is actually
true, this decision would be a Type I error).
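
As a check on the table, the sketch below recomputes the independent-samples t statistic with a pooled variance, which is one common form of the test and is assumed here because it reproduces the reported t of 2.88. Variable names are illustrative. It also shows that 19.43 and 14.4 are the sample variances of the two groups.

```python
import math
from statistics import mean, variance

male = [14, 18, 17, 16, 4, 14, 12, 10, 9, 17]
female = [12, 9, 11, 5, 10, 3, 7, 2, 6, 13]

n1, n2 = len(male), len(female)
var1, var2 = variance(male), variance(female)   # sample variances (divisor n - 1)

# Pooled-variance t test: assumes both groups share a common variance.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_cal = (mean(male) - mean(female)) / std_error

print(f"variances: {var1:.2f}, {var2:.2f}")
print(f"SDs: {math.sqrt(var1):.2f}, {math.sqrt(var2):.2f}")
print(f"t = {t_cal:.2f} with df = {df}")
```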

Independent Sample Test
e. 𝑯𝒂: There is a significant difference
between the performance of male and
female students in spelling.

This means that the difference between the
performance of male and female students in
spelling is statistically significant.

Dependent Sample Test
Examples.
The following are the weights in pounds of
15 students before and after 6 months of
attending aerobics. Test the null hypothesis
at 0.05 significance level (two-tailed).
Before: 158, 192, 144, 243, 179, 201, 165,
183, 153, 170, 180, 212, 169, 172, 209

After: 159, 190, 140, 231, 173, 199, 162, 179,
152, 164, 177, 207, 170, 171, 196

Dependent Sample Test
a. 𝑯𝟎: There is no significant difference
between the weights of the students
before and after the aerobics program.
𝑯𝒂: There is a significant difference
between the weights of the students
before and after the aerobics program.

b. 0.05 significance level, two-tailed
𝒕𝒕𝒂𝒃 = 1.96 (normal-curve value; the exact
t value for df = 14 is 2.145)

c. Reject 𝑯𝟎 if |𝒕𝒄𝒐𝒎𝒑| > 𝒕𝒕𝒂𝒃

Dependent Sample Test
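d.
Program          N    Mean Difference   Std. Error   t-cal   df   p       Decision
Before / After   15   4                 1.05         3.81    14   0.002   Reject Ho

From the table, the mean difference
between the weights before and after the
program is 4 pounds, with a standard error of
1.05 (the standard deviation of the differences
is about 4.07). The calculated t-value is 3.81
with the corresponding p-value of 0.002. This
means that, if the null hypothesis were true, a
difference at least this large would occur only
about 0.2% of the time.
Since 3.81 > 1.96, we reject Ho (if Ho is actually
true, this decision would be a Type I error).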
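
The paired test works on the per-student differences, so it reduces to a one-sample t test of those differences against zero. A minimal sketch with illustrative variable names follows. It also shows that the 1.05 reported in the table is the standard error of the mean difference, while the standard deviation of the differences is about 4.07.

```python
import math
from statistics import mean, stdev

before = [158, 192, 144, 243, 179, 201, 165, 183, 153, 170, 180, 212, 169, 172, 209]
after = [159, 190, 140, 231, 173, 199, 162, 179, 152, 164, 177, 207, 170, 171, 196]

# One difference per student: positive values mean weight was lost.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
mean_diff = mean(diffs)
sd_diff = stdev(diffs)                 # sample SD of the differences
std_error = sd_diff / math.sqrt(n)     # standard error of the mean difference
t_cal = mean_diff / std_error
df = n - 1

print(f"mean diff={mean_diff} sd={sd_diff:.2f} se={std_error:.2f} t={t_cal:.2f} df={df}")
```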

Dependent Sample Test
e. 𝑯𝒂: There is a significant difference
between the weights before and after the
aerobics program.

This means that the change in the students'
weights after the aerobics program is
statistically significant.

ANOVA

• ANOVA is used to uncover the main and
interaction effects of categorical
independent variables (called "factors")
on an interval dependent variable

Types of ANOVA

• One-way ANOVA tests differences in
a single interval dependent variable
among two, three, or more groups
formed by the categories of a single
categorical independent variable.

• Two-way ANOVA analyzes one interval
dependent variable in terms of the categories
(groups) formed by two independent variables,
one of which may be conceived as a
control variable.
• Multivariate or n-way ANOVA. To
generalize, n-way ANOVA deals with n
independent variables. It should be noted that
as the number of independent variables
increases, the number of potential interactions
proliferates.
One-Way Anova (f-test)
A sari-sari store is selling 4 brands of
shampoo. The owner wants to know whether there
is a significant difference in the average sales
of the four brands of shampoo for one week.
The following data are recorded.
Sunsilk Pantene Rejoice Clear
7 9 2 4
3 8 3 5
5 8 4 7
6 7 5 8
9 6 6 3
4 9 4 4
3 10 2 5
One-Way Anova
Perform the analysis of variance and test the
hypothesis at 0.05 level of significance that
the average sales of the four brands of
shampoo are equal.
One-Way Anova
a. 𝑯𝟎: There is no significant difference
among the average sales of the four brands of
shampoo.
𝑯𝒂: There is a significant difference
among the average sales of the four brands of
shampoo.

b. 0.05 significance level
df within = (c)(r - 1) = (4)(7 - 1) = 24
df between = (c - 1) = (4 - 1) = 3
where c is the number of brands (columns) and
r is the number of observations per brand (rows)
𝒇𝒕𝒂𝒃 = 3.009
c. Reject 𝑯𝟎 if 𝒇𝒗𝒂𝒍𝒖𝒆 > 𝒇𝒕𝒂𝒃
One-Way Anova
d.
Source of variation   Sum of squares   df   Mean square   f-value
Between treatments    72.29            3    24.10         7.97
Within treatments     72.57            24   3.02

From the table, the sums of
squares between treatments and within
treatments are 72.29 and 72.57, respectively,
with degrees of freedom of 3 and 24. The
mean squares are 24.10 and 3.02.

The obtained f-value is 7.97, which is greater
than the f-tabular value. Since 7.97 > 3.009,
we reject Ho (if Ho is actually true, this
decision would be a Type I error).
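
The sums of squares, mean squares and f-value can be recomputed from the shampoo data with a few lines of standard-library Python. This is a sketch of the usual one-way ANOVA arithmetic with illustrative variable names; `f_tab = 3.009` is the tabular value quoted in this guide.

```python
from statistics import mean

sales = {
    "Sunsilk": [7, 3, 5, 6, 9, 4, 3],
    "Pantene": [9, 8, 8, 7, 6, 9, 10],
    "Rejoice": [2, 3, 4, 5, 6, 4, 2],
    "Clear":   [4, 5, 7, 8, 3, 4, 5],
}

all_values = [x for group in sales.values() for x in group]
grand_mean = mean(all_values)

# Between treatments: how far each brand mean lies from the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in sales.values())
# Within treatments: spread of each observation around its own brand mean.
ss_within = sum((x - mean(g)) ** 2 for g in sales.values() for x in g)

df_between = len(sales) - 1                 # c - 1
df_within = len(all_values) - len(sales)    # c(r - 1)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

f_tab = 3.009  # tabular value quoted in this guide for df = 3 and 24
decision = "Reject Ho" if f_value > f_tab else "Do not reject Ho"
print(f"SSB={ss_between:.2f} SSW={ss_within:.2f} F={f_value:.2f} -> {decision}")
```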

One-Way Anova
e. 𝑯𝒂: There is a significant difference
among the average sales of the four brands of
shampoo.

Therefore, the differences among the average
sales of the four brands of shampoo are
statistically significant.
