BIO1103PE2
PRACTICE EXERCISE 2
CHAPTER 2: Basic Statistics and Chemometrics
Cuevas, Marish Ydrick
Demafelis, Richie Anthea
Diaz, Aalia Mhylls
Significance Testing
● A way to determine whether the difference between two or more values is too large to be
explained by indeterminate error is via significance testing.
● The first step of any significance test is to state the null hypothesis (H0) and the
alternative hypothesis (HA).
o The test either rejects H0 in favor of HA or fails to reject H0. Failing to reject the
null hypothesis is not the same as accepting it: the null hypothesis is retained
because there is insufficient evidence to reject it.
● The second step is to choose the confidence level (CI), which is the probability of
correctly retaining the null hypothesis when it is true. The same choice can also be
expressed as the significance level α, which is the probability that we are
incorrectly rejecting a true null hypothesis.
o Usually we set the confidence level at 95%, but it really depends on the
researcher. Given the confidence level in percent, we can solve for α = 1 − CI/100.
o Say we are testing at the 95% confidence level (α = 0.05). If the null hypothesis is
actually true, the probability of correctly retaining it is 95%, while the probability of
incorrectly rejecting it is 5%.
o Is it always ideal to reject the null hypothesis? Not really, because there are cases
where having no significant difference is much preferred. It really depends on the
researcher that conducts the significance testing.
● Aside from choosing the confidence level, we also need to decide whether the
significance test is one-tailed or two-tailed.
o If we only want to determine whether there is any significant difference between two or
more values, we use a two-tailed test.
o For a two-tailed test, we state the null hypothesis as “…no significant difference…”,
or:
𝐻0: µ = µ0
𝐻𝐴: µ ≠ µ0
o However, if we want to determine whether there was specifically an increase, or
specifically a decrease, between values, we use a one-tailed test.
o For a one-tailed test, we state the null hypothesis as “…no significant increase…” or
“…no significant decrease…”, depending on the condition.
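The relationship between the confidence level, α, and the number of tails can be sketched in a few lines of Python. This is only an illustration: the helper names `alpha_from_ci` and `tail_area` are hypothetical and were introduced for this example.

```python
def alpha_from_ci(ci_percent):
    """Significance level from the confidence level: alpha = 1 - CI/100."""
    return 1 - ci_percent / 100


def tail_area(alpha, tails):
    """Rejection area in each tail: alpha (one-tailed) or alpha/2 (two-tailed)."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return alpha / tails


for ci in (90, 95, 99):
    alpha = alpha_from_ci(ci)
    print(f"CI = {ci}%: alpha = {alpha:.2f}, "
          f"one-tailed area = {tail_area(alpha, 1):.3f}, "
          f"two-tailed area per tail = {tail_area(alpha, 2):.3f}")
```

In a two-tailed test the same α is split between both ends of the distribution, which is why the t-table has separate headings for one-tailed and two-tailed testing.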
Student’s t-test
● Student’s t-test is a way to determine whether there is a significant difference between two means.
We use the t-test when the population variances are unknown and the sample size is small (N < 30).
● A one sample t-test tests the mean of a single group against a known mean/population
mean.
● An independent samples t-test compares the means for two groups.
● A paired samples t-test compares the means of two groups that depend on each other, for
example through cause and effect or through time (before and after).
● We reject or fail to reject the null hypothesis depending on the conditions:
Condition      H0                   HA
tcalc>ttab     Reject H0            Accept HA
tcalc<ttab     Fail to reject H0    Do not accept HA
Working Example
A. A supplier claims that the longevity of its batteries changed after it added new
Li ores sourced from other mining sites. To validate the supplier’s claim, you were tasked to test whether
there is a significant difference by measuring the number of hours the battery lasts before
and after the addition of the Li ore. You apply the parameters 90% CI, p = 0.10.
Group 1: Battery without new Li ore (hrs)    Group 2: Battery with new Li ore (hrs)
No.   Hours    No.   Hours                   No.   Hours    No.   Hours
1     177      6     152                     1     181      6     180
2     185      7     165                     2     189      7     158
3     172      8     194                     3     170      8     189
4     163      9     183                     4     161      9     187
5     181      10    175                     5     182      10    180
3.2. Get the difference between the paired values of the two groups, then sum the differences. We will call this ∑D.
3.3. Square the differences, then add them up. We will call this ∑D².
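No.   Without new Li ore (hrs)   With new Li ore (hrs)   D       D²
1     177                        181                     ____    ____
2     185                        189                     ____    ____
3     172                        170                     ____    ____
4     163                        161                     ____    ____
5     181                        182                     ____    ____
6     152                        180                     ____    ____
7     165                        158                     ____    ____
8     194                        189                     ____    ____
9     183                        187                     ____    ____
10    175                        180                     ____    ____
SUM                                                      ____    ____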
So, we determine that for sample size N = _______, ∑D is ________, while ∑D² is _________.
3.4. Once we determine the sums, we plug in those values in the formula below.
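$$t_{calc} = \frac{\dfrac{\sum D}{N}}{\sqrt{\dfrac{\sum D^2 - \dfrac{\left(\sum D\right)^2}{N}}{N(N-1)}}}$$

$$t_{calc} = \frac{\dfrac{(\qquad)}{(\qquad)}}{\sqrt{\dfrac{(\qquad) - \dfrac{(\qquad)^2}{(\qquad)}}{(\qquad)\,\big((\qquad)-1\big)}}}$$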
𝑡𝑐𝑎𝑙𝑐 = ___________
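Steps 3.2 to 3.4 can be checked with a short Python sketch. The function name `paired_t` is hypothetical, and taking D as "with new Li ore" minus "without new Li ore" is an assumption; reversing the order only flips the sign of tcalc.

```python
import math


def paired_t(before, after):
    """Return (N, sum_D, sum_D2, t_calc) for paired measurements."""
    if len(before) != len(after):
        raise ValueError("paired data must have the same length")
    diffs = [a - b for b, a in zip(before, after)]  # D = after - before (assumed order)
    n = len(diffs)
    sum_d = sum(diffs)                   # step 3.2: sum of the differences
    sum_d2 = sum(d * d for d in diffs)   # step 3.3: sum of the squared differences
    mean_d = sum_d / n
    # step 3.4: denominator of the t_calc formula
    std_err = math.sqrt((sum_d2 - sum_d ** 2 / n) / (n * (n - 1)))
    return n, sum_d, sum_d2, mean_d / std_err


without_li = [177, 185, 172, 163, 181, 152, 165, 194, 183, 175]
with_li = [181, 189, 170, 161, 182, 180, 158, 189, 187, 180]

n, sum_d, sum_d2, t_calc = paired_t(without_li, with_li)
print(f"N = {n}, sum D = {sum_d}, sum D^2 = {sum_d2}, t_calc = {t_calc:.4f}")
```

Running it after filling in the worksheet by hand is a convenient way to confirm the values written in the blanks.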
3.5. Compare tcalc to ttab and determine whether we reject H0. To find ttab, use the t-table.
For this problem, CI = 90%, p = 0.10, two-tailed testing. Aside from these two values,
we also need to determine the degrees of freedom (df). Once we have CI, p, and df, we can
look up the needed ttab.
df= N-1 = ________
Condition      H0                   HA
tcalc>ttab     Reject H0            Accept HA
tcalc<ttab     Fail to reject H0    Do not accept HA
ttab:________
tcalc __________ ttab
Since tcalc __________ ttab, we (reject, fail to reject) the null hypothesis. Therefore, there is (a significant,
no significant) difference after addition of the new Li ores.
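The comparison in step 3.5 can be expressed as a small Python sketch. The function `decide` and the dictionary name are hypothetical; the critical values are the two-tailed entries for df = 9 as found in a standard t-table, and the t values fed in are made-up numbers for illustration only. Comparing the absolute value of tcalc is an assumption that matters when the differences are taken in an order that makes tcalc negative.

```python
# Two-tailed critical t values for df = 9, copied from a standard t-table.
T_TAB_DF9_TWO_TAILED = {0.10: 1.833, 0.05: 2.262, 0.01: 3.250}


def decide(t_calc, t_tab):
    """Apply the decision table: reject H0 only when |t_calc| exceeds t_tab."""
    if abs(t_calc) > t_tab:
        return "reject H0"
    return "fail to reject H0"


t_tab = T_TAB_DF9_TWO_TAILED[0.10]   # 90% CI, p = 0.10, two-tailed, df = 9
for t_calc in (0.5, -2.4):           # made-up t values, for illustration only
    print(f"t_calc = {t_calc}, t_tab = {t_tab}: {decide(t_calc, t_tab)}")
```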
*How about the independent t-test? For the independent t-test, the two groups may have equal
or unequal variances. We first apply an F-test to check whether the variances are equal, then we apply
the t formula that corresponds to equal or unequal variances. EXAM WILL ONLY COVER paired t-test.
$$G_{calc} = \frac{\left| x_{questionable} - \bar{x} \right|}{s}$$
t-TEST TABLE
How to use: Solve for df. Then determine the p value, which is the value in the upper part
of the table. Make sure to note whether the test is one-tailed or two-tailed.
Practice Exercises
1. Find the mean, median, standard deviation, variance, RSD (in terms of %) of both groups.
3. Show if there is a significant difference by presenting your solution for tcalc and its
comparison to ttab.
4. Do we reject the null hypothesis? What would be the p inequality for the groups?
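For item 1, the descriptive statistics can be cross-checked with Python's standard `statistics` module. This is a minimal sketch: the helper name `describe` is hypothetical, and it assumes the sample (N − 1) standard deviation and variance are the ones intended.

```python
import statistics


def describe(values):
    """Mean, median, sample SD, sample variance, and RSD (%) of a data set."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)            # sample SD, divides by N - 1
    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "variance": statistics.variance(values),
        "rsd_percent": 100 * sd / mean,      # RSD = (SD / mean) x 100
    }


groups = {
    "without new Li ore": [177, 185, 172, 163, 181, 152, 165, 194, 183, 175],
    "with new Li ore": [181, 189, 170, 161, 182, 180, 158, 189, 187, 180],
}
for name, hours in groups.items():
    stats = describe(hours)
    summary = ", ".join(f"{key} = {value:.3f}" for key, value in stats.items())
    print(f"{name}: {summary}")
```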
B. On a whim, you and your friends tried significance testing to find out whether looking at the
Dolomite Beach can temporarily improve a person's emotional state. Your group conducted a
randomized survey of 10 participants, who were first asked to rate their emotional state
on a scale of 1-10, where 1 is the least happy and 10 is the happiest. Then, you led them to the
Dolomite Beach, let them sit for 15 minutes, and asked them again for their emotional state
using the same scale. Below are the scores of the participants before and after staying at the
Dolomite Beach.
Participant No.   Before Looking at DB   After Looking at DB
1                 3                      5
2                 5                      6
3                 2                      5
4                 6                      3
5                 8                      8
6                 7                      9
7                 2                      5
8                 5                      3
9                 6                      7
10                5                      3
1. Find the mean, median, standard deviation, variance, RSD (in terms of %) of both groups.
Null = H0 = 0.4122
Alternative = Ha ≠ 0.4122
Null = H0 = 0.3907
Alternative = Ha ≠ 0.3907
3. Show if there is a significant difference by presenting your solution for tcalc and its
comparison to ttab.
               VARIABLE 1      VARIABLE 2
Observation    10              10
df             9
t Stat         -0.727606875
4. Do we reject the null hypothesis? What would be the p inequality for the groups?
We do not reject the null hypothesis. There is no significant difference between before and after
looking at the Dolomite Beach.
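The reported t Stat can be reproduced with the equivalent form tcalc = mean(D) / (s_D / √N). The sketch below makes two assumptions that the exercise does not state: D is taken as before minus after (the order that yields the negative sign in the table), and the comparison uses the two-tailed 95% confidence level, for which a standard t-table gives ttab = 2.262 at df = 9.

```python
import math
import statistics

before = [3, 5, 2, 6, 8, 7, 2, 5, 6, 5]
after = [5, 6, 5, 3, 8, 9, 5, 3, 7, 3]

diffs = [b - a for b, a in zip(before, after)]   # D = before - after (assumed order)
n = len(diffs)
df = n - 1

# Mean difference divided by its standard error, s_D / sqrt(N)
t_calc = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

t_tab = 2.262   # two-tailed, 95% CI, df = 9, from a standard t-table (assumed level)
verdict = "reject H0" if abs(t_calc) > t_tab else "fail to reject H0"
print(f"df = {df}, t_calc = {t_calc:.9f}, t_tab = {t_tab}: {verdict}")
```

This form gives the same result as the ∑D and ∑D² formula, because s_D² = (∑D² − (∑D)²/N) / (N − 1).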
C. Given below is the chloride (Cl-) content of drinking water samples gathered from a well source.
The chloride content was determined via argentometry, that is, by titrating the water sample
with 0.01 M AgNO3.
Sample                            1      2      3      4      5      6      7      8      9
g Cl- per 100 mL water (×10⁻³)    1.05   2.79   2.67   2.99   3.85   3.77   3.56   4.21   4.02
1. Find the mean, median, standard deviation, and RSD (as decimal) of the data set.
2. Is the highest value an outlier? Show by computing Gcalc, then compare it with Gtab.
Gtab = 2.110
Gcalc = |questionable value – mean| / standard deviation
Gcalc = |4.21 – 3.21| / 0.98
Gcalc = 1.02 < Gtab, therefore, NOT AN OUTLIER
3. Is the lowest value an outlier? Show by computing Gcalc, then compare it with Gtab.
Gtab = 2.110
Gcalc = |questionable value – mean| / standard deviation
Gcalc = |1.05 – 3.21| / 0.98
Gcalc = 2.20 > Gtab, therefore, IT IS AN OUTLIER
4. Find the new mean, standard deviation, and RSD (as decimal) of the data set AFTER removing
the outlier.
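The outlier check in items 2 and 3 can be verified with a short Python sketch. The function name `grubbs_g` is hypothetical, and the sample (N − 1) standard deviation is assumed. Because the code does not round the mean and standard deviation first, its Gcalc values may differ in the last digit from a hand calculation with rounded inputs.

```python
import statistics


def grubbs_g(values, questionable):
    """G_calc = |questionable value - mean| / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD, divides by N - 1
    return abs(questionable - mean) / sd


# g Cl- per 100 mL water (x 10^-3), samples 1 to 9
chloride = [1.05, 2.79, 2.67, 2.99, 3.85, 3.77, 3.56, 4.21, 4.02]
G_TAB = 2.110   # critical value given in the exercise for this data set

results = {}
for label, value in (("highest", max(chloride)), ("lowest", min(chloride))):
    g_calc = grubbs_g(chloride, value)
    results[label] = g_calc
    verdict = "outlier" if g_calc > G_TAB else "not an outlier"
    print(f"{label} value {value}: G_calc = {g_calc:.2f} vs G_tab = {G_TAB}: {verdict}")
```

For item 4, the same `statistics` calls can be applied to the list with the flagged value removed.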
D. Below are the amounts of Au that were extracted from electronic scraps through an
electrogravimetric process.
Sample          1    2    3    4    5    6    7    8
Mass Au (μg)    18   29   20   27   22   39   21   23
1. Find the mean, median, standard deviation, and RSD (as decimal) of the data set.
2. Is the highest value an outlier? Show by computing Gcalc, then compare it with Gtab.
Gtab: 2.032
3. Is the lowest value an outlier? Show by computing Gcalc, then compare it with Gtab.
4. Find the new mean, standard deviation, and RSD (as decimal) of the data set AFTER removing
the outlier.
-END-