
ANSWER SHEET IN PRACTICAL RESEARCH 2

MODULE 7

Quarter 2- Module 7
Data Analysis using Statistical Techniques

Name and Section of Student: Kenneth Aquino


Grade: 12-Comte
Name of Instructor: Ms. Princess Clarizz Joy Miranda Saludes

January 2021
Let Us Try! Complete the following problems.

C 1. What is the mean of the following numbers? 10, 39, 71, 39, 76, 38, 25
a. 42 b. 39 c. 42.5 d. 35.5

B 2. Find the median of the set of numbers: 21, 3, 7, 17, 19, 31, 46, 20 and 43.
a. 19 b. 20 c. 3 d. 167

B 3. The following represents the age distribution of students in an elementary class. Find
the mode of the values: 7, 9, 10, 13, 11, 7, 9, 19, 12, 11, 9, 7, 9, 10, 11.

a. 7 b. 9 c. 10 d. 11

A 4. The following numbers represent the ages of people on a bus: 3, 6, 27, 13, 6, 8, 12,
20, 5, 10. Calculate the mean of their ages.

a. 11 b. 6 c. 9 d. 110

C 5. Find the mode from these test results: 17, 19, 18, 17, 18, 19, 11, 17, 16, 19, 15, 15,
15, 17, 13, 11.

a. 15 b. 11 c. 17 d. 19

A 6. Find the median of the set of numbers: 100, 200, 450, 29, 1029, 300 and 2001.

a. 300 b. 29 c. 7 d. 4,080

D 7. These numbers are taken from the number of people that attended a church every
Friday for 7 weeks: 62, 18, 39, 13, 16, 37, 25. Find the mean.

a. 25 b. 210 c. 62 d. 30

B 8. The number of service upgrades sold by each of 30 employees is as follows: 32, 6,
21, 10, 8, 11, 12, 36, 17, 16, 15, 18, 40, 24, 21, 23, 24, 24, 29, 16, 32, 31, 10, 30, 35, 32,
18, 39, 12, 20. What is the median number of service upgrades sold by the 30 employees?

a. 18 b. 21 c. 24 d. 32

C 9. Which of the following measures can be calculated for qualitative data?


a. Mean b. Median c. Mode d. All of the Above

B 10. What is the term used to describe the distribution of a data set with one mode?
a. Multimodal b. Unimodal c. Nonmodal d. Bimodal
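
The answers for the mean, median, and mode items above can be checked quickly with a short script. This is only a minimal sketch using Python's standard `statistics` module; the variable names are made up for illustration and the data are copied from items 2, 3, and 4.

```python
from statistics import mean, median, mode

# Data copied from items 2, 3, and 4 of "Let Us Try!" (variable names are illustrative).
item2_numbers = [21, 3, 7, 17, 19, 31, 46, 20, 43]
item3_ages = [7, 9, 10, 13, 11, 7, 9, 19, 12, 11, 9, 7, 9, 10, 11]
item4_bus_ages = [3, 6, 27, 13, 6, 8, 12, 20, 5, 10]

print("Item 2 median:", median(item2_numbers))  # middle value of the sorted list
print("Item 3 mode:", mode(item3_ages))         # most frequent value
print("Item 4 mean:", mean(item4_bus_ages))     # sum divided by count
```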

Let Us Practice Task A: Write the letter of the correct answer to the following questions.

B 1. The coefficient of correlation is


a. is equal to the proportion of the variation in the Y variable that is due to variations in
the X variable.

b. a measure of the strength and direction of the linear relationship between two variables.

c. equal to the size of the change in the Y variable that is caused by a change in the X
variable.

d. All of the above are correct.

C 2. A scatter diagram is considered for measuring

a. Linear relationship between two variables b. Curvilinear relationship between two variables

c. Both a and b d. None of the above

C 3. From the following data

x 2 3 5 4 7
y 4 6 7 8 10

the coefficient of correlation was found to be 0.93. What is the correlation between u and v as given below?

u -3 -2 0 -1 2
v -4 -2 -1 0 2

a. -0.93 b. 0.57 c. 0.93 d. -0.57

D 4. The coefficient of determination


a. is maximized by ordinary least squares.

b. has a value between zero and one.

c. will generally increase if additional independent variables are added to a regression
analysis.

d. All of the above are correct.

A 5. The regression line of y is derived by


a. The minimization of vertical distances in the scatter diagram

b. The minimization of horizontal distances in the scatter diagram

c. All of the above

d. None of the above

Task B.

Here is the data gathered by the Purok A City High School administration regarding the number
of Grade 7 parents who opted to receive printed copies of the learning modules. Fill out
the boxes for total and percentage. Then write a brief interpretation of the table.

SECTION | TOTAL NUMBER OF PARENTS | NUMBER OF PARENTS WHO OPTED TO RECEIVE PRINTED COPIES OF LEARNING MODULES | PERCENTAGE (%)
7-A | 30 | 6 | 20%
7-B | 25 | 0 | 0%
7-C | 32 | 16 | 50%
7-D | 30 | 19 | 63.33%
Total | 117 | 41 | ≈35.04%

Interpretation

The data show that only 6 out of 30 parents in Grade 7-A opted to receive printed
copies of the learning modules, which is 20% of the parents in that section. On the
other hand, no parents in Grade 7-B opted to receive printed copies. In Grade 7-C,
16 out of 32 parents opted to receive printed copies, which is 50% of the parents in
that section. In Grade 7-D, 19 out of 30 parents opted to receive printed copies,
which is 63.33% of the parents in that section. Overall, the total number of Grade 7
parents in Purok A City High School is 117, and the total number of parents who
opted to receive printed copies of the learning modules is 41. This is approximately
35.04% of all Grade 7 parents, which indicates that only about one third of the
parents chose printed copies of the learning modules.

Task D. Here’s the data gathered from the survey on Study Habits conducted by the
Grade 12 students to the 150 Grade 7 students of Purok A City High School.

A review of Study Habits

Statement | Strongly Agree (5) | Agree (4) | Undecided (3) | Disagree (2) | Strongly Disagree (1) | Mean | Standard Deviation | Verbal Interpretation
The desk where I study is always clear from distraction | 90 | 30 | 10 | 5 | 15 | 4.17 | 1.29 | Frequently observed
I use earplugs to minimize distracting sounds | 10 | 50 | 30 | 20 | 40 | 2.80 | 1.33 | Sometimes observed
I study facing a wall | 15 | 35 | 30 | 20 | 50 | 2.63 | 1.40 | Seldom observed

Mean formula for Likert scale

x̅ = ∑(x·w)/n

where x is the number of respondents who selected a given level of agreement,

w is the value assigned to that level of agreement, and

n is the total number of respondents.

Level of agreement | Corresponding value
Strongly Agree | 5
Agree | 4
Undecided | 3
Disagree | 2
Strongly Disagree | 1

1. The desk where I study is always clear from distraction

x̅ = ∑(x·w)/n

x(w) | x·w
90(5) | 450
30(4) | 120
10(3) | 30
5(2) | 10
15(1) | 15
∑(x·w) | 625

x̅ = 625/150
x̅ = 4.167 ≈ 4.17

Standard Deviation

SD = √(x̅2 − (x̅)^2), where x̅2 = ∑(x·w^2)/n is the mean of the squared values

x | w | w^2 | x·w^2
90 | 5 | 25 | 90(25) = 2250
30 | 4 | 16 | 30(16) = 480
10 | 3 | 9 | 10(9) = 90
5 | 2 | 4 | 5(4) = 20
15 | 1 | 1 | 15(1) = 15

∑(x·w^2) = 2855

x̅2 = ∑(x·w^2)/n

x̅2 = 2855/150

x̅2 = 19.033

With x̅ = 4.1667 and x̅2 = 19.033:

SD = √(19.033 − (4.1667)^2)

SD = √(19.033 − 17.361)

SD = √1.672

SD = 1.2931 ≈ 1.29

2. I use earplugs to minimize distracting sounds

Level of agreement | x | w | w^2 | x·w | x·w^2
Strongly Agree | 10 | 5 | 25 | 50 | 250
Agree | 50 | 4 | 16 | 200 | 800
Undecided | 30 | 3 | 9 | 90 | 270
Disagree | 20 | 2 | 4 | 40 | 80
Strongly Disagree | 40 | 1 | 1 | 40 | 40

∑(x·w) = 420

∑(x·w^2) = 1440

x̅ = ∑(x·w)/n

x̅ = 420/150

x̅ = 2.8

x̅2 = ∑(x·w^2)/n

x̅2 = 1440/150

x̅2 = 9.6

SD = √(x̅2 − (x̅)^2)

SD = √(9.6 − (2.8)^2)

SD = √(9.6 − 7.84) = √1.76

SD = 1.3266 ≈ 1.33

3. I study facing a wall

Level of agreement | x | w | w^2 | x·w | x·w^2
Strongly Agree | 15 | 5 | 25 | 75 | 375
Agree | 35 | 4 | 16 | 140 | 560
Undecided | 30 | 3 | 9 | 90 | 270
Disagree | 20 | 2 | 4 | 40 | 80
Strongly Disagree | 50 | 1 | 1 | 50 | 50

∑(x·w) = 395

∑(x·w^2) = 1335

x̅ = ∑(x·w)/n

x̅ = 395/150

x̅ = 2.633 ≈ 2.63

x̅2 = ∑(x·w^2)/n

x̅2 = 1335/150

x̅2 = 8.9

SD = √(x̅2 − (x̅)^2)

SD = √(8.9 − (2.6333)^2)

SD = √(8.9 − 6.934) = √1.966

SD = 1.402 ≈ 1.40
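
The three Likert computations above follow the same pattern, so they can be sketched as one small function. This is a minimal illustration, assuming the population form of the standard deviation (the square root of the mean of the squared values minus the square of the mean); the function name `likert_mean_sd` and the `survey` dictionary are hypothetical names introduced only for this example.

```python
from math import sqrt

# Values attached to each level of agreement, from Strongly Agree (5) down to Strongly Disagree (1).
WEIGHTS = [5, 4, 3, 2, 1]

def likert_mean_sd(counts, weights=WEIGHTS):
    """Return (mean, population SD) of one Likert item from its frequency counts."""
    n = sum(counts)                                             # total respondents
    mean = sum(x * w for x, w in zip(counts, weights)) / n      # sum(x*w) / n
    mean_of_squares = sum(x * w ** 2 for x, w in zip(counts, weights)) / n
    return mean, sqrt(mean_of_squares - mean ** 2)

# Frequency counts copied from the Study Habits table (150 respondents per item).
survey = {
    "The desk where I study is always clear from distraction": [90, 30, 10, 5, 15],
    "I use earplugs to minimize distracting sounds": [10, 50, 30, 20, 40],
    "I study facing a wall": [15, 35, 30, 20, 50],
}

for statement, counts in survey.items():
    item_mean, item_sd = likert_mean_sd(counts)
    print(f"{statement}: mean = {item_mean:.2f}, SD = {item_sd:.2f}")
```

Dividing by n − 1 instead of n (the sample form) would change each standard deviation only slightly with 150 respondents.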


Legend:

Scale | Range | Verbal Interpretation
5 | 4.6-5.4 | Always
4 | 3.7-4.5 | Frequently
3 | 2.8-3.6 | Sometimes
2 | 1.9-2.7 | Seldom
1 | 1-1.8 | Rarely

Rating scale interval: (highest scale value − lowest scale value)/highest scale value

(5-1)/(5) = 4/5 = 0.8

This means that each range spans 0.8, so we add 0.8 to the lower limit of each range to determine its upper limit.

Let Us Practice More

Task A. Solve the following problems completely as directed:

1. The values of x and their corresponding values of y are shown in the table below:

x 0 1 2 3 4

y 2 3 5 4 6

a. Find the least square regression line y = a x + b.

b. Estimate the value of y when x = 10.


a. Find the least square regression line y = a x + b.

x y xy x^2 y^2

0 2 0 0 4

1 3 3 1 9

2 5 10 4 25

3 4 12 9 16

4 6 24 16 36

∑x=10 ∑y=20 ∑xy=49 ∑x ^2=30 ∑y^2=90

Equation of the line: y = ax + b

Computing for a and b:

$a = \dfrac{N\sum xy - \sum x \sum y}{N\sum x^2 - (\sum x)^2}$

$b = \bar{y} - a\bar{x}$

Given: ∑x=10, ∑y=20, ∑xy=49, ∑x^2=30, ∑y^2=90, N=5

a = (5(49) − 10(20))/(5(30) − (10)^2)

a = (245 − 200)/(150 − 100)

a = 45/50 = 9/10

a = 0.9
ȳ = ∑y/N

ȳ = 20/5

ȳ = 4

x̄ = ∑x/N

x̄ = 10/5

x̄ = 2

b = ȳ − a·x̄

b = 4 − 0.9(2)

b = 4 − 1.8

b = 2.2

y = 0.9x + 2.2

b. Estimate the value of y when x = 10.

y = 0.9x + 2.2

y=0.9(10)+2.2
y= 9+2.2
y= 11.2 or ≈ 11
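
The slope and intercept computed above can be sketched in a few lines of code. This is a minimal example, assuming the same summation formulas used in the solution; `least_squares` is a hypothetical helper name, not part of any library.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    b = sum_y / n - a * sum_x / n                                  # intercept = y-bar - a * x-bar
    return a, b

# Data from problem 1.
xs = [0, 1, 2, 3, 4]
ys = [2, 3, 5, 4, 6]

a, b = least_squares(xs, ys)
print(f"y = {a:.1f}x + {b:.1f}")
print(f"Estimate at x = 10: {a * 10 + b:.1f}")
```

The same function can be reused for the other regression problems by changing only the two data lists.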

2.Using the following summary data, perform a one-way analysis of variance using
α=.01.

n | mean | SD
30 | 50.26 | 10.45
30 | 45.32 | 12.76
30 | 53.67 | 11.47

Solution:

SOURCE | SUM OF SQUARES | DEGREES OF FREEDOM | VARIANCE ESTIMATE | F RATIO
Between | SSB | K−1 | MSB = SSB/(K−1) | MSB/MSW
Within | SSW | N−K | MSW = SSW/(N−K) |
Total | SST = SSB + SSW | N−1 | |

Computational Procedure:

1. Define the Null and Alternative Hypothesis:

Ho: mean of Group 1 = mean of Group 2 = mean of Group 3

Ha: At least two of the means of Group 1, Group 2, and Group 3 are not equal

2. State Alpha

α=.01

3. Degrees of freedom: df1 = k−1 = 3−1 = 2 and df2 = N−k = 90−3 = 87

4. State Decision Rule

The F test in a one-way analysis of variance is an upper-tailed test: reject Ho if F > F(α, df1, df2)



5. Calculate Test Statistic

$SS_B = \sum_{i=1}^{k} n_i(\bar{y}_i - \bar{y})^2$

ȳ = ∑x̄/k = (50.26 + 45.32 + 53.67)/3 = 49.75

SSB = 30(50.26−49.75)^2 + 30(45.32−49.75)^2 + 30(53.67−49.75)^2

SSB = 1057.542

$SS_W = \sum_{i=1}^{k} (n_i - 1)s_i^2$

SD | SD^2 (variance) | (n−1)(SD)^2
10.45 | 109.2025 | 29(109.2025) = 3166.8725
12.76 | 162.8176 | 29(162.8176) = 4721.7104
11.47 | 131.5609 | 29(131.5609) = 3815.2661

SSW = 3166.8725 + 4721.7104 + 3815.2661 = 11703.849

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F Ratio
Between | 1057.542 | k−1 = 3−1 = 2 | MSB = 1057.542/2 = 528.771 | MSB/MSW = 528.771/134.5270 = 3.9306
Within | 11703.849 | N−k = 90−3 = 87 | MSW = 11703.849/87 = 134.5270 |
Total | 12761.391 | 89 | |

F(α, df1, df2) = F(0.01, 2, 87) ≈ 4.86

3.9306 < 4.86

Conclusion: Since the computed F of 3.93 is less than the critical value of 4.86, it does not fall in the
rejection region, so we fail to reject the null hypothesis.
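
The one-way analysis of variance above starts from summary data only (n, mean, and SD per group), and the steps can be sketched as follows. This is a minimal illustration rather than a full statistics routine: `anova_from_summary` and `f_critical_df1_2` are hypothetical helper names, and the critical value uses a closed form that holds only when the numerator degrees of freedom equal 2, where the upper-tail probability of F is (1 + 2F/df2)^(−df2/2).

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA from group sizes, means, and standard deviations."""
    k, total_n = len(ns), sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df1, df2 = k - 1, total_n - k
    f_ratio = (ss_between / df1) / (ss_within / df2)   # MSB / MSW
    return ss_between, ss_within, f_ratio, df1, df2

def f_critical_df1_2(alpha, df2):
    """Upper-tail critical F value, valid only for df1 = 2 (closed form)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

# Summary data from problem 2.
ssb, ssw, f_ratio, df1, df2 = anova_from_summary(
    ns=[30, 30, 30],
    means=[50.26, 45.32, 53.67],
    sds=[10.45, 12.76, 11.47],
)
f_crit = f_critical_df1_2(alpha=0.01, df2=df2)

print(f"SSB = {ssb:.3f}, SSW = {ssw:.3f}, F = {f_ratio:.4f}")
print(f"F critical (alpha = 0.01, df = {df1}, {df2}) = {f_crit:.2f}")
print("Reject Ho" if f_ratio > f_crit else "Fail to reject Ho")
```

For other numerator degrees of freedom the critical value would have to come from an F table or a statistics library instead of this closed form.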

3. Sleep researchers decide to test the impact of REM sleep deprivation on a
computerized assembly line task. Subjects are required to participate in two nights of
testing. On the nights of testing EEG, EMG, EOG measures are taken. On each night of
testing the subject is allowed a total of four hours of sleep. However, on one of the nights,
the subject is awakened immediately upon achieving REM sleep. On the alternate night,
subjects are randomly awakened at various times throughout the 4-hour total sleep
session. Testing conditions are counterbalanced so that half of the subject experience
REM deprivation on the first night of testing and half experience REM deprivation on the
second night of testing. Each subject after the sleep session is required to complete a
computerized assembly line task. The task involves five rows of widgets slowly passing
across the computer screen. Randomly placed on a one/five ratio are widgets missing a
component that must be "fixed" by the subject. Number of missed widgets is recorded.
Compute the appropriate t-test for the data provided

REM DEPRIVED | CONTROL CONDITION

26 20

15 4

8 9

44 36

26 20

13 3

38 25

24 10

17 6

29 14

Computational Procedure:

Type of t test: 2 sample t-test

x̅1= ∑x1/n

x̅1=(26+15+8+44+26+13+38+24+17+29)/10

x̅1= 24 is the mean for REMDEPRIVED

s1= √∑(x1- x̅1)^2/(10-1)


∑(x1-x̅1)^2=(26-24)^2+(15-24)^2+(8-24)^2+(44-24)^2+(26-24)^2+(13-24)^2+(38-24)^2+(24-
24)^2+(17-24)^2+(29-24)^2

∑(x1-x̅1)^2 = 1136

s1= √(1136/9)

s1= 11.23487 or 11.23 is the standard deviation for REMDEPRIVED

x̅2= ∑x2/n

x̅2 = (20+4+9+36+20+3+25+10+6+14)/10

x̅2 = 14.7 mean for CONTROL CONDITION

s2= √∑(x2- x̅2)^2/(10-1)

∑(x2-x̅2)^2=(20-14.7)^2+(4-14.7)^2+(9-14.7)^2+(36-14.7)^2+(20-14.7)^2+(3-
14.7)^2+(25-14.7)^2+(10-14.7)^2+(6-14.7)^2+(14-14.7)^2

∑(x2-x̅2)^2= 998.1

s2= √(998.1/9)

s2=10.53091 or 10.53 is the standard deviation for Control Condition

Null hypothesis: There is no significant difference between the two nights of testing

Alternative hypothesis: There is a significant difference between the two nights of testing

α=0.05

α/2 = 0.025

Degrees of freedom for the 2 sample t test:

$df = \dfrac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$

This equation gives the degrees of freedom for a 2 sample t-test with unequal variances.
Substituting the known values, we get df = 17.92593309, or 18.

For convenience, the simpler pooled formula for the degrees of freedom in a
2 sample t test can also be used:

df=n1+n2-2

df=10+10-2

df= 18

After rounding, both formulas arrive at the same answer.
Therefore, our degrees of freedom is 18.

t(α/2, df)

t(0.025, 18) = 2.101 (Note: please refer to the t-table)

Therefore, our t critical value is 2.101

For computed t value

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

t = (24 − 14.7)/√(126.222222222222/10 + 110.9/10)

t = 1.909839 or 1.9098

1.909839 < 2.101

Conclusion: Since 1.909839 is less than the critical value of 2.101, it is not in the rejection region,
so we fail to reject the null hypothesis. Therefore, there is no significant difference between
the two nights of testing.

a. What is your computed answer?

My computed t value is t = 1.909839, or 1.9098. I used the formula for the difference of means,
also known as the formula for computing the t value, as shown above in my solution.

b. What would be the null hypothesis in this study?

Null hypothesis: There is no significant difference between two nights of testing

c. What would be the alternate hypothesis?

Alternative hypothesis: There is a significant difference between two nights of testing

d. What probability level did you choose and why?

I chose a significance level (alpha) of 5%, which is equal to 0.05 and corresponds to a
confidence level of 95%. This means there is a 5% probability of rejecting the null
hypothesis when it is actually true.

e. What is your tcrit?

Based on my calculation using α/2 = 0.025 and df = 18, my tcrit from the t-table is 2.101,
as shown above in my solution

f. Is there a significant difference between the two testing conditions?

No. Since 1.909839 (t computed) is less than the critical value of 2.101, it is not in the rejection
region, so we fail to reject the null hypothesis. Therefore, there is no significant difference
between the two nights of testing

g. Interpret your answer

Based on the given samples, the REM-deprived condition has a mean of 24 and a standard
deviation of 11.23, while the control condition has a mean of 14.7 and a standard deviation
of 10.53. A two-tailed 2 sample t-test at α = 0.05 was used to make an inference about the
data, which means there is a 5% probability of rejecting a null hypothesis that is actually
true. With df = 18, the t critical value is 2.101, so the null hypothesis is rejected only if
the computed t value lies beyond ±2.101. Based on the calculation, the computed t value is
1.909839, which is not located in the rejection region. Therefore, under this test there is
no significant difference between the REM-deprived condition and the control condition.
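
Because each subject in this study is tested under both conditions, a paired-samples t-test is a natural cross-check on the 2 sample result above. The sketch below is a minimal illustration, assuming that each row of the data table holds the two scores of the same subject; `unpaired_t` and `paired_t` are hypothetical helper names, and only Python's standard library is used.

```python
from math import sqrt
from statistics import mean, stdev, variance

def unpaired_t(a, b):
    """t statistic for two independent samples (unequal-variance form)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

def paired_t(a, b):
    """Return (t, df) for a paired-samples t-test on matched scores."""
    diffs = [x - y for x, y in zip(a, b)]           # per-subject differences
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1

# Missed widgets per subject, in the row order of the table.
rem_deprived = [26, 15, 8, 44, 26, 13, 38, 24, 17, 29]
control = [20, 4, 9, 36, 20, 3, 25, 10, 6, 14]

t_unpaired = unpaired_t(rem_deprived, control)
t_paired, df_paired = paired_t(rem_deprived, control)
print(f"2 sample t = {t_unpaired:.4f}")
print(f"paired t = {t_paired:.4f} with df = {df_paired}")
```

Under that pairing assumption the paired statistic is about 6.18 with df = 9, which exceeds the two-tailed t-table value of 2.262 at α = 0.05, while the unpaired statistic is about 1.91. The two tests can therefore lead to different conclusions for this data, so the choice of test matters here.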

Let Us Remember

Task A: CROSSWORD PUZZLE. Read the clues and put the answers into the puzzle.
No Erasure.

[Crossword puzzle grid filled in with the answers listed below]

Answers:

1. CORRELATION

2. LINEAR REGRESSION

3. PREDICTIVE VARIABLES

4. POPULATION MEAN

5. ONE WAY ANOVA

6. REGRESSION

7. SCATTER DIAGRAM

8. DEGREES OF FREEDOM

9. SPEARMAN RHO

10.REGRESSION EQUATION

11.PEARSON R

12.T TEST

13.REGRESSION LINE

14. LINE OF BEST FIT

15.CRITERION

Task B. Here’s the data about the Math Pretest and Posttest scores of ten (10) Grade 12
students of Purok A City High School. Is there a significant relationship between the
pretest and posttest scores in Math?

Student Pre- Test Post Test


1 49 45

2 32 37

3 34 39

4 45 47

5 41 40

6 20 40

7 27 39

8 32 45

9 37 41

10 31 48

1. Compute the value Pearson’s r:

Student x y x^2 y^2 Xy

1 49 45 2401 2025 2205

2 32 37 1024 1369 1184

3 34 39 1156 1521 1326

4 45 47 2025 2209 2115

5 41 40 1681 1600 1640

6 20 40 400 1600 800


7 27 39 729 1521 1053

8 32 45 1024 2025 1440

9 37 41 1369 1681 1517

10 31 48 961 2304 1488

∑x= 348 ∑y=421 ∑x^2=12770 ∑y^2=17855 ∑xy=14768

$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$

r = [10(14768) − 348(421)]/√([10(12770) − (348)^2][10(17855) − (421)^2])

r = 1172/√((6596)(1309))

r = 0.398857

r ≈ 0.399 ≈ 0.4

2. Interpretation:

From the table, we can see that the scores of most students increase from the pretest
to the posttest. The type of correlation shown in the table is a direct linear
correlation, which means that an increase in one variable corresponds to an increase in
the second variable, as the pretest and posttest scores show. However, after computing
the correlation coefficient, we obtain r = 0.399, or about 0.4. According to Garret (1969),
Pearson r can be interpreted as a high or low relationship, and an r from ±0.21 to ±0.40
denotes a low or slight relationship. Therefore, because r = 0.399, or about 0.4, we can
say that there is a low or slight relationship between the pretest and the posttest.
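
The Pearson r computed above can be double-checked with a short function built from the same sums. This is a minimal sketch; `pearson_r` is a hypothetical helper name and the lists are copied from the pretest and posttest table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the raw-score (summation) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

pretest = [49, 32, 34, 45, 41, 20, 27, 32, 37, 31]
posttest = [45, 37, 39, 47, 40, 40, 39, 45, 41, 48]

print(f"r = {pearson_r(pretest, posttest):.4f}")
```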

3. What linear equation best predicts the posttest given the pretest in Math?
___________

Equation of the line: y = ax + b

Computing for a and b:

$a = \dfrac{N\sum xy - \sum x \sum y}{N\sum x^2 - (\sum x)^2}$

$b = \bar{y} - a\bar{x}$

a = (10(14768) − 348(421))/(10(12770) − (348)^2)

a = 1172/6596

a = 0.177683445

Notice that the value was not rounded off. To find the linear equation that best
describes our pre-test and post-test, we keep as many decimal places as possible when we
plug in our values.

ȳ = ∑y/N

ȳ = 421/10

ȳ = 42.1

x̄ = ∑x/N

x̄ = 348/10

x̄ = 34.8

b = ȳ − a·x̄

b = 42.1 − 0.177683445(34.8)

b = 35.91661629

y = 0.17768344x + 35.91661629 is the linear equation that best predicts the posttest given the
pretest in Math. It was not rounded off because we want the closest estimate when we
plug in our x.
4. If a student made a pretest score of 43 in Math, what grade would you expect
the posttest score the student will obtain?

y=0.17768344x+35.91661629

f(x)= 0.17768344x+35.91661629

f(43)= 0.17768344(43)+35.91661629

f(43)= 43.55700421

f(43)=44

If a student made a pretest score of 43 in Math the expected estimated post test score
according the calculation is 44

Show the line of best fit and its interpretation.

[Figure: scatter plot of the pretest (x-axis) and posttest (y-axis) scores with the line of best fit y = 0.17768344x + 35.91661629 and y-intercept (0, 35.91661629)]

x intercept:

Let y=0

y = 0.17768344x + 35.91661629

0 = 0.17768344x + 35.91661629

-35.91661629 = 0.17768344x

x ≈ -202.138, located on the negative x-axis of the Cartesian plane

y intercept:

let x=0

y= 0.17768344x+35.91661629

y=0.17768344(0)+35.91661629

y = 35.91661629

coordinate :(0, 35.91661629)

Interpretation:

The graph shows the line of best fit rising steadily as the values of x increase.
We can also see the scattered points, which plot the pretest scores on the x-axis against
the posttest scores on the y-axis. Some of the points are close to the line and some are
not. In the data, most of the students scored higher on the posttest than on the pretest.
However, the Pearson r from the calculation is approximately 0.4, which indicates only a
slight relationship. Therefore, even though students tend to score higher on their posttest
than on their pretest, the pretest and posttest scores have only a slight relationship.
Let Us Assess

Task A. Solve the following problems completely as directed:

1. The data below shows the scores obtained by the top ten junior high school students
at a certain private high school on an entrance test for Senior High School (SHS) and a
mathematical ability aptitude test for STEM strand

STUDENT | SHS ENTRANCE TEST (x) | MATHEMATICAL ABILITY APTITUDE TEST (y)

1 55 52

2 32 26

3 68 56

4 62 50

5 40 38

6 62 60

7 40 50

8 30 18

9 48 44

10 68 56
a. Plot a scatter diagram for the data

[Figure: scatter diagram of SHS Entrance Exam scores (x-axis) against Mathematical Ability Aptitude Test scores (y-axis)]

b. Calculate the Pearson r

(x) (y) x^2 y^2 xy

55 52 3025 2704 2860

32 26 1024 676 832

68 56 4624 3136 3808

62 50 3844 2500 3100

40 38 1600 1444 1520

62 60 3844 3600 3720


40 50 1600 2500 2000

30 18 900 324 540

48 44 2304 1936 2112

68 56 4624 3136 3808

∑x=505 ∑y=450 ∑x^2= 27389 ∑y^2= 21956 ∑xy=24300

$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$

r = [10(24300) − (505)(450)]/√([10(27389) − (505)^2][10(21956) − (450)^2])

r = 15750/√((18865)(17060))

r = 0.877936

r ≈ 0.88

c. Convert to ranks and calculate the Spearman Rank-Order Correlation Coefficient

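(x) | (y) | Rank of x | Rank of y | D | D^2
55 | 52 | 5 | 4 | 1 | 1
32 | 26 | 9 | 9 | 0 | 0
68 | 56 | 1.5 | 2.5 | -1 | 1
62 | 50 | 3.5 | 5.5 | -2 | 4
40 | 38 | 7.5 | 8 | -0.5 | 0.25
62 | 60 | 3.5 | 1 | 2.5 | 6.25
40 | 50 | 7.5 | 5.5 | 2 | 4
30 | 18 | 10 | 10 | 0 | 0
48 | 44 | 6 | 7 | -1 | 1
68 | 56 | 1.5 | 2.5 | -1 | 1

∑x=505 ∑y=450 ∑D^2= 18.5

The two scores of 40 in x are tied for ranks 7 and 8, so each receives the average rank of 7.5.

$r_s = 1 - \dfrac{6\sum D^2}{n(n^2 - 1)}$

rs = 1 − 6(18.5)/(10(100−1))

rs = 1 − 111/990

rs = 1 − 37/330

rs = 0.887879

rs ≈ 0.89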
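
Converting scores to ranks is where ties are easy to mishandle, so the ranking step is worth sketching in code. This is a minimal example, assuming that rank 1 goes to the highest score and that tied scores share the average of the ranks they occupy; `average_ranks` and `spearman_rs` are hypothetical helper names.

```python
def average_ranks(values):
    """Rank values from highest (rank 1) to lowest, averaging the ranks of ties."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first_position = ordered.index(v) + 1      # 1-based position of the first tie
        tie_count = ordered.count(v)
        rank_of[v] = first_position + (tie_count - 1) / 2
    return [rank_of[v] for v in values]

def spearman_rs(xs, ys):
    """Return (rs, sum of D^2) using rs = 1 - 6*sum(D^2) / (n*(n^2 - 1))."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    n = len(xs)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2

entrance = [55, 32, 68, 62, 40, 62, 40, 30, 48, 68]
aptitude = [52, 26, 56, 50, 38, 60, 50, 18, 44, 56]

rs, sum_d2 = spearman_rs(entrance, aptitude)
print("Ranks of x:", average_ranks(entrance))
print(f"Sum of D^2 = {sum_d2}, rs = {rs:.4f}")
```

With several ties present, this shortcut formula is only an approximation of the rank correlation, although it is the formula used throughout this module.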

2. The ranks of the height and weight of seven male senior high school students are
given below. Calculate the correlation coefficient.
STUDENT

A 7 3.5

B 6 1

C 5 3.5

D 4 5.5

E 3 5.5

F 2 7

G 1 2

Calculation for Spearman Rho Correlation Coefficient

x y D D^2

7 3.5 3.5 12.25


6 1 5 25

5 3.5 1.5 2.25

4 5.5 -1.5 2.25

3 5.5 -2.5 6.25

2 7 -5 25

1 2 -1 1

∑D^2= 74

$r_s = 1 - \dfrac{6\sum D^2}{n(n^2 - 1)}$

rs = 1 − 6(74)/(7(49−1))

rs = 1 − 444/336

rs = 1 − 37/28

rs = -0.3214285714

rs ≈ -0.32 is the correlation coefficient of height and weight

3. The sales of a company (in million dollars) for each year are shown in the
table below.

X(years) 2005 2006 2007 2008 2009

Y(sales) 12 19 29 37 45

a. Find the least square regression line y = a x + b.


x y X^2 Y^2 xy

2005 12 4020025 144 24060

2006 19 4024036 361 38114

2007 29 4028049 841 58203


2008 37 4032064 1369 74296

2009 45 4036081 2025 90405

∑x=10035 ∑y=142 ∑x^2=20140255 ∑y^2=4740 ∑xy=285078

$a = \dfrac{N\sum xy - \sum x \sum y}{N\sum x^2 - (\sum x)^2}$

$b = \bar{y} - a\bar{x}$

a = [5(285078) − 10035(142)]/[5(20140255) − (10035)^2]

a = (1425390 − 1424970)/(100701275 − 100701225)

a = 420/50

a = 8.4 is the slope

ȳ = ∑y/N

ȳ = 142/5

ȳ = 28.4

x̄ = ∑x/N

x̄ = 10035/5

x̄ = 2007

b = ȳ − a·x̄

b = 28.4 − 8.4(2007)

b = -16830.4 is our y intercept

Then the least squares regression line in the form y = ax + b is

y = 8.4x − 16830.4

b. Use the least squares regression line as a model to estimate the sales of the
company in 2012.

y= 8.4x-16830.4

f(x)= 8.4x-16830.4

f(2012)=8.4(2012)- 16830.4

f(2012)= 70.4

Therefore, the estimated sales in 2012 is 70.4 million dollars.

4. A clinical psychologist has run a between-subjects experiment comparing two
treatments for depression (cognitive-behavioral therapy (CBT) and client-centered
therapy (CCT)) against a control condition. Subjects were randomly assigned to the
experimental condition. After 12 weeks, the subjects' depression scores were
measured using the CESD depression scale. The data are summarized as follows:

Group | n | Mean | SD
Control | 40 | 21.4 | 4.5
CBT | 40 | 16.9 | 5.5
CCT | 40 | 19.1 | 5.8


Solution:

SOURCE | SUM OF SQUARES | DEGREES OF FREEDOM | VARIANCE ESTIMATE | F RATIO
Between | SSB | K−1 | MSB = SSB/(K−1) | MSB/MSW
Within | SSW | N−K | MSW = SSW/(N−K) |
Total | SST = SSB + SSW | N−1 | |

Computational Procedure:

1. Define the Null and Alternative Hypothesis:

Ho: mean of Control = mean of CBT = mean of CCT

Ha: At least two of the means of Control, CBT, and CCT are not equal

2. State Alpha

α=.01

3. Degrees of freedom: df1 = k−1 = 3−1 = 2 and df2 = N−k = 120−3 = 117

4. State Decision Rule

The F test in a one-way analysis of variance is an upper-tailed test: reject Ho if F > F(α, df1, df2)

5. Calculate Test Statistic

$SS_B = \sum_{i=1}^{k} n_i(\bar{y}_i - \bar{y})^2$

ȳ = ∑x̄/k = (21.4 + 16.9 + 19.1)/3 = 19.13333333 or 19.13

SSB = 40(21.4−19.13)^2 + 40(16.9−19.13)^2 + 40(19.1−19.13)^2

SSB = 405.068

$SS_W = \sum_{i=1}^{k} (n_i - 1)s_i^2$

SD | SD^2 (variance) | (n−1)(SD)^2
4.5 | 20.25 | 39(20.25) = 789.75
5.5 | 30.25 | 39(30.25) = 1179.75
5.8 | 33.64 | 39(33.64) = 1311.96

SSW = 789.75 + 1179.75 + 1311.96 = 3281.46

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F Ratio
Between | 405.068 | k−1 = 3−1 = 2 | MSB = 405.068/2 = 202.534 | MSB/MSW = 202.534/28.0467 = 7.2213, or 7.22
Within | 3281.46 | N−k = 120−3 = 117 | MSW = 3281.46/117 = 28.0467 |
Total | 3686.528 | 119 | |

F crit

F(α, df1, df2) = F(0.01, 2, 117) ≈ 4.79

7.22 > 4.79

Conclusion: Because our computed F ratio of 7.22 is greater than the critical F of 4.79,
it lies in the rejection region, which covers values of 4.79 and above.
Therefore, the null hypothesis is rejected.

5. A research study was conducted to examine the differences between older and
younger adults on perceived life satisfaction. A pilot study was conducted to
examine this hypothesis. Ten older adults (over the age of 70) and ten younger
adults (between 20 and 30) were give a life satisfaction test (known to have high
reliability and validity). Scores on the measure range from 0 to 60 with high scores
indicative of high life satisfaction, low scores indicative of low life satisfaction. The
data are presented below. Compute the appropriate t-test.

OLDER YOUNGER
45 34

38 22

52 15

48 27

25 37

39 41

51 24

46 19

55 26

46 36

Computational Procedure:

Type of t test: 2 sample t-test

x̅1= ∑x1/n

x̅1= (45+38+52+48+25+39+51+46+55+46)/10

x̅1= 44.5 is the mean for OLDER

s1= √∑(x1- x̅1)^2/(10-1)

∑(x1-x̅1)^2=(45-44.5)^2+(38-44.5)^2+(52-44.5)^2+(48-44.5)^2+(25-44.5)^2+(39-44.5)^2+(51-
44.5)^2+(46-44.5)^2+(55-44.5)^2+(46-44.5)^2

∑(x1-x̅1)^2 = 678.5

s1= √(678.5/9)

s1= 8.682677518 or 8.68 is the standard deviation for OLDER


x̅2= ∑x2/n

x̅2 = (34+22+15+27+37+41+24+19+26+36)/10

x̅2 = 28.1 is the mean for younger

s2= √∑(x2- x̅2)^2/(10-1)

∑(x2-x̅2)^2=+(34-28.1)^2+(22-28.1)^2+(15-28.1)^2+(27-28.1)^2+(37-28.1)^2+(41-
28.1)^2+(24-28.1)^2+(19-28.1)^2+(26-28.1)^2+(36-28.1)^2

∑(x2-x̅2)^2= 656.9

s2= √(656.9/9)

s2=√72.9888888

s2= 8.5433 or 8.54

Null hypothesis: There is no significant difference between the life satisfaction of the
older adults and younger adults

Alternative hypothesis: There is a significant difference between the life satisfaction of
the older adults and younger adults

α=0.05

α/2 = 0.025

Degrees of freedom for the 2 sample t test:

$df = \dfrac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$

This equation gives the degrees of freedom for a 2 sample t-test with unequal variances.
Substituting the known values, we get df ≈ 17.995, or 18.

For convenience, the simpler pooled formula for the degrees of freedom in a
2 sample t test can also be used:

df=n1+n2-2
df=10+10-2

df= 18

After rounding, both formulas arrive at the same answer.
Therefore, our degrees of freedom is 18.

t(α/2, df)

t(0.025, 18) = 2.101 (Note: please refer to the t-table)

Therefore, our t critical value is 2.101

For computed t value

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

t = (44.5 − 28.1)/√(75.3888888888889/10 + 72.9888888888889/10)

t = 4.25754666555816

4.2575 > 2.101

Conclusion: Since 4.2575 lies in the rejection region, we reject
the null hypothesis. Therefore, there is a significant difference between the life
satisfaction of the older adults and younger adults

a.What is your computed answer?

My computed t value based on my solution and extensive calculation is 4.2575.

b. What would be the null hypothesis in this study?

Null hypothesis: There is no significant difference between the life satisfaction of the
older adults and younger adults
c. What would be the alternate hypothesis?

Alternative hypothesis: There is a significant difference between the life satisfaction of
the older adults and younger adults

d. What probability level did you choose and why?

I chose a significance level (alpha) of 5%, which is equal to 0.05 and corresponds to a
confidence level of 95%. This means there is a 5% probability of rejecting the null
hypothesis when it is actually true.

e. What is your tcrit?

Based on my calculation, I chose an alpha of 0.05 and the calculated df is 18. Using the
t-table, I located the corresponding value of t crit, which is 2.101.

t critical= 2.101

f. Is there a significant difference between the two groups?

Yes. Since 4.2575 lies in the rejection region we can say that we need to reject the null
hypothesis. Therefore, there is a significant difference between the life satisfaction of the
older adults and younger adults

g. Interpret your answer.

Based on the table, older adults aged over 70 were compared with younger
adults aged 20 to 30. The researchers wanted to determine whether there is a significant
difference in the life satisfaction of older adults and younger adults. The mean of the
older adults is 44.5, while the mean of the younger adults is 28.1. Both groups have the
same number of samples, n = 10. The variance of the older adults is 75.39 and that of
the younger adults is 72.99. The calculations used a 2 sample t-test assuming unequal
variances. With 18 df and an alpha of 5%, the t critical value is 2.101, so the rejection
region covers computed t values beyond ±2.101. The computed t was 4.2575, which is in
the rejection region, so the researcher must reject the null hypothesis. Therefore, there is a
significant difference between the life satisfaction of younger adults and older adults
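
The 2 sample t-test with unequal variances used above, including its degrees of freedom, can be sketched as a single function. This is a minimal illustration using only Python's standard library; `two_sample_t` is a hypothetical helper name, and the critical value still has to be read from a t-table.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(a, b):
    """Return (t, df) for a 2 sample t-test that does not assume equal variances."""
    term_a = variance(a) / len(a)      # s1^2 / n1
    term_b = variance(b) / len(b)      # s2^2 / n2
    t = (mean(a) - mean(b)) / sqrt(term_a + term_b)
    df = (term_a + term_b) ** 2 / (
        term_a ** 2 / (len(a) - 1) + term_b ** 2 / (len(b) - 1)
    )
    return t, df

older = [45, 38, 52, 48, 25, 39, 51, 46, 55, 46]
younger = [34, 22, 15, 27, 37, 41, 24, 19, 26, 36]

t_value, df_value = two_sample_t(older, younger)
print(f"t = {t_value:.4f}, df = {df_value:.3f}")
```

Comparing the absolute value of t with the t-table value for α/2 = 0.025 and the rounded df then gives the decision.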
Let Us Enhance

Task A. Solve the following problems completely as directed

A researcher hypothesizes that electrical stimulation of the lateral habenula will result in
a decrease in food intake (in this case, chocolate chips) in rats. Rats undergo stereotaxic
surgery and an electrode is implanted in the right lateral habenula. Following a ten-day
recovery period, rats (kept at 80 percent body weight) are tested for the number of
chocolate chips consumed during a 10-minute period both with and without electrical
stimulation. The testing conditions are counter balanced. Compute the appropriate t-test
for the data provided below.

Stimulation: 12 7 3 11 8 5 14 7 9 10

No stimulation: 8 7 4 14 6 7 12 5 5 8

Computational Procedure:

Type of t test: 2 sample t-test

x̅1= ∑x1/n

x̅1= (12+7+3+11+8+5+14+7+9+10)/10

x̅1= 8.6 is the mean for Stimulation

s1= √∑(x1- x̅1)^2/(10-1)

∑(x1-x̅1)^2=(12-8.6)^2+(7-8.6)^2+(3-8.6)^2+(11-8.6)^2+(8-8.6)^2+(5-8.6)^2+(14-8.6)^2+(7-
8.6)^2+(9-8.6)^2+(10-8.6)^2

∑(x1-x̅1)^2 = 98.4

s1= √(98.4/9)

s1= 3.30655914 or 3.31 is the standard deviation for Stimulation


x̅2= ∑x2/n

x̅2 = (8+7+4+14+6+7+12+5+5+8)/10

x̅2 = 7.6 is the mean for no stimulation

s2= √∑(x2- x̅2)^2/(10-1)

∑(x2-x̅2)^2=(8-7.6)^2+(7-7.6)^2+(4-7.6)^2+(14-7.6)^2+(6-7.6)^2+(7-7.6)^2+(12-
7.6)^2+(5-7.6)^2+(5-7.6)^2+(8-7.6)^2

∑(x2-x̅2)^2= 90.4

s2= √(90.4/9)

s2=√10.044444444444

s2= 3.16929715

Null hypothesis: There is no significant difference between the test of rats that underwent
stimulation and rats that did not undergo stimulation

Alternative hypothesis: There is a significant difference between the test of rats that
underwent stimulation and rats that did not undergo stimulation

α=0.05

α/2 = 0.025

Degrees of freedom from 2 sample t test:

df = [(s1^2/n1) + (s2^2/n2)]^2 / { [(s1^2/n1)^2/(n1-1)] + [(s2^2/n2)^2/(n2-1)] }

This equation gives the degrees of freedom for a 2 sample t-test when equal variances are
not assumed. Substituting the known values, we get df = 17.97, or about 18.
For convenience, I also used the simpler formula for degrees of freedom in a 2 sample
t test, which applies when equal variances are assumed:

df=n1+n2-2

df=10+10-2

df= 18

We can see that both formulas give essentially the same answer after rounding.
Therefore, our degrees of freedom is 18.
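
The two degrees-of-freedom formulas above can be compared numerically. The following is a minimal sketch, not part of the module; the function names `welch_df` and `pooled_df` are made up for illustration, and the variances are taken from the sums of squares computed earlier (98.4/9 and 90.4/9).

```python
def welch_df(var1, n1, var2, n2):
    """Degrees of freedom when equal variances are NOT assumed (Welch-Satterthwaite)."""
    a = var1 / n1
    b = var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def pooled_df(n1, n2):
    """Degrees of freedom when equal variances are assumed."""
    return n1 + n2 - 2


# Sample variances from the hand computation: sum of squares / (n - 1)
var_stimulation = 98.4 / 9
var_no_stimulation = 90.4 / 9

df_welch = welch_df(var_stimulation, 10, var_no_stimulation, 10)
df_pooled = pooled_df(10, 10)

print(f"Welch df  = {df_welch:.2f}")
print(f"Pooled df = {df_pooled}")
```

The two values agree closely here because the two samples have equal sizes and similar variances; with very different variances or sample sizes the Welch value can be noticeably smaller.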

tα/2, df

t0.025, 18 = 2.101 Note: please refer to the t-table

Therefore, our t critical value is 2.101

For computed t value

t = (x̅1 - x̅2) / √[(s1^2/n1) + (s2^2/n2)]

t = (8.6-7.6) / √[(10.9333333333333/10) + (10.0444444444444/10)]

t = 0.690430963423743

0.6904 < 2.101

Conclusion: Since our computed t does not lie in the rejection region, we do not reject
the null hypothesis. Therefore, there is no significant difference between the food intake
of rats with stimulation and without stimulation.
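
As a check on the hand computation, here is a minimal sketch using only Python's standard library. The function names are illustrative, and the critical value 2.101 is simply copied from the t-table lookup above. Because each rat is tested under both conditions, a paired t statistic is also computed for comparison; that second statistic is an addition for illustration, not part of the answer above.

```python
import math
import statistics

stimulation = [12, 7, 3, 11, 8, 5, 14, 7, 9, 10]
no_stimulation = [8, 7, 4, 14, 6, 7, 12, 5, 5, 8]


def two_sample_t(a, b):
    """Two-sample t statistic: (mean_a - mean_b) / sqrt(var_a/n_a + var_b/n_b)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def paired_t(a, b):
    """Paired t statistic computed on the per-subject differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


T_CRITICAL = 2.101  # two-tailed, alpha = 0.05, df = 18 (from the t-table)

t_two_sample = two_sample_t(stimulation, no_stimulation)
t_paired = paired_t(stimulation, no_stimulation)

print(f"two-sample t = {t_two_sample:.4f}")
print(f"paired t     = {t_paired:.4f}")
print("reject H0 (two-sample)?", abs(t_two_sample) > T_CRITICAL)
```

Note that the paired statistic has n - 1 = 9 degrees of freedom, so it would be compared against a different critical value (2.262 for a two-tailed test at alpha = 0.05).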
a. What is your computed answer?
t = (x̅1 - x̅2) / √[(s1^2/n1) + (s2^2/n2)]
t = (8.6-7.6) / √[(10.9333333333333/10) + (10.0444444444444/10)]
t = 0.690430963423743
Our computed t is t = 0.6904

b. What would be the null hypothesis in this study?

Null hypothesis: There is no significant difference between the food intake of rats
with stimulation and without stimulation.

c. What would be the alternate hypothesis?

Alternative hypothesis: There is a significant difference between the food intake of
rats with stimulation and without stimulation.

d. What probability level did you choose and why?

I chose a significance level (alpha) of 0.05, which corresponds to a confidence level of
95%. This means that if the null hypothesis is true, there is only a 5% probability of
wrongly rejecting it.

e. What were your degrees of freedom?

Degrees of freedom from 2 sample t test:
df = [(s1^2/n1) + (s2^2/n2)]^2 / { [(s1^2/n1)^2/(n1-1)] + [(s2^2/n2)^2/(n2-1)] }.
This equation gives the degrees of freedom for a 2 sample t-test when equal variances
are not assumed. Substituting the known values, we get df = 17.97, or about 18.

For convenience, I also used the simpler formula for degrees of freedom in a 2 sample
t test, which applies when equal variances are assumed:

df=n1+n2-2
df=10+10-2
df= 18
f. Is there a significant difference between the two testing conditions?

Since our computed t does not lie in the rejection region, we do not reject the null
hypothesis. Therefore, there is no significant difference between the food intake of rats
with stimulation and without stimulation.

0.6904 < 2.101, which indicates that the computed t is located in the non-critical region.

g. Interpret your answer.

Based on the table shown, the food intake of the rats was recorded. The subjects were
tested with stimulation and with no stimulation. Based on the data, the stimulation
condition has a mean of 8.6 and a standard deviation of 3.31. The no-stimulation condition
has a mean of 7.6 and a standard deviation of 3.17. The mean, variance, and standard
deviation were used to determine whether there is a significant difference between the
stimulation condition and the no-stimulation condition. The degrees of freedom from the
calculation were 18 and the probability level of the test is 0.05, meaning there is a 5%
chance of rejecting the null hypothesis when it is actually true. The critical value is
2.101, which marks the start of the rejection region of the test. The t value was
calculated to be 0.6904, which leads the researcher to fail to reject the null hypothesis.
The null hypothesis of the study states that there is no significant difference between
the two conditions. Because the t value did not lie in the critical region of a two-tailed
two-sample t-test, the final conclusion is that there is no significant difference between
the food intake of rats with stimulation and without stimulation.
2. An education researcher is comparing four different algebra curricula.
Eighth grade students are randomly assigned to one of the four groups.
Their state achievement test scores are compared at the end of the year. Use
the appropriate statistical procedure to determine whether the curricula
differ with respect to math achievement. An alpha criterion of 0.05 should be
used for the test.

                n     Mean     sd

Curriculum 1    50    170.5    14.8
Curriculum 2    50    168.3    12.8
Curriculum 3    50    167.6    17.7
Curriculum 4    50    172.8    16.8

Solution:

SOURCE     SUM OF SQUARES     DEGREES OF FREEDOM    VARIANCE ESTIMATE    F RATIO

Between    SSB                K-1                   MSB = SSB/(K-1)      MSB/MSW
Within     SSW                N-K                   MSW = SSW/(N-K)
Total      SSR = SSB + SSW    N-1

Computational Procedure:

1. Define the Null and Alternative Hypothesis:

Ho: μ1 = μ2 = μ3 = μ4 (the mean achievement scores of Curriculum 1, Curriculum 2,
Curriculum 3, and Curriculum 4 are equal)

Ha: At least one curriculum mean differs from the others with respect to math achievement

2. State Alpha

α=.05

3. df per group = n-1 = 50-1 = 49

4. State Decision Rule

One-Tailed Test: |f| > fα, df1, df2 : Reject Ho

Two-Tailed Test: |f| > fα/2, df1, df2 : Reject Ho

5. Calculate Test Statistic

SSB = n∑(ȳi – ȳ)^2, summed over the k = 4 groups

ȳ = ∑x̄/k = (170.5+168.3+167.6+172.8)/4 = 169.8

SSB = 50(170.5-169.8)^2+50(168.3-169.8)^2+50(167.6-169.8)^2+50(172.8-169.8)^2

SSB = 50(0.49+2.25+4.84+9.00) = 829

SSW

Sd      Sd^2 or Variance    (n-1)(sd)^2    ∑(n-1)(sd)^2

14.8    219.04              10732.96       47942.09
12.8    163.84              8028.16
17.7    313.29              15351.21
16.8    282.24              13829.76

SSW = 47942.09

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Squares                                  F Ratio

Between                829               k-1 = 4-1 = 3         MSB = SSB/(k-1) = 829/3 = 276.33              MSB/MSW = 276.33/244.6025 = 1.1297 or 1.13
Within                 47942.09          N-k = 200-4 = 196     MSW = SSW/(N-k) = 47942.09/196 = 244.6025
Total                  48771.09          199

F crit

The ANOVA F test is right-tailed, so the whole alpha of 0.05 is placed in the upper tail.

fα, df1, df2 = f0.05, 3, 196 ≈ 2.65. This can be obtained through Excel F.INV.RT(0.05, 3, 196).

1.13 < 2.65

Conclusion: The critical F value is approximately 2.65, which marks the start of the
critical region or rejection region in our computation. If our computed F ratio lies in
this region, then we reject the null hypothesis. Our computed F ratio is 1.13, which does
not lie in the rejection region, so we fail to reject the null hypothesis. Therefore, there
is no significant difference among the four curricula with respect to math achievement.
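
The one-way ANOVA quantities can be recomputed directly from the summary table (n, mean, sd) as a check on the hand calculation. This is a minimal sketch under the assumption that only summary statistics are available; the function name `anova_from_summary` and the dictionary keys are made up for illustration.

```python
# (n, mean, sd) for Curriculum 1 to 4, copied from the problem table
CURRICULA = [
    (50, 170.5, 14.8),
    (50, 168.3, 12.8),
    (50, 167.6, 17.7),
    (50, 172.8, 16.8),
]


def anova_from_summary(groups):
    """One-way ANOVA table values computed from per-group n, mean and sd."""
    k = len(groups)
    total_n = sum(n for n, _, _ in groups)
    # Grand mean is the n-weighted average of the group means
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ssb = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ssw = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    msb = ssb / (k - 1)
    msw = ssw / (total_n - k)
    return {
        "grand_mean": grand_mean,
        "ssb": ssb,
        "ssw": ssw,
        "df_between": k - 1,
        "df_within": total_n - k,
        "msb": msb,
        "msw": msw,
        "f_ratio": msb / msw,
    }


result = anova_from_summary(CURRICULA)
for name, value in result.items():
    print(f"{name:>11}: {value:.4f}")
```

The computed F ratio is then compared with the critical F value for (df_between, df_within) degrees of freedom at the chosen alpha.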

3. The mental ages (x) and the score on the mathematics aptitude test (y) of
fifteen (15) boys were as follows:

X: 10 10 10 11 11 12 12 12 13 13 13 13 14 14 14

Y: 15 18 18 15 25 25 25 26 26 30 35 40 43 45 50

Compute the correlation coefficient using Spearman Rank-Order Correlation Coefficient.

x     Y     Rank of x    Rank of y    D       D^2

10    15    14           14.5         -0.5    0.25
10    18    14           12.5         1.5     2.25
10    18    14           12.5         1.5     2.25
11    15    11.5         14.5         -3      9
11    25    11.5         10           1.5     2.25
12    25    9            10           -1      1
12    25    9            10           -1      1
12    26    9            7.5          1.5     2.25
13    26    5.5          7.5          -2      4
13    30    5.5          6            -0.5    0.25
13    35    5.5          5            0.5     0.25
13    40    5.5          4            1.5     2.25
14    43    2            3            -1      1
14    45    2            2            0       0
14    50    2            1            1       1

∑D^2= 29

Spearman Rank-Order Correlation Coefficient.

rs = 1 - [6∑D^2 / (n(n^2-1))]

rs = 1 - [6(29) / (15(15^2 -1))]

rs = 1 - 174/3360

rs = 1 - 29/560

rs = 0.9482142857

rs = 0.9482 or 0.95 is the Spearman rank-order correlation coefficient
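
The ranking and the formula above can be sketched in a few lines of Python. This is an illustrative check only: the helper name `average_ranks` is made up, ranks run from the highest value (rank 1) to the lowest as in the table, tied values share the average of their positions, and the last Y value is taken as 50 as in the table.

```python
def average_ranks(values):
    """Rank from highest (rank 1) to lowest; tied values get the mean of their positions."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first_position = ordered.index(v) + 1   # 1-based position of the first tie
        tie_count = ordered.count(v)
        rank_of[v] = first_position + (tie_count - 1) / 2
    return [rank_of[v] for v in values]


x = [10, 10, 10, 11, 11, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14]
y = [15, 18, 18, 15, 25, 25, 25, 26, 26, 30, 35, 40, 43, 45, 50]

rank_x = average_ranks(x)
rank_y = average_ranks(y)

# Sum of squared rank differences, then the simplified Spearman formula
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
n = len(x)
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"sum of D^2 = {sum_d2}")
print(f"rs = {rs:.4f}")
```

When there are many ties, as in this data set, the simplified formula is only an approximation of the Pearson correlation computed on the ranks, so statistical software may report a slightly different value.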

Let Us Reflect

Task A. Using the space below, write a reflective essay about your learning experience
on using statistical techniques in data analysis. Let your essay reveal how much you
learned about each concept behind each topic dealt with in this lesson. Express which
concepts are the most understood, slightly understood, and the least understood ones.
Data Analysis and Statistical Technique

(Reflective Essay)

In this module I embarked on a new journey of analyzing and interpreting data.
Honestly, we didn’t learn all the topics in our statistics class because of the pandemic,
but this module helped me refresh what I learned in statistics class. I also learned new
statistical techniques such as the One-Way Analysis of Variance. In our Empowerment
Technology class, we were taught parametric and nonparametric statistical analysis
through Microsoft Excel, but we didn’t learn how to calculate them by hand. This module
helped me understand further how ANOVA, Correlation, Regression, Chi-squared, Binomial
Distribution, and T-test are calculated manually. Upon reading this module I was
refreshed on how to calculate the mean. The mean is simply calculated by adding the
values of the samples and dividing by the number of samples. I was also refreshed on
calculating the standard deviation and variance of the data. In calculating the sample
variance we need the mean and the formula ∑(x - x̄)^2/(n-1). In calculating the standard
deviation, we just need to take the square root of the variance. This process is
important so that we can proceed to other statistical tests where the mean, variance,
and standard deviation are required. The first lesson that I learned in this module was
scatter diagrams. Scatter diagrams are just the plotted values of the correlated
variables on the x and y axes. Scatter diagrams give a picture of the relationship
between two variables. Scatter diagrams may show perfect positive correlation, perfect
negative correlation, and very high correlation. Usually, researchers use inferential
statistics, where hypotheses are included to make inferences. A null hypothesis is a
hypothesis that states there is no significant difference between 2 or more variables,
while an alternative hypothesis is a hypothesis that states that there is a significant
difference between two or more variables. A hypothesis test can have a one-tailed
rejection region or a two-tailed rejection region.

A correlation test is a type of statistical tool for measuring the relationship between
variables. There are 3 types of correlation, namely simple correlation, multiple
correlation, and partial correlation. A simple correlation is the relationship between 1
independent variable and 1 dependent variable. This type of correlation can be linear
(fixed relationship) or curvilinear (unfixed relationship). A multiple correlation involves
two or more independent variables and one dependent variable. A multiple correlation can
be nonlinear (curvilinear) or a joint relation (fixed relationship). Partial correlation is
the measure of the relationship between an independent variable and the dependent
variable while holding the effect of the other independent variables constant. Pearson r
is used to measure the strength and direction of the linear relationship between two
variables. It can be used to draw a conclusion on whether two variables have a high,
moderate, or low relationship. Another type of correlation coefficient is the Spearman
Rank-Order correlation. It measures the relationship between variables by ranking the
values according to their position. After correlation, what I learned next is regression.
Regression is the equation of variables that predicts or estimates the dependent variable.
It is also used to draw the trend or continuous change of variables. In order to perform a
regression, we must create a linear equation y=ax+b. This regression line can predict
outcomes or dependent variables and can be used to estimate values from given x and y
data. The one-sample t-test is another statistical test that I learned. The t-test is
important for me because most of the data in my study will be analyzed by t-test and
one-way ANOVA. The t-test is used when the sample is small and the population standard
deviation is unknown, and it assumes that the data are approximately normally
distributed. It is also used to compare one variable to another variable. It is also used
to determine if there is a significant difference or no significant difference in data
gathered from 2 variables. A t-test can be paired (dependent) or unpaired (independent).
Another statistical test that I learned in this module was the manual calculation of
One-way ANOVA. One-way analysis of variance is used when you want to compare the means
of more than two groups. One-Way ANOVA compares the means of two or more independent
groups in order to determine whether there is statistical evidence that the associated
population means are significantly different. In this module, I did not just study the
formulas but I also studied the purpose and function of each statistical test. I believe
that anyone can do the calculation, but the harder part is the selection and
understanding of the statistical test; that’s why it is important to study and understand
the meaning and function of each statistical test.
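
The regression line y=ax+b mentioned above can be fitted by least squares. The following is a minimal sketch for illustration only; the function names `fit_line` and `predict` are made up, and the data are the mental ages and aptitude scores from problem 3 (with the last score taken as 50), reused here purely as an example.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for the line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict(a, b, x):
    """Estimated y for a given x on the fitted line."""
    return a * x + b


mental_age = [10, 10, 10, 11, 11, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14]
score = [15, 18, 18, 15, 25, 25, 25, 26, 26, 30, 35, 40, 43, 45, 50]

a, b = fit_line(mental_age, score)
print(f"y = {a:.3f}x + {b:.3f}")
print(f"estimated score at mental age 12: {predict(a, b, 12):.2f}")
```

A fitted line always passes through the point (mean of x, mean of y), which is a quick way to check a hand-computed slope and intercept.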

Upon learning and answering the module, I encountered hard activities that I had not
encountered before, such as the ANOVA with only the mean and standard deviation given. It
was a hard part for me, especially finding the values of my degrees of freedom for my F
ratio. I cannot guarantee that all my calculations are correct, but what I can guarantee to
myself is that I learned something new about statistics. The slightly understood topic I
encountered in this module is the T-test. What makes it slightly understood is that you
first need to determine if it is dependent, independent, equal variance, unequal variance,
paired, or unpaired. Therefore, the t-test is not just about calculating by following the
formula but also about the appropriateness of the calculation. The topic I understood
least in this module is finding the mean and standard deviation of Likert Scale data. I
encountered an activity where a Likert Scale was used as a research instrument. It was
hard for me because, from what I know from my high school, we cannot use the Likert scale
to measure the central tendency. I was taught to analyze this type of scale by measuring
the responses in mean, median, and mode. Thankfully, I understand how to calculate it.
Overall, this module taught me data analyzing, summarizing, and interpreting. This module
also taught me the importance of statistical tests, the usage of statistical tests, and
how to perform a statistical test.
