Statistics in Research


STATISTICS IN RESEARCH

1
Meaning of Statistics
Statistics refers to numerical facts. Examples of
statistics in this sense are the number that
represents the income of a family, the
number of students enrolled in a class, and
the like.
Statistics is also a group of methods that are used
to collect, organize, present, analyze, and
interpret data to make decisions (this sense refers to
the field of study).

2
Types of Statistics
• Descriptive Statistics – consists of methods
for organizing, displaying, and describing
data by using tables, graphs, and summary
measures.

• Inferential Statistics – consists of methods
that use sample results to help make
predictions about a population. This is also called inductive
reasoning or inductive statistics.

3
Population Versus Sample
• Population – consists of all elements
(individuals, items, or objects) whose
characteristics are being studied. The
population being studied is called the target
population.

• Sample – a portion of the population selected
for study

4
• A sample that represents the characteristics
of the population as closely as possible is
called a representative sample.

• A sample drawn in such a way that each
element of the population has an equal chance
of being selected is called a random sample.

5
Four Basic Methods of Sampling
1. Random Sampling. This is done by using
chance methods or random numbers. For
example, number each subject in the
population. Write each number on a card, place the cards in a bowl, and
draw as many cards as needed. The
subjects whose numbers are drawn
compose the sample.
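
The bowl-of-cards procedure can be imitated with a random number generator. The following is a minimal sketch, assuming a hypothetical population numbered 1 to 100; the function name `simple_random_sample` is illustrative only.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct subjects; every subject is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, sample_size)

# Hypothetical population: subjects numbered 1 to 100.
subjects = list(range(1, 101))
sample = simple_random_sample(subjects, 10, seed=1)
```

Because `sample` draws without replacement, no subject can be selected twice, just as a card leaves the bowl once it is drawn.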

6
2. Systematic Sampling. This is done by
numbering each subject of the population
and then selecting every kth number. For
example, there are 5000 families in a city.
Fifty families are needed as a sample for an
experiment. Since 5000/50 = 100, then k =
100. This means that every 100th subject
would be selected. However, the first subject
would be selected at random from subjects 1
to 100. Suppose the subject 88 was selected,
then the sample would consist of subjects
whose numbers were 88, 188, 288, and so
on until 50 families were obtained.
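
The selection rule above can be sketched in a few lines. This is only an illustration; the function name `systematic_sample` and its parameters are assumptions, and the starting subject 88 is the one used in the example.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every kth subject, where k = population_size // sample_size."""
    k = population_size // sample_size
    if start is None:
        # The first subject is picked at random from subjects 1 to k.
        start = random.Random(seed).randint(1, k)
    return [start + j * k for j in range(sample_size)]

# 5000 families, 50 needed, first subject 88 (as in the example above).
families = systematic_sample(5000, 50, start=88)
```

With these inputs the list begins 88, 188, 288 and contains 50 family numbers.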

7
3. Stratified Sampling. If a population has
distinct groups, it is possible to divide the
population into these groups and to draw
simple random samples (SRSs) from each of the groups. The groups
are called strata. Strata are designed so that
the members of each stratum are more
homogeneous, that is, more similar to each
other. The results are then grouped together to
form the sample. This technique is particularly
useful in populations that can be stratified into
groups by gender, race, or geography.
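
A rough sketch of stratified sampling follows. The strata, their sizes, and the function name `stratified_sample` are hypothetical and serve only to show a simple random sample being drawn inside each stratum.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of per_stratum members from every stratum."""
    rng = random.Random(seed)
    chosen = {}
    for name, members in strata.items():
        chosen[name] = rng.sample(members, per_stratum)
    return chosen

# Hypothetical population divided into two strata.
strata = {
    "female": [f"F{i}" for i in range(1, 41)],
    "male": [f"M{i}" for i in range(1, 61)],
}
chosen = stratified_sample(strata, per_stratum=5, seed=7)
```

Drawing the same number from each stratum is one possible design; drawing in proportion to stratum size is another.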

8
4. Cluster Sampling. This method uses intact
groups called clusters. Suppose a medical
researcher wants to study the patients in Metro
Manila. It would be very costly and time-consuming
to obtain a random sample since
the patients would be spread over different parts of
Metro Manila. Rather, a few hospitals could be
selected at random and the patients in these
hospitals would be studied as a cluster.

9
Basic terms
• An element or member of a sample or
population is a specific subject or object
about which the information is collected.
• A variable is a characteristic under study that
assumes different values for different
elements.
• An observation or measurement is the value of
a variable for an element.
• A parameter is any measurable characteristic of a
population.

10
• Data are numbers or measurements that are
collected as a result of observation,
interview, questionnaire, experiment, and so
forth.

11
Types of Variables
• A. Quantitative Variable – a variable that can be
measured numerically.
• 1. Discrete Variable – a variable whose values are
countable.
• 2. Continuous Variable – a variable that can assume any
numerical value over a certain interval or intervals.
• B. Qualitative or Categorical Variable – a
variable that cannot assume a numerical
value but can be classified into two or more
nonnumeric categories.
12
Determine whether each of the following variables is
quantitative or qualitative
 Educational attainment
 Brand of Softdrinks
 ID Number
 Student Number
 IQ Score
 Height of a Building
 Number of years in service
 Rank of Teachers
 Number of provinces in the Philippines

13
Classify each variable as discrete or
continuous
1. The number of loaves of bread baked each day
2. The air temperature in a city
yesterday
3. The daily wage of a construction
worker
4. The weights of newborn infants.
5. The capacity (in liters) of water in a
swimming pool

14
Frequency Distribution
• The frequency distribution is an arrangement
of the data which shows the frequency of
different values or groups of values of a
variable. It can be constructed directly from the raw
data.

• Class Frequency – refers to the number of
observations belonging to a class interval, or
the number of items within a category.

15
• Class interval – is a grouping or category
defined by a lower limit and an upper limit, such
as 12-14, 15-17, 18-20, and so forth.
◦ In the class interval 21-23, for example, 21 is the
lower limit and 23 is the upper limit.

• Class Marks – are the midpoints of the
classes; they are found by adding the
lower and upper limits and dividing by 2.

• Range – the difference between the highest
and lowest values in the set of data.

16
• The following are raw data collected from a sample of 100
residents of Vigan City. These are the observed ages
of the 100 persons.
14 27 27 23 29 21 20 12 22 17
23 24 18 20 27 16 12 22 19 19
15 20 29 25 24 20 20 17 18 18
12 22 23 17 23 26 16 21 21 20
17 18 26 18 28 27 18 22 19 16
14 16 19 20 20 18 25 19 26 15
28 13 18 17 14 27 24 20 18 25
17 20 23 18 18 24 19 19 14 18
21 21 25 24 14 25 20 17 17 17
15 12 26 23 17 20 24 25 18 15
 What is the highest value? The lowest?

17
Suggested Steps in the Construction of
Frequency Distribution
1. Find the range.
2. Determine the tentative number of groups
(called classes) to use. The maximum
number of classes is 15-20 no matter how
many observations there are. The ideal
number of class intervals is somewhere
between 5 and 15.
3. Determine the approximate size of the class
interval by dividing range by the desired
number of class intervals.

18
4. Write the class intervals starting with the
lowest lower limit, as determined by the
researcher’s choice.
5. Determine the class frequency for each class
interval by referring to the tally column.
6. Compute the class mark.
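
The steps can be sketched in Python as follows. To keep the example short it uses only the first 20 ages of the Vigan City data (the first two rows). It assumes integer-valued data, and the function name `frequency_distribution` is illustrative.

```python
import math

# First 20 ages (first two rows) of the raw data shown earlier.
ages = [14, 27, 27, 23, 29, 21, 20, 12, 22, 17,
        23, 24, 18, 20, 27, 16, 12, 22, 19, 19]

def frequency_distribution(data, num_classes):
    """Return (lower limit, upper limit, frequency, class mark) for each class."""
    low, high = min(data), max(data)
    data_range = high - low                      # step 1: find the range
    width = math.ceil(data_range / num_classes)  # step 3: class size, rounded up
    rows = []
    lower = low                                  # step 4: start at the lowest value
    while lower <= high:
        upper = lower + width - 1                # integer-valued data assumed
        freq = sum(lower <= x <= upper for x in data)  # step 5: class frequency
        mark = (lower + upper) / 2               # step 6: class mark
        rows.append((lower, upper, freq, mark))
        lower = upper + 1
    return rows

table = frequency_distribution(ages, 6)
```

For these 20 ages the range is 29 – 12 = 17, so six desired classes give a class size of 3 and the intervals 12-14 through 27-29.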

• Summation Notation – is used to denote the
sum of values. Its symbol is Ʃ.

19
Class Interval   Frequency (f)   Class Mark (X)
27-29            9               28
24-26            15              25
21-23            16              22
18-20            31              19
15-17            18              16
12-14            11              13
                 Ʃf = 100
20
Construct a frequency distribution of the following monthly incomes of
150 households in Region I, in thousand pesos.

67 57 56 46 13 82 51 83 71 58 43 54 70 46 82

73 68 6 44 51 55 45 67 60 57 62 62 61 76 41

41 54 71 45 78 59 44 25 73 69 55 40 57 48 49

68 63 49 26 55 75 52 42 56 51 51 54 89 44 43

61 61 60 44 48 41 53 88 91 39 40 53 54 79 80

57 55 72 29 55 62 27 51 56 56 69 43 71 27 35

53 68 59 44 58 58 60 65 71 31 94 60 69 53 59

86 54 61 46 47 49 51 60 76 61 61 48 64 70 62

45 77 69 39 40 59 42 36 42 75 73 80 65 57 33

35 85 36 69 59 85 53 54 69 52 62 39 45 27 55
21
Cumulative Frequency
• Is obtained by cumulating (successively adding) the absolute frequencies.
C.I      f     X      F≤     F≥
90-99    2     94.5   150    2
80-89    10    84.5   148    12
70-79    17    74.5   138    29
60-69    32    64.5   121    61
50-59    42    54.5   89     103
40-49    31    44.5   47     134
30-39    9     34.5   16     143
20-29    6     24.5   7      149
10-19    1     14.5   1      150
         Ʃf = 150
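
As a small sketch, the two cumulative columns can be reproduced with a running sum. The frequencies are those in the table above, listed from the lowest class upward; the variable names are illustrative.

```python
from itertools import accumulate

classes = ["10-19", "20-29", "30-39", "40-49", "50-59",
           "60-69", "70-79", "80-89", "90-99"]
freqs = [1, 6, 9, 31, 42, 32, 17, 10, 2]

# "Less than" cumulative frequency: running total from the lowest class up.
less_than = list(accumulate(freqs))

# "Greater than" cumulative frequency: running total from the highest class down.
greater_than = list(accumulate(reversed(freqs)))[::-1]
```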
22
Relative Frequency Distribution
• is a tabular arrangement of data showing the
proportion, in percent, of each frequency to the total
frequency.
Score Percentage
90-99 1.3
80-89
70-79
60-69
50-59
40-49
30-39
20-29
10-19
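
Each percentage is the class frequency divided by the total frequency, times 100. A minimal sketch, assuming the frequencies of the 150-household distribution shown earlier:

```python
classes = ["90-99", "80-89", "70-79", "60-69", "50-59",
           "40-49", "30-39", "20-29", "10-19"]
freqs = [2, 10, 17, 32, 42, 31, 9, 6, 1]

total = sum(freqs)  # 150 households
# Relative frequency in percent, rounded to one decimal place.
percentages = {c: round(100 * f / total, 1) for c, f in zip(classes, freqs)}
```

The first entry, 2/150 × 100, rounds to 1.3, which agrees with the value given for the 90-99 class.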
23
Graphical devices to represent a
frequency distribution
• Histogram – consists of a set of rectangles
having bases on a horizontal axis, each
centered on a class mark. The base widths
correspond to the class size and the
heights of the rectangles correspond to the
class frequencies.

24
[Figure: histogram. Vertical axis: class frequency, 0 to 45. Horizontal axis: class marks 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5.]

25
• Frequency Polygon – is a line graph of class
frequencies plotted against the class marks. It
is made by connecting the midpoints of the
rectangular tops in the histogram or by joining
the plotted points for the class marks and
their corresponding frequencies.

26
[Figure: frequency polygon. Vertical axis: class frequency, 0 to 45. Horizontal axis: class marks 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5.]

27
Pie Graph
It is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution.

Example:
In a survey, 500 families were asked the question “Where are you planning to
spend your vacation this summer?” It resulted in the following distribution.
Place       Number of People
Davao       50
Boracay     200
Palawan     125
Tagaytay    90
Baguio      35
Construct a pie graph for the data and summarize the results.

28
• Step 1: Since there are 360° in a circle, the frequency for
each class must be converted into a proportional part of
the circle. This conversion is done using the formula

degrees = (f/n)(360°), where f is the frequency and n is the sum
of the frequencies.

For Davao, for example, (50/500)(360°) = 36°
Palawan: (125/500)(360°) = 90°

The same formula is applied to the rest of the places.

29
• Step 2. Convert each frequency to a percentage.

Davao: (50/500)(100%) = 10%
Palawan: (125/500)(100%) = 25%

The same process is applied to the other places.

• Step 3. Using a protractor, draw the graph and label
each section with the name and percentage.
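
Steps 1 and 2 can be checked with a short computation. This sketch uses the vacation survey frequencies; the dictionary names are illustrative.

```python
vacation = {"Davao": 50, "Boracay": 200, "Palawan": 125,
            "Tagaytay": 90, "Baguio": 35}

n = sum(vacation.values())  # total number of families, 500

# Step 1: central angle of each wedge, degrees = (f/n)(360).
degrees = {place: f * 360 / n for place, f in vacation.items()}

# Step 2: share of each place in percent, (f/n)(100).
percents = {place: f * 100 / n for place, f in vacation.items()}
```

The angles add up to 360° and the percentages to 100%, which is a quick check before drawing the graph.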

30
31
In a survey of 100 males concerning the sports they play,
the following data were obtained. Construct a pie chart.

Sports Number

Golf 45

Tennis 20

Swimming 10
32
Determine whether each of the following refers to a
population or a sample
1. A group of 25 students selected to test a
new teaching technique
2. The total machines produced by a factory in
one week
3. The yearly expenditures on food for 10
families
4. The ages of employees of all companies in
Metro Manila
5. The number of subscribers of telephone
companies

33
Classify each variable as quantitative
or qualitative
6. The height of giraffe living in India
7. The religious affiliation of the people in the
Philippines
8. Favorite movie
9. Marital status
10. The days absent from school

34
Classify each variable as discrete or
continuous
11. The number of loaves of bread baked each day
12. The air temperature in a city yesterday
13. The income of single parents living in
Quezon City
14. The weights of newborn infants
15. The capacity (in liters) of water in a
swimming pool

35
Indicate whether each statement is
descriptive or inferential statistics
16. Last semester, the ages of students at a
certain college ranged from 16 to 25 years old
17. Based on the survey conducted by the
National Statistics Office, it is estimated that
24% of unemployed people are women.
18. A survey says that 1 out of 10 Filipinos is a
member of a fitness center.
19. Cigarettes were associated with 31% of the
4,700 civilian fire deaths in 2000
20. A recent study showed that eating garlic can
lower blood pressure.
36
Four Basic Methods of Sampling
• Random Sampling – this is done by using chance
methods or random numbers.

• Systematic Sampling – this is done by numbering
each subject of the population and then selecting
every nth number.
Example: There are 5000 families in a city. Fifty families
are needed as a sample for an experiment. Since 5000 ÷
50 = 100, n = 100. This means that every 100th
subject would be selected. However, the first subject
would be selected at random from subjects 1 to 100.

37
• Stratified Sampling – if a population has
distinct groups, it is possible to divide the
population into these groups and to draw
simple random samples (SRSs) from each of the groups. The groups
are called strata.

38
Measures of Central Tendency
A number that summarizes the
characteristics of a given set of data is called a
measure of central tendency or measure of
central location.
Such measures can be
computed in two ways: from ungrouped data
or from grouped data.

Ungrouped Data or Raw Data – are data
which are not yet organized or arranged into a
frequency distribution.

39
1. Arithmetic Mean
• The arithmetic mean or arithmetic average is
defined as the sum of the values of a variable
divided by the number of observations.
• The symbol for the sample mean is x-bar ($\bar{x}$),
and for the population mean is the Greek
letter mu ($\mu$).

$\bar{x} = \frac{\sum x}{n}$ (sample) or $\mu = \frac{\sum x}{N}$ (population), where
$\bar{x}$ is the sample mean,
$\mu$ is the population mean,
$N$ or $n$ is the total number of items in the population or
in the sample, respectively,
$x$ is an observed value, and
$\sum$ is the summation notation.

40
• Example 1. What is the mean quiz score of a
group of 6 students whose quiz scores are
85, 80, 74, 81, 76 and 84?
Solution: Given: n = 6
$\bar{x} = \frac{85+80+74+81+76+84}{6} = \frac{480}{6} = 80$
• Example 2. What is the mean age of a group of 10
students whose ages are 15, 21, 17, 17, 20,
21, 18, 21, 17 and 19?
$\bar{x} = \frac{15+3(21)+18+3(17)+20+19}{10} = \frac{186}{10} = 18.6$
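
The computation in Example 1 can be verified in Python. This is a small sketch using the six quiz scores; it shows both the definition (sum divided by count) and the standard library's `statistics.mean`.

```python
from statistics import mean

quiz_scores = [85, 80, 74, 81, 76, 84]

# Definition: sum of the values divided by the number of observations.
by_definition = sum(quiz_scores) / len(quiz_scores)

# The standard library gives the same result.
by_library = mean(quiz_scores)
```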
41
Solve:
• A man bought 10 liters of premium gasoline
at 11.50 per liter, 12 liters at 12.01 per liter and 18
liters at 11.78 per liter from three different
gasoline stations. Find the average price per
liter.

42
Mean for Grouped Data
• To compute the arithmetic mean of
grouped data, we need to determine the
midpoint of each class interval. The computation
assumes that the class mark of each class is
the average value of all items falling in that
class.
• 1. Long method: $\bar{x} = \frac{\sum fx}{n}$, where:

$\bar{x}$ = sample mean
x = the class midpoint or class mark
f = the corresponding frequency
n = the total number of items
43
• Example: The following is the distribution of
length of service in years of 50 employees of
United Laboratories Inc.
Length of Service        Number of Employees
(Class Interval) C.I     (frequency) f
1-5                      5
6-10                     7
11-15                    12
16-20                    13
21-25                    6
26-30                    4
31-35                    3
44
Length of Service        Number of Employees   Class Midpoint   fx
(Class Interval) C.I     (frequency) f         (x)
1-5                      5                     3                15
6-10                     7                     8                56
11-15                    12                    13               156
16-20                    13                    18               234
21-25                    6                     23               138
26-30                    4                     28               112
31-35                    3                     33               99

45
$\bar{x}$ = 810/50 = 16.2
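
A minimal sketch of the long method, using the class midpoints and frequencies of the length-of-service distribution; the variable names are illustrative.

```python
midpoints = [3, 8, 13, 18, 23, 28, 33]   # class marks x
freqs = [5, 7, 12, 13, 6, 4, 3]          # frequencies f

n = sum(freqs)                                        # total number of items
sum_fx = sum(f * x for f, x in zip(freqs, midpoints)) # sum of f times x
grouped_mean = sum_fx / n                             # long method: sum(fx) / n
```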

Quiz 2. Prepare a frequency distribution of
the purchases made from a supplier.
a. Group these figures into a distribution
having an interval size of 10, starting from the
lowest class interval of 10-19.
b. Compute the arithmetic mean using the
long method.

46
35 25 18 31 29

43 31 66 50 34

68 30 52 55 49

38 64 62 45 27

54 64 41 30 66

35 43 25 14 24

57 43 58 42 33

37 28 37 75 74

47
• 2. Short Method (Mean for Grouped Data)

$\bar{x} = x_A + \left[\frac{\sum fd}{n}\right] i$
where $\bar{x}$ = sample mean
$x_A$ = assumed mean (usually the midpoint of the class
with the highest frequency)
$d$ = deviation of each class mark from the assumed mean, in units of the class size:
$d = (x - x_A)/i$
$f$ = the corresponding frequency
$i$ = class interval or class size
$n$ = number of items

48
Length of Service        Number of Employees   Class Midpoint   d     fd
(Class Interval) C.I     (frequency) f         (x)
1-5                      5                     3                -3    -15
6-10                     7                     8                -2    -14
11-15                    12                    13               -1    -12
16-20                    13                    18               0     0
21-25                    6                     23               1     6
26-30                    4                     28               2     8
31-35                    3                     33               3     9
                                                                Ʃfd = -18

49
$\bar{x}$ = 18 + (-18/50)(5)
= 18 – 1.8
= 16.2
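
The short method can be sketched as a function. The name `coded_mean` is an assumption; the deviations d are taken as (x – x_A)/i, that is, in units of the class size, which is what the d column of the worked example uses.

```python
def coded_mean(midpoints, freqs, assumed_mean, width):
    """Short method: x_bar = x_A + (sum(f*d) / n) * i, with d = (x - x_A) / i."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    return assumed_mean + (sum_fd / n) * width

midpoints = [3, 8, 13, 18, 23, 28, 33]
freqs = [5, 7, 12, 13, 6, 4, 3]
short_mean = coded_mean(midpoints, freqs, assumed_mean=18, width=5)
```

The result agrees with the long method, as it should for any choice of assumed mean.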

50
Median
• The median of ungrouped data
arranged in an array (increasing or
decreasing order of magnitude) is the
middle value when the number of
items is odd or the arithmetic average
of the two middle values when the
number of items is even. The median
is usually denoted by Mdn.

51
• Example: 1. Find the median of the
following set of scores: 6, 4, 5, 3 and 2.
Solution: Arrange the scores in order:
2, 3, 4, 5, 6
The median is 4, which is the middle term.

2. Find the median of 8, 12, 5, 6, 13 and 15.
Solution: Arrange the values in order:
5, 6, 8, 12, 13, 15
Mdn = (8+12)/2
= 10
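
Both examples can be checked with the standard library. A small sketch: `statistics.median` sorts the data itself and averages the two middle values when the count is even.

```python
from statistics import median

odd_scores = [6, 4, 5, 3, 2]          # five items: the middle value is the median
even_values = [8, 12, 5, 6, 13, 15]   # six items: average the two middle values

mdn_odd = median(odd_scores)
mdn_even = median(even_values)
```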

52
Median for Grouped Data
$Mdn = L_{md} + \left[\frac{n/2 - cf}{f}\right] i$
where Mdn = median
$L_{md}$ = lower class boundary of the median class
n = total number of observations
f = frequency of the median class
cf = cumulative frequency of the class
preceding the median class
i = class size

53
Example: Find the median of the frequency
distribution of length of service in years of 50
employees of United Laboratories Inc.
Length of Service    Number of Employees (f)   Class Boundaries (CB)    <cf
1-5                  5                         0.5 – 5.5                5
6-10                 7                         5.5 – 10.5               12
11-15                12                        10.5 – 15.5              24 (cf)
16-20 (Mdn class)    13 (f)                    15.5 (Lmd) – 20.5        37
21-25                6                         20.5 – 25.5              43
26-30                4                         25.5 – 30.5              47
31-35                3                         30.5 – 35.5              50


54
• To obtain the median class, solve for n/2, so
50/2 = 25; the median is the 25th term.
• Locate the class whose less-than cumulative frequency (<cf)
is equal to 25 or is the nearest value not less than 25. Thus,
the median class is the 16-20 class interval.
• Use the formula to compute the median.

Mdn = 15.5 + [(50/2 – 24)/13] 5
= 15.5 + [(25 – 24)/13] 5
= 15.5 + 0.38
Mdn = 15.88
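
The same procedure can be sketched as a function that finds the median class and applies the formula. The function name `grouped_median` is illustrative; the inputs are the lower class boundaries and frequencies from the table.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Mdn = L + ((n/2 - cf) / f) * i, using the first class whose <cf reaches n/2."""
    n = sum(freqs)
    half = n / 2
    cf = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, freqs):
        if cf + f >= half:  # this class contains the (n/2)th term
            return lower + (half - cf) / f * width
        cf += f
    raise ValueError("frequencies must not be empty")

boundaries = [0.5, 5.5, 10.5, 15.5, 20.5, 25.5, 30.5]
freqs = [5, 7, 12, 13, 6, 4, 3]
mdn = grouped_median(boundaries, freqs, width=5)
```

Rounded to two decimal places the result is 15.88.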

55
Solve for the median
Class Intervals Frequencies
108-118 5
97-107 14
86-96 18
75-85 25
64-74 12
53-63 6
56
Mode
• The mode of grouped data is defined as the
midpoint of the class interval with the highest
frequency (modal class). The mode obtained
in this manner is called a crude mode,
because it is just a rough approximation of
the actual mode. So, to determine the true
mode, we use the formula

$Mo = L_{Mo} + \left[\frac{d_1}{d_1 + d_2}\right] i$

57
where Mo = mode
$L_{Mo}$ = lower class boundary of the modal
class
$d_1$ = difference between the frequency of
the modal class and the frequency of the next
class lower in value
$d_2$ = difference between the frequency of the
modal class and the frequency of the next class
higher in value
i = class interval or class width

Note: the modal class is the class interval with
the highest frequency.

58
• Find the mode of the frequency distribution
of length of service in years of 50 employees
Class Interval   Frequency (f)   Class Boundary (CB)
1-5              5               0.5 – 5.5
6-10             7               5.5 – 10.5
11-15            12              10.5 – 15.5
16-20            13              15.5 – 20.5
21-25            6               20.5 – 25.5
26-30            4               25.5 – 30.5
31-35            3               30.5 – 35.5

59
• The modal class is the 16-20 class interval,
since it is the class interval with the highest
frequency. Here d1 = 13 – 12 = 1 and d2 = 13 – 6 = 7.

• Mo = 15.5 + [1/(1+7)] 5
= 15.5 + 5/8
= 16.125
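
A sketch of the mode formula in Python follows. The function name `grouped_mode` is illustrative, and treating a missing neighboring class as having frequency 0 is an assumption made here for modal classes at either end of the table.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mo = L + (d1 / (d1 + d2)) * i for the class with the highest frequency."""
    k = freqs.index(max(freqs))                           # modal class position
    below = freqs[k - 1] if k > 0 else 0                  # next class lower in value
    above = freqs[k + 1] if k < len(freqs) - 1 else 0     # next class higher in value
    d1 = freqs[k] - below
    d2 = freqs[k] - above
    return lower_boundaries[k] + d1 / (d1 + d2) * width

boundaries = [0.5, 5.5, 10.5, 15.5, 20.5, 25.5, 30.5]
freqs = [5, 7, 12, 13, 6, 4, 3]
mode = grouped_mode(boundaries, freqs, width=5)
```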

60
Solve for the mode
Classes Frequencies
15-19 5
20-24 8
25-29 18
30-34 14
35-39 3
40-44 2

61
QUIZ #2
• Group the following figures into a
frequency distribution having an
interval size of 6, starting from the
lowest interval of 10-15.

• Find the mean, median and mode
of the grouped data.

62
10 67 53 35 17 29 23 41 50 60

14 17 45 43 39 21 19 17 11 19

35 63 65 49 55 31 39 59 70 54

12 28 36 61 69 30 28 29 31 30

17 16 22 34 44 55 66 33 43 15

63
HYPOTHESIS TESTING
• Hypothesis testing is a decision-making
process for evaluating claims about a
population.

• The z-test and the t-test are statistical
tests used for hypothesis testing.

64
• Null hypothesis (Ho) – states that there is no
difference between a parameter and a specific
value

• Alternative hypothesis (Ha) – states that there is a specific
difference between a parameter and a specific
value

65
Possible sets of statistical hypothesis
1. two-tailed test
Ho : parameter = specific value
Ha : parameter ≠ specific value
2. left-tailed test
Ho: parameter = specific value
Ha: parameter < specific value
3. right-tailed test
Ho: parameter = specific value
Ha: parameter > specific value

66
When to reject and when to accept Ho
               reject Ho          accept Ho
Ho is true     Type I error       correct decision
Ho is false    correct decision   Type II error

67
Steps in hypothesis testing
1. State the null and alternative hypotheses.
2. Select the level of significance. (0.10, 0.05,
0.01) Note: the level of significance is the
maximum probability of committing a type I
error.
3. Determine the critical value and the rejection
region/s.
4. State the decision rule.
5. Compute the test statistic.
6. Make a decision, whether to reject or accept the
null hypothesis.

68
How to find the critical value
• Example 1.
Using the z table, find the critical value of a two-tailed
test with α = 0.05.
Solution:
Draw the figure and indicate the appropriate area. Since
this is a two-tailed test, there are two areas, each equal to
α/2 or 0.05/2 = 0.025.
Subtract 0.025 from 0.5 to get 0.475. Find the z value that
corresponds to 0.475. In this case, it is 1.96. Since this is a
two-tailed test, there are two critical values: +1.96 and
-1.96.
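
Instead of reading a z table, the critical value can be computed from the inverse of the standard normal distribution. A minimal sketch using the standard library's `statistics.NormalDist` (Python 3.8 or later); the function name `z_critical` is illustrative.

```python
from statistics import NormalDist

def z_critical(alpha, tails=2):
    """Positive critical z value for a one- or two-tailed test at level alpha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tails == 2:
        # Two-tailed: alpha/2 in each tail.
        return std_normal.inv_cdf(1 - alpha / 2)
    # One-tailed: all of alpha in one tail (negate the value for a left-tailed test).
    return std_normal.inv_cdf(1 - alpha)

two_tailed = z_critical(0.05)  # approximately 1.96
```

For a two-tailed test at α = 0.05 this returns approximately 1.96, so the critical values are about +1.96 and -1.96.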

69
Using the z table, find the critical
value/s for each.
1. α = 0.01, two-tailed test
2. α = 0.03, left tailed
3. α = 0.05, right tailed

70
The z-test
It is a statistical test for the mean of a population. It can
be used when the sample size is greater than or equal
to 30, or when the population is normally distributed
and the standard deviation is known. The formula is

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

where $\bar{x}$ – sample mean
μ – hypothesized mean
σ – population standard deviation
n – sample size

71
 A manufacturer claims that the average
lifetime of his lightbulbs is 3 years or 36
months. The standard deviation is 8 months.
Fifty bulbs are selected, and the average
lifetime is found to be 32 months. Should the
manufacturer’s statement be rejected at α =
0.01?
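
As a sketch of how such a problem might be worked, the code below computes z = (x̄ – μ)/(σ/√n) for the lightbulb data. Treating the test as two-tailed is an assumption made here, and the function name `one_sample_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, hypothesized_mean, sigma, n):
    """z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sigma / sqrt(n))

# Lightbulb problem: mu = 36 months, sigma = 8 months, n = 50, x_bar = 32 months.
z = one_sample_z(32, 36, 8, 50)

alpha = 0.01
critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed test assumed
reject_null = abs(z) > critical
```

Here z is about -3.54, which lies beyond the critical value of about 2.58, so under this assumption the null hypothesis would be rejected.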

72
 A test on car braking reaction times for men
between 18 and 30 years old has produced a
mean and standard deviation of 0.610 sec
and 0.123 sec, respectively. When 40 male
drivers of this age group were randomly
selected and tested for their braking reaction
times, a mean of 0.587 second came out. At
the α = 0.10 level of significance, test the claim
of the driving instructor that his graduates
had faster reaction times.

73
 A diet clinic states that there is an average
loss of 24 pounds for those who stay on the
program for 20 weeks. The standard
deviation is 5 pounds. The clinic tries a new
diet, reducing salt intake to see whether that
strategy will produce a greater weight loss. A
group of 40 volunteers loses an average of
16.3 pounds each over 20 weeks. Should the
clinic change the old diet? Use α = 0.05

74
Quiz
 A study claims that all adults spend an average
of 8 hours on chores during a weekend. A
researcher wanted to check if this claim is
true. A random sample of 200 adults taken by
this researcher showed that these adults
spend an average of 8.20 hours on chores
during a weekend with a standard deviation of
2.1 hours. Using the 1% significance level, can
you conclude that the claim that all adults
spend an average of 8 hours on chores during
a weekend is false?

75
 The label on a can of pineapple slices states
that the mean carbohydrate content per
serving of canned pineapple is 50 grams. It
may be assumed that the standard deviation
of the carbohydrate content is 4 grams. A
random sample of forty servings has a mean
carbohydrate content of 52.3 grams. Is the
company correct in its claim? Use α = 0.05

76
The t-test
• The t-test is a statistical test for the mean of
a population and is used when the population
is normally or approximately normally
distributed, the population standard deviation is unknown, and
n < 30. The formula for the t test is

$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

where s is the sample standard deviation.

• The degrees of freedom are d.f. = n – 1

77
 The formula for the t test is similar to the z
test. But since the population standard
deviation is unknown, the sample standard
deviation is used. The critical values for a t
test are found in a t table.

78
 In order to increase customer service, a
muffler repair shop claims its mechanics can
replace a muffler in 12 minutes. A time
management specialist selected six repair
jobs and found their mean time to be 11.6
minutes. The standard deviation of the
sample was 2.1 minutes. At α = 0.025, is
there enough evidence to conclude that the
mean time in changing a muffler is less than
12 minutes?
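
A sketch of the computation for the muffler problem follows, using t = (x̄ – μ)/(s/√n) with n – 1 degrees of freedom. The standard library has no t distribution, so the critical value 2.571 (α = 0.025, one tail, d.f. = 5) is entered by hand as read from a t table; the function name `one_sample_t` is illustrative.

```python
from math import sqrt

def one_sample_t(sample_mean, hypothesized_mean, s, n):
    """Return (t, degrees of freedom) with t = (x_bar - mu) / (s / sqrt(n))."""
    t = (sample_mean - hypothesized_mean) / (s / sqrt(n))
    return t, n - 1

# Muffler problem: mu = 12 minutes, x_bar = 11.6, s = 2.1, n = 6.
t, df = one_sample_t(11.6, 12, 2.1, 6)

t_critical = -2.571               # left-tailed, alpha = 0.025, d.f. = 5 (from a t table)
reject_null = t < t_critical
```

The statistic is about -0.47, which does not fall in the rejection region, so there is not enough evidence that the mean time is less than 12 minutes.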

79
 A new laboratory technician read a report that
the average number of students using the
computer laboratory per hour was 16. To test
this hypothesis, he selected a day at random
and kept track of the number of students who
used the lab over an eight-hour period. The
results were as follows:
20,24,18,16,16,19,21,23
At α = 0.05, can the technician conclude that
the average is actually 16?

80
 In a certain city, a researcher wishes to
determine whether the average age of its
citizens is really 61.2 years. A sample of 22
residents has an average age of 59.8. The
standard deviation of the sample is 1.5 years.
At α = 0.01, is the average age of the
residents really 61.2 years?

81
 A machine is designed to fill jars with 16
ounces of coffee. A consumer suspects that
the machine is not filling the jars completely.
A sample of 8 jars was observed to have the
following content (in ounces)
14,16,15,13,16,17,14,18
Is there enough evidence to support the
consumer’s claim at α = 0.10?

82
Sample Size
 It is necessary to determine the size of the
sample to make an accurate estimate. It
depends on the maximum error of estimate,
the population standard deviation, and the
degree of confidence.

83
Example
• A university dean wishes to estimate the
average number of hours his part-time
instructors teach per week. The standard
deviation from a previous study is 2.6 hours.
How large a sample must be selected if he
wants to be 99% confident that
the true mean differs from the sample mean
by at most 1 hour?
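
The slides do not state the formula, so the sketch below assumes the standard one for estimating a mean, n = (z·σ/E)², where z is the critical value for the chosen confidence level, σ is the population standard deviation, and E is the maximum error of estimate. The result is rounded up to the next whole number. The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n with z * sigma / sqrt(n) <= max_error (assumed standard formula)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.58 for 99% confidence
    return ceil((z * sigma / max_error) ** 2)

# Dean's problem: sigma = 2.6 hours, maximum error 1 hour, 99% confidence.
n = sample_size_for_mean(sigma=2.6, max_error=1, confidence=0.99)
```

Under these assumptions the dean would need a sample of 45 instructors.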

84
 A health care professional wishes to estimate
the birth weights of infants. How large a
sample must she select if she desires to be
99% confident that the true mean is within 10
ounces of the sample mean? The standard
deviation of the birth weights is known to be
4 ounces.

85
Test of Hypothesis (Two Populations)
The null hypothesis under test is
Ho: μ1 = μ2
The test statistic appropriate for the purpose is

$z = \frac{\bar{x} - \bar{y}}{\sqrt{\sigma_1^2/m + \sigma_2^2/n}}$

where:
m = sample size of group 1
n = sample size of group 2
$\bar{x}$ = mean of group 1
$\bar{y}$ = mean of group 2
$\sigma_1^2$, $\sigma_2^2$ = variances of groups 1 and 2

86
Example:
 A placement exam in mathematics was given to
15 students who had a modern math background and to
10 students who had a traditional math
background. The mean score of the modern
math students was 90 points and that of the
traditional math students was 94 points.
Suppose it may be assumed that the variances
of the scores for modern math and traditional
math are known and are, respectively, 22 and
13. At the 5 percent level of significance, do
the two groups differ significantly?
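
A sketch of the two-sample z statistic applied to the placement exam data follows. The function name `two_sample_z` is illustrative, and the test is taken as two-tailed because the question asks only whether the groups differ.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean_x, mean_y, var_x, var_y, m, n):
    """z = (x_bar - y_bar) / sqrt(var_x / m + var_y / n), variances known."""
    return (mean_x - mean_y) / sqrt(var_x / m + var_y / n)

# Modern math: m = 15, mean 90, variance 22. Traditional math: n = 10, mean 94, variance 13.
z = two_sample_z(90, 94, 22, 13, 15, 10)

critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed, 5 percent level
differ = abs(z) > critical
```

The statistic is about -2.40, beyond the critical value of about 1.96, so the two groups differ significantly at the 5 percent level.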

87
 Suppose there are two normally distributed
populations. One population has variance of
12, and when a sample of size 22 was picked
from it, the mean x was 9. The second
population has a variance of 14, and a sample
of 18 yielded a mean y of 7.8. Assuming that
the samples are independent, at the 10
percent level of significance, test the null
hypothesis Ho: μ1 = μ2 against the alternative
hypothesis that μ1 > μ2

88
 The following figures refer to the amounts of
carbohydrates (in grams) per serving of two
varieties of canned peaches.
Variety A: 32 32 27 38 29
Variety B: 27 28 33 30 26 29 28
Based on this sample data, at the 5 percent
level of significance, do the two varieties
significantly differ in their true mean
carbohydrate content?

89
QUIZ
 A company has two branches, one on the East
Coast and the other on the West Coast. The
mean daily sales of the company on the East
Coast, when observed on 150 days were
found to be P250,000 with sd of P35,000. For
the company on the West Coast, the mean
daily sales when observed for 200 days were
P245,000 with sd of P25,000. At the 1
percent level of significance, do the two
branches differ in their true mean daily sales?

90
• Two types of soft drinks, a non-cola and a
cola, were tested for their amount of glucose
(g/100). The findings are as follows:

Soft drink   Amount of glucose
non-cola     4.50 4.25 4.72 4.65 4.15 4.30 5.00 5.47
cola         4.60 4.00 4.55 4.45 3.20 3.80

At the 5 percent level of significance, is the
true mean amount of glucose for non-cola
significantly greater than that for cola?
91
Paired t-Test
• The paired t-test reduces the
problem of comparing the means of two
populations to that of a one-sample t-test.

• A common method of obtaining matched
pairs is to match an experimental unit with
itself, leading to what is widely referred to as
a “before-after” study.

92
$t = \frac{\bar{d}}{s_d/\sqrt{n}}$
where
$\bar{d} = \frac{\sum d}{n}$, d = difference of the pair

$s_d = \sqrt{\frac{\sum d^2 - (\sum d)^2/n}{n - 1}}$

93
• Example: The scores of eight students before and after
special coaching are given below. Did the coaching benefit the students, that is, are the
students' scores higher after coaching?
Student   Before coaching   After coaching   d = after – before   d2
1         91                82               -9                   81
2         78                80               2                    4
3         47                62               15                   225
4         37                49               12                   144
5         64                55               -9                   81
6         54                73               19                   361
7         43                59               16                   256
8         33                58               25                   625
                                             ∑d = 71              ∑d2 = 1777
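
The paired t statistic for the coaching data can be sketched as follows. The code applies t = d̄/(s_d/√n) with s_d computed from ∑d and ∑d²; the variable names are illustrative, and the decision still requires a critical value from a t table with n – 1 = 7 degrees of freedom.

```python
from math import sqrt

before = [91, 78, 47, 37, 64, 54, 43, 33]
after = [82, 80, 62, 49, 55, 73, 59, 58]

d = [a - b for a, b in zip(after, before)]  # d = after - before
n = len(d)
sum_d = sum(d)                   # sum of the differences
sum_d2 = sum(x * x for x in d)   # sum of the squared differences

d_bar = sum_d / n                                   # mean difference
s_d = sqrt((sum_d2 - sum_d ** 2 / n) / (n - 1))     # standard deviation of d
t = d_bar / (s_d / sqrt(n))
```

With ∑d = 71 and ∑d² = 1777, the mean difference is 8.875 and t is approximately 1.96.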
