Advance Research and Statistics

by sir lee knows
The Importance of Statistics
It is obvious that society can’t be run effectively on the
basis of hunches or trial and error, and that in business and
economics much depends on the correct analysis of
numerical information. Decisions based on data will provide
better results than those based on intuition or gut feelings.
What applies to this wider world also applies to undertaking
research into it. Learning to use statistics in your studies and
research will have a wider benefit than helping you towards a
qualification. Once you have mastered the
language and some of the techniques in order to make sense
of your investigation, you will have supplied yourself with a
knowledge and understanding that will enable you to cope
with the information you will encounter in your everyday life.
Statistical thinking permeates all social interaction.
For example, take these statements:
*‘The earlier you start thinking about the topic of your
research project, the more likely it is that you will produce
good work.’
*‘You will get more reliable information about that from a
refereed academic journal than a newspaper.’
Or these questions:
*‘Which university should I go to?’
*‘Should I buy a new car or a second-hand one?’
*‘Should the company buy this building or just rent it?’
*‘Should we invest now or wait till the new financial year?’
*‘When should we launch our new product?’
All of these require decisions to be made, all have costs
and benefits (either financial or emotional), all are based upon
different amounts of data, and all involve or necessitate some
kind of statistical calculation. This is where an understanding
of statistics (knowledge of statistical techniques) and research
will come in handy.
Why You Need to Use Statistics in Research
Much of everyday life depends on making forecasts, and
business can’t progress without being able to audit change or
plan action. In your research, you may be looking at areas
such as purchasing, production, capital investment, long-term
development, quality control, human resource development,
recruitment and selection, marketing, credit risk assessment
or financial forecasts, among others. That is why the informed
use of statistics is of direct importance to you while you are
collecting your data and analysing them. If nothing else, your
results and findings will be more accurate, more believable
and, consequently, more useful.
Some of the reasons why you will be using statistics to
analyse your data are the same reasons why you are doing
the research. Ignoring the possibility that you are researching
because the project or dissertation element of your
qualification is compulsory, rather than because you very
much want to find something out, you are likely to be
researching because you want to:
* measure things;
* examine relationships;
* make predictions;
* test hypotheses;
* construct concepts and develop theories;
* explore issues;
* explain activities or attitudes;
* describe what is happening;
* present information;
* make comparisons to find similarities and differences;
* draw conclusions about populations based only on
sample results.
If you didn’t want to do at least one of these things, there
would be no point in doing your research at all.
WHAT IS STATISTICS?
Statistics is a range of procedures for gathering, organising,
analysing and presenting quantitative data. ‘Data’ is the
term for facts that have been obtained and subsequently
recorded, and, for statisticians, ‘data’ usually refers to
quantitative data, that is, numbers.
Statistics is a scientific approach to analysing numerical
data in order to enable us to maximise our
interpretation, understanding and use. This means that
statistics helps us turn data into information; that is, data
that have been interpreted, understood and are useful to
the recipient.
Statistics is defined as a science that deals with the
collection, presentation, analysis and interpretation of
quantitative or numerical data.
Division of Statistics
1.Descriptive Statistics refers to the gathering,
tabulation and organization of data. It discusses the
characteristics, attributes, position and dispersion of
data.
2. Inferential Statistics is the logical process from
sample analysis to a generalization or conclusion about a
population.
Sources of Data
1.Primary Data are data that come from an original
source, and are intended to answer specific research
questions.
2. Secondary Data are data that are taken from
previously recorded data.
Types of Data
1. Qualitative Data are data that merely describe the
subject being studied. They depict characteristics,
attributes or descriptions of certain behaviours.
2. Quantitative Data are data that can be
measured or counted.
Constants and Variables
Constant. A constant is a characteristic of objects,
people or events that does not vary.
Variable. A variable is a characteristic of objects,
people or events that does vary.
Classification of Variables
Experimental Classification. A researcher may
classify variables according to the functions they
serve in the experiment.
1. Independent Variable (explanatory variable) is a
variable controlled by the experimenter/researcher, and
expected to have an effect on the behaviour of the subject.
2. Dependent Variable is some measure of the behaviour of
subjects and expected to be influenced by the independent
variable.
Mathematical Classification. Variables may also be classified
in terms of the mathematical values they may take on within
a given interval.
1. Continuous variable is a variable which can assume any of
an infinite number of values, and can be associated with
points on a continuous line interval.
2. Discrete variable is a variable which consists of either a
finite or a countably infinite number of values.
Levels of Data/Measurement
1. Nominal data is used to differentiate classes or
categories for purely classification or identification purposes.
It is the weakest form of measurement because no
attempt can be made to account for differences within
the particular category or to specify any ordering or
direction across the various categories. Nominal data are
discrete variables.
2. Ordinal data is used in ranking. Though ordinal data
is somewhat stronger than nominal data, it is still a weak
form of measurement because no meaningful numerical
statements can be made about differences between the
categories. The ordering implies only which category is
“greater” or “lesser” – not how much “greater” or “lesser”.
These are also discrete variables.
3. Interval data is used to classify, order and differentiate
between classes or categories in terms of degrees of
difference. Interval data are either discrete or
continuous variables.
4. Ratio data differs from interval data only in one aspect; it
has a true zero point. It represents distances from a natural
origin like the length, weight, height, etc. Ratio data are
discrete or continuous variables.
Characteristics of Levels of Data/Measurement
Nominal – indicates distinction.
Ordinal – indicates distinction; indicates the direction of the
distinction.
Interval – indicates distinction; indicates the direction of the
distinction; indicates the amount of distinction.
Ratio – indicates distinction; indicates the direction of the
distinction; indicates the amount of distinction;
indicates an absolute zero.
POPULATION AND SAMPLE
A population or a universe refers to the totality of
objects being considered in the study.
A researcher must be very clear on who his population
will be, and from where he will choose his samples.
There are situations where the whole population is also the
sample.
A sample is referred to by most researchers as the
respondents. Samples are part of the population. Since the
sample is representative of the population, important
conclusions about the population can be inferred from the
analysis of the sample; this is called statistical inference.
Ways of Obtaining Samples

In obtaining samples, it is very important that the
process used is valid, since the conclusions to be made will
be affected by it.
1. Percentage Method.
Example: How many respondents should a researcher
use if the population is 5,000 and he is required to use
3%?
Answer: 5,000 x 0.03 = 150 respondents.
2. Slovin’s Method. A method that uses a margin of error:
the researcher sets the percentage of error that he is
willing to tolerate in the study.
The margin of error ranges from 1% to 10%.

Examples.
1. How many respondents are needed by the researcher if
he would like to study the perception of farmers on the
application of technology, if there are about 1,500 farmers
and he wants to employ a 5% margin of error?
2. How precise, in percent, is the study if 85 students
were interviewed out of a population of 220?
Formula:
$$n = \frac{N}{1 + Ne^2}$$
where: n = sample
N = population
e = margin of error
Solutions:
1.
$$n = \frac{1{,}500}{1 + 1{,}500(0.05)^2} = \frac{1{,}500}{1 + 3.75} = \frac{1{,}500}{4.75}$$
n = 315.79 farmers or 316 farmers

2. Given:
n = 85 N = 220
Using the formula:
$$85 = \frac{220}{1 + 220e^2}$$
$$85(1 + 220e^2) = 220$$
$$85 + 18{,}700e^2 = 220$$
$$18{,}700e^2 = 220 - 85$$
$$e^2 = \frac{135}{18{,}700} = 0.0072$$
e = 0.085 or 8.5%
Since the problem is asking for the precision, then
100% - 8.5% = 91.5%. Thus, the sample of 85 out of a
population of 220 is 91.5% precise.
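The two Slovin computations above can be sketched in Python. This is a minimal illustration, not part of the original notes: the function names are hypothetical, and rounding the sample size up to the next whole respondent is an assumption that matches the 316 and 381 used in these notes.

```python
import math


def slovin_sample_size(population: int, margin_of_error: float) -> int:
    """Sample size n = N / (1 + N * e^2), rounded up (assumed rounding rule)."""
    n = population / (1 + population * margin_of_error ** 2)
    return math.ceil(n)


def slovin_margin_of_error(population: int, sample: int) -> float:
    """Solve n = N / (1 + N * e^2) for e, giving e = sqrt((N - n) / (n * N))."""
    return math.sqrt((population - sample) / (sample * population))


farmers = slovin_sample_size(1500, 0.05)
e = slovin_margin_of_error(220, 85)
precision = 1 - e

print(farmers)              # sample size for 1,500 farmers at 5%
print(round(e, 3))          # margin of error for 85 out of 220
print(round(precision, 3))  # precision as defined in the notes (1 - e)
```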
SAMPLING TECHNIQUES
A. RANDOM SAMPLING is a process wherein all members
of the population have an equal chance of being selected.
It is also called probability sampling.
1. Simple Random Sampling is a process of selecting
a sample of size n via random numbers or through lottery.
2. Systematic Sampling is a process of selecting every kth
element in the population until the desired number of
respondents is attained.
3. Stratified Sampling is a process of subdividing the
population into subgroups or strata and drawing members
at random from each subgroup or stratum.
4. Cluster Sampling is a process of selecting clusters
from a population which is very large or widely spread out
over a wide geographical area.
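As a rough sketch of the first two techniques, the Python below draws a simple random sample and a systematic sample from a hypothetical list of 100 ID numbers. The function names, the population and the choice k = N // n are assumptions made for illustration only.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Lottery-style draw: every member has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=None):
    """Take every k-th member after a random start, with k = N // n (assumed)."""
    rng = random.Random(seed)
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]


members = list(range(1, 101))  # hypothetical population of 100 ID numbers
srs = simple_random_sample(members, 10, seed=1)
systematic = systematic_sample(members, 10, seed=1)

print(srs)
print(systematic)
```

Passing a seed only makes the draw repeatable for checking; in an actual study the draw would normally be left unseeded.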
STRATIFIED RANDOM SAMPLING
Below is the population of five (5) selected barangays of
Lipa City.
Brgy    Population   Strata    Sample
1       2,150        215/800   215/800 x 381 = 102.39 or 102
2       1,240        124/800   124/800 x 381 = 59.06 or 59
3       3,400        34/80     34/80 x 381 = 161.93 or 162
4       900          9/80      9/80 x 381 = 42.86 or 43
5       310          31/800    31/800 x 381 = 14.76 or 15
Total   8,000                  381
The total sample of 381 comes from Slovin’s formula with a
5% margin of error:
$$n = \frac{8{,}000}{1 + 8{,}000(0.05)^2} = \frac{8{,}000}{1 + 20}$$
n = 380.95 or 381 community members
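The proportional allocation in the barangay table can be checked with a short Python sketch. The function name is hypothetical, and rounding each share to the nearest whole number is an assumption; for other data the rounded shares may not add up exactly to the total sample.

```python
def proportional_allocation(strata_sizes, total_sample):
    """Give each stratum a share of the sample proportional to its population."""
    total_population = sum(strata_sizes.values())
    return {
        name: round(size / total_population * total_sample)
        for name, size in strata_sizes.items()
    }


# Barangay populations from the table above; 381 is the Slovin sample size.
barangays = {1: 2150, 2: 1240, 3: 3400, 4: 900, 5: 310}
allocation = proportional_allocation(barangays, 381)

for brgy, sample in allocation.items():
    print(brgy, sample)
print("Total", sum(allocation.values()))
```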

B. NON-RANDOM SAMPLING is a sampling procedure
where samples are selected in a manner with little or no
attention to randomization. It is also called non-probability
sampling.
1. Convenience Sampling is a process of selecting a group
of individuals who (conveniently) are available for study.
2. Purposive Sampling is a process of using judgment to
select a sample which the researcher believes, based on
prior information, will provide the data needed. The
disadvantage of this technique is
that the researcher’s judgment may be in error – he may
not be correct in estimating the representativeness of a
sample or their expertise regarding the information
needed. It is also called judgment sampling.
3. Quota Sampling is applied when a researcher collects
information from an assigned number, or quota, of
individuals from one of several sample units fulfilling
certain prescribed criteria or belonging to one stratum. The
advantage of this technique is that it is cheaper to
administer.
4. Snowball Sampling is a technique in which one or more
members of a population are located and used to lead the
researcher to other members of the population.
Methods of Collecting Data
1. Direct or Interview Method. It is a face-to-face encounter
between the interviewer and the interviewee. The interview
may vary according to the preference of either or both
parties. However, this type is time-consuming, expensive,
and has limited field coverage.
2. Indirect or Questionnaire Method. This method utilizes
questionnaires to obtain information.
3. Registration Method. This method of gathering information
is governed by laws.
4. Observation Method. Data pertaining to the behaviors of
an individual or a group of individuals at the time of
occurrence of a given situation are best obtained by
observation. One limitation of this method is that
observation can be made only at the time of occurrence of
the appropriate events.
5. Experiment Method. This is used to determine the cause
and effect relationship of certain phenomena under
controlled conditions.
WAYS OF PRESENTING DATA
1. TEXTUAL PRESENTATION. Data collected are presented
in paragraph form if they are purely qualitative or when there
are very few numbers involved. This is always adopted
particularly after presenting a table, wherein the researcher
describes the things found in the table.
2. TABULAR PRESENTATION. The more effective way of
presenting the data is by means of a table that appears in the
form of rows and columns. Data presented in tabular form
can be easily used for comparison and emphasis. One can
simply draw relationships from the presented table.
Table 1
Frequency Distribution of Respondents’ Sex
Sex      Frequency   Percentage
Male     20          40
Female   30          60
Total    50          100
3. GRAPHICAL PRESENTATION. The data are presented
in visual form. Graphs may appear in many forms.
TABULAR PRESENTATION OF DATA
ARRAY is the simplest arrangement of data. It is merely a
list of the scores from highest to lowest or from lowest to
highest.
Ex. Arrange the following scores:
18, 10, 6, 15, 23, 22, 11, 12, 14, 17, 19, 16, 7, 10
Ascending: 6, 7, 10, 10, 11, 12, 14, 15, 16, 17, 18, 19, 22, 23
Descending: 23, 22, 19, 18, 17, 16, 15, 14, 12, 11, 10, 10, 7, 6
RANKING OF SCORES is important in order to identify the
position of an observation, an individual or an object in
relation to the others in the group according to some
characteristic, such as magnitude, quality or importance.
Rank symbols are denoted by numbers; thus, a rank of 1 is
given to the highest score, a rank of 2 to the next, etc.
Example: Below are scores of selected students in Algebra.
15 12 12 12 18 11 15 12 19 16
16 18 17 16 14 14 14 12 11 16
15 12 12 15 11 20 21 20 23 11
16 16 16 16 19 19 19 19 17 17
Rank the scores. Tied scores share the average of the
positions they occupy.
Scores      Frequency   Number (positions)               Rank
23          1           1                                1
21          1           2                                2
20          2           3, 4                             3.5
19          5           5, 6, 7, 8, 9                    7
18          2           10, 11                           10.5
17          3           12, 13, 14                       13
16          8           15, 16, 17, 18, 19, 20, 21, 22   18.5
15          4           23, 24, 25, 26                   24.5
14          3           27, 28, 29                       28
12          7           30, 31, 32, 33, 34, 35, 36       33
11          4           37, 38, 39, 40                   38.5
Total (N)   40
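The ranking table above can be reproduced with a small Python sketch, in which tied scores share the mean of the positions they occupy. The function name is hypothetical; the scores are the forty Algebra scores from the example.

```python
from collections import Counter


def average_ranks(scores):
    """Map each distinct score to its rank, highest score first.

    Tied scores receive the average of the positions they occupy.
    """
    counts = Counter(scores)
    ranks = {}
    position = 1  # next unused position, counting down from the top score
    for score in sorted(counts, reverse=True):
        f = counts[score]
        # positions position .. position + f - 1 average to position + (f - 1) / 2
        ranks[score] = position + (f - 1) / 2
        position += f
    return ranks


scores = [
    15, 12, 12, 12, 18, 11, 15, 12, 19, 16,
    16, 18, 17, 16, 14, 14, 14, 12, 11, 16,
    15, 12, 12, 15, 11, 20, 21, 20, 23, 11,
    16, 16, 16, 16, 19, 19, 19, 19, 17, 17,
]
ranks = average_ranks(scores)

for score, rank in ranks.items():
    print(score, rank)
```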
Limitations to Ranking
1.Rank symbols provide for only limited amount of comparison.
2. Rank symbols cannot indicate the extent of difference between
adjacent ranks.
3. Rank symbols are limited as to what can be done to them
mathematically.
4. Scale symbols can be changed to rank symbols but rank symbols
cannot be converted into scale symbols.
DESCRIPTIVE STATISTICS
Data gathered to answer the problems posed by
the researcher need to be presented carefully and
systematically.
Components of a Class Frequency Distribution
1. Class intervals.
2. Class frequencies.
3. Class boundaries.
4. Class marks/midpoints.
5. Relative frequencies.
6. Cumulative frequencies.
Example
1. Class Interval - the end numbers (class limits) of an interval
Class Interval/Class Limit
15 - 19
10 - 14
 5 -  9
2. Class Frequency
Class Interval   f
15 - 19          2
10 - 14          7
 5 -  9          1
                 n = 10
3. Class Boundaries - are the true class limits. Each limit is
extended by half the gap between adjacent classes.
Descending Order (gap: 15 - 14 = 1; 1/2 = 0.5)
CI          f   CB
15 - 19     2   14.5 - 19.5
10 - 14     7    9.5 - 14.5
 5 -  9     1    4.5 -  9.5
Ascending Order (gap: 200 - 150 = 50; 50/2 = 25)
CI          f   CB
100 - 150   4    75 - 175
200 - 250   2   175 - 275
300 - 350   4   275 - 375
400 - 450       375 - 475
4. Class Marks - the midpoint (M) of each class
CI          f   CB            M    Values in the class
15 - 19     2   14.5 - 19.5   17   15, 16, 17, 18, 19
10 - 14     7    9.5 - 14.5   12   10, 11, 12, 13, 14
 5 -  9     1    4.5 -  9.5    7
CI          f   CB            M
100 - 150   4    75 - 175     125
200 - 250   2   175 - 275     225
300 - 350   4   275 - 375     325
5. Relative Frequencies (F)
CI        f    CB            M    F     %F
15 - 19   2    14.5 - 19.5   17   0.2   20
10 - 14   7     9.5 - 14.5   12   0.7   70
 5 -  9   1     4.5 -  9.5    7   0.1   10
n = 10                            1.0   100
6. Cumulative Frequencies (CF): less than (<F) and greater than (>F)
CI        f    CB            M    F     %F   <F   >F
15 - 19   2    14.5 - 19.5   17   0.2   20   10   2
10 - 14   7     9.5 - 14.5   12   0.7   70   8    9
 5 -  9   1     4.5 -  9.5    7   0.1   10   1    10
n = 10
Constructing Frequency Distribution Table
Steps:
1. Determine the range of values. R = HS - LS (highest score
minus lowest score).
2. Determine the desired tentative number of classes (TNC).
3. Compute the desired class interval.
a. if the desired TNC is given: I = R / TNC
b. if the TNC is not given: I = R / (1 + 3.322 log n)
4. Formulate a frequency table of class intervals, using the
lowest value as the lower limit of the first class interval.
5. Get the number of data (frequency) for every class
interval.
6. Compute the class mark/class midpoint of each class
interval.
7. Get the class boundaries.
8. Obtain the relative frequencies (F).
9. Find the greater than (>F) and less than (<F) cumulative
frequencies.
Example: Below are ages (in years) of selected employees of
a certain school in Lipa City.
19 21 23 34 38 36 25 28 21 23 25 29 39 42 48 50 52
28 27 22 25 26 29 38 39 42 40 43 34 36 51 49 48 47
23 20 25 27 29 33 32 39 39 38 19 54 50 41 42 48 36
26 32 28 27 23 25 23 24 29 36 42 51 48 49 39 25 28
Prepare a CFDT using:
a. six (6) as the tentative number of classes (TNC), in descending order.
b. no TNC, in ascending order.
Follow the Steps in Constructing CFDT
a. TNC = 6 and Descending Order
1st: R = Highest Age - Lowest Age; R = 54 - 19; R = 35
2nd: TNC = 6
3rd: Interval = R/TNC; I = 35/6; I = 5.83 or 6
4th: Formulate a table in descending order.
5th: Make the lowest age the lower limit of the lowest class.
CL/CI     f    CB            M/X    F        %F       <F   >F
49 - 54   8    48.5 - 54.5   51.5   0.1176   11.76    68   8
43 - 48   6    42.5 - 48.5   45.5   0.0882    8.82    60   14
37 - 42   14   36.5 - 42.5   39.5   0.2059   20.59    54   28
31 - 36   9    30.5 - 36.5   33.5   0.1324   13.24    40   37
25 - 30   19   24.5 - 30.5   27.5   0.2794   27.94    31   56
19 - 24   12   18.5 - 24.5   21.5   0.1765   17.65    12   68
Total     68                        1.0000   100.00

Follow the Steps in Constructing CFDT
b. No TNC and Ascending Order
1st: R = Highest Age - Lowest Age; R = 54 - 19; R = 35
2nd: No TNC
3rd: Solve the class interval
Formula:
$$I = \frac{\text{Range}}{1 + 3.322 \log n} = \frac{35}{1 + 3.322 \log 68} = \frac{35}{1 + 6.09}$$
I = 4.94 or 5
4th: Formulate a table in ascending order.
5th: Make the lowest age the lower limit of the lowest class.
CL/CI     f    CB            X    F        %F       <F   >F
19 - 23   11   18.5 - 23.5   21   0.1618   16.18    11   68
24 - 28   16   23.5 - 28.5   26   0.2353   23.53    27   57
29 - 33   7    28.5 - 33.5   31   0.1029   10.29    34   41
34 - 38   9    33.5 - 38.5   36   0.1324   13.24    43   34
39 - 43   12   38.5 - 43.5   41   0.1765   17.65    55   25
44 - 48   5    43.5 - 48.5   46   0.0735    7.35    60   13
49 - 53   7    48.5 - 53.5   51   0.1029   10.29    67   8
54 - 58   1    53.5 - 58.5   56   0.0147    1.47    68   1
Total     68                      1.0000   100.00
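The steps for constructing both tables can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's procedure verbatim: the function names are hypothetical, the data are whole numbers (so class boundaries sit 0.5 beyond the class limits), and the class interval is rounded to the nearest whole number.

```python
import math

ages = [
    19, 21, 23, 34, 38, 36, 25, 28, 21, 23, 25, 29, 39, 42, 48, 50, 52,
    28, 27, 22, 25, 26, 29, 38, 39, 42, 40, 43, 34, 36, 51, 49, 48, 47,
    23, 20, 25, 27, 29, 33, 32, 39, 39, 38, 19, 54, 50, 41, 42, 48, 36,
    26, 32, 28, 27, 23, 25, 23, 24, 29, 36, 42, 51, 48, 49, 39, 25, 28,
]


def class_interval(data, tnc=None):
    """I = R / TNC if a TNC is given, else I = R / (1 + 3.322 log n)."""
    r = max(data) - min(data)
    if tnc is None:
        tnc = 1 + 3.322 * math.log10(len(data))
    return round(r / tnc)


def frequency_table(data, width):
    """Build the rows in ascending order, starting at the lowest value."""
    rows, lower, cumulative = [], min(data), 0
    while lower <= max(data):
        upper = lower + width - 1
        f = sum(lower <= x <= upper for x in data)
        cumulative += f
        rows.append({
            "class": (lower, upper),
            "f": f,
            "boundaries": (lower - 0.5, upper + 0.5),
            "mark": (lower + upper) / 2,
            "relative": f / len(data),
            "less_than_cf": cumulative,
        })
        lower = upper + 1
    return rows


table_a = frequency_table(ages, class_interval(ages, tnc=6))
table_b = frequency_table(ages, class_interval(ages))

for row in table_b:
    print(row["class"], row["f"], row["mark"], round(row["relative"], 4))
```

Reversing the list of rows gives the descending layout used in part a.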
MEASURES OF CENTRAL TENDENCY
The measures of central tendency (MCT) show the location
or position of the data in the given distribution.
MOST COMMONLY USED MCT
1. ARITHMETIC MEAN/MEAN (Mn) is synonymous with the
average.
µ = population mean
x̄ = sample mean
2. MEDIAN (Md) is the middlemost value of the arranged data.
3. MODE (Mo) is the most frequent value.

UNGROUPED DATA
Measures of Central Tendency
Examples: Below are scores of selected students in a test in
Mathematics.
A. 15, 12, 13, 19, 10, 6, 7, 14, 11, 15, 18, 17, 11, 11, 10
Required: Compute the mean, median and mode.
Solutions:
1st: Mean
$$Mn = \frac{15 + 12 + 13 + \dots + 10}{15} = \frac{189}{15} = 12.6$$
2nd: Median. Arrange the scores:
6, 7, 10, 10, 11, 11, 11, 12, 13, 14, 15, 15, 17, 18, 19
Position = (15 + 1)/2 = 8th score; Mdn = 12
3rd: Mode
Mo = 11
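The same three measures can be checked with Python's standard `statistics` module. This is only a quick verification sketch of the worked example above.

```python
import statistics

scores = [15, 12, 13, 19, 10, 6, 7, 14, 11, 15, 18, 17, 11, 11, 10]

mean = statistics.mean(scores)      # sum of the scores divided by 15
median = statistics.median(scores)  # middle score of the sorted list
mode = statistics.mode(scores)      # most frequent score

print(mean, median, mode)
```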
Seatwork: Compute the measures of central tendency of the
given set of data.
A. 10,14,14,14,15,16,16,16,17,17,18,18,19,20 (Male)
B. 6,7,9,10,12,13,15,16,17,19,20,21,22,23 (Female)
WEIGHTED MEAN
Another way of solving the mean, wherein the weights of the
scores/data are considered.
Example: 1. If a final examination in a class in Statistics
is given the weight of 5, class standing the weight of 2 and
the average of quizzes the weight of 4, and a student got the
grades of 88, 93 and 82, respectively, what would be the
student’s:
a. unweighted grade,
b. weighted grade,
c. median grade, and
d. modal grade?
Solutions:
a. Unweighted Mean
$$Mn = \frac{88 + 93 + 82}{3} = \frac{263}{3} = 87.67$$
b. Weighted Mean
$$WM = \frac{88(5) + 93(2) + 82(4)}{5 + 2 + 4} = \frac{954}{11} = 86.73$$
c. Median
Arrange first the data, repeating each grade according to its
weight: 82, 82, 82, 82, 88, 88, 88, 88, 88, 93, 93
Position = (11 + 1)/2 = 6th data; Mdn = 88
d. Mode:
Mo = 88
2. Below are the final grades of Student A during the second
semester of SY 2019 – 2020 in six subjects with the
number of credit units for each subject.
Subject        Credit Units   Grade
Accounting 1   6              2.25
Chemistry      5              1.75
Math 1         3              1.50
English 1      3              1.25
NSTP 102       2              1.25
PE 2           1              1.50
Required: Compute the mean grades (unweighted and
weighted), median grade and modal grade.
Solutions:
a. Unweighted Mean
$$Mn = \frac{2.25 + 1.75 + 1.50 + 1.25 + 1.25 + 1.50}{6} = \frac{9.5}{6} = 1.58$$
b. Weighted Mean (grades with the same value are grouped, so
1.5 carries 3 + 1 = 4 units and 1.25 carries 3 + 2 = 5 units)
$$WM = \frac{6(2.25) + 5(1.75) + 4(1.5) + 5(1.25)}{6 + 5 + 4 + 5} = \frac{34.5}{20}$$
WM = 1.725
c. Median:
1st: 1.25, 1.25, 1.25, 1.25, 1.25, 1.5, 1.5, 1.5, 1.5, 1.75,
1.75, 1.75, 1.75, 1.75, 2.25, 2.25, 2.25, 2.25, 2.25, 2.25
Position = (20 + 1)/2 = 10.5th data; Mdn = 1.75
d. Mode:
Mo = 2.25 (the grade carrying the most credit units, 6)
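The two weighted-mean examples above follow the same pattern: multiply each grade by its weight, add the products, and divide by the sum of the weights. A minimal Python sketch, with a hypothetical helper name:

```python
def weighted_mean(values, weights):
    """Sum of value x weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Example 1: final exam, class standing, quizzes with weights 5, 2, 4.
wm_statistics = weighted_mean([88, 93, 82], [5, 2, 4])

# Example 2: grades of Student A weighted by credit units.
grades = [2.25, 1.75, 1.50, 1.25, 1.25, 1.50]
units = [6, 5, 3, 3, 2, 1]
wm_student_a = weighted_mean(grades, units)

print(round(wm_statistics, 2))
print(round(wm_student_a, 3))
```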
3. Below is the result of the survey gathered by Ms. Tan in her
thesis entitled “Factors Affecting the Performance of
Selected Employees in Lipa City”. Compute the WM.
A. Salary and Other Financial Benefits. (N = 150)
The employees are motivated by:
Items                                    5    4    3    2    1    WM
1. loyalty pay                           36   74   20   15   5
2. performance-enhancement incentives    99   41   10   0    0
3. Christmas bonus                       30   80   40   0    0
4. clothing allow.                       25   37   58   20   10
5. productivity pay                      98   36   16   0    0
6. across-the-board incentives           87   46   41   0    0
Total/Composite Mean
Solutions: Multiply each frequency by its scale value, add the
products (∑fx) and divide by the total number of responses.
The employees are motivated by:
Items                                    5    4    3    2    1    Total   ∑fx    WM
1. loyalty pay                           36   74   20   15   5    150     571    3.81
2. performance-enhancement incentives    99   41   10   0    0    150     689    4.59
3. Christmas bonus                       30   80   40   0    0    150     590    3.93
4. clothing allow.                       25   37   58   20   10   150     497    3.31
5. productivity pay                      98   36   16   0    0    150     682    4.55
6. across-the-board incentives           87   46   41   0    0    174     742    4.26
Total/Composite Mean                                              924     3771   4.08
Note: the responses recorded for item 6 total 174 rather than
150; its WM is computed from the responses as given.
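For a five-point scale, the weighted mean of an item is the sum of frequency times scale value divided by the number of responses. The sketch below checks the first four items of the survey only, for brevity; the function name and the dictionary layout are assumptions made for illustration.

```python
SCALE = [5, 4, 3, 2, 1]  # scale values, in the column order of the table


def item_weighted_mean(frequencies):
    """Weighted mean of one survey item from its five response counts."""
    total = sum(frequencies)
    weighted_sum = sum(f * x for f, x in zip(frequencies, SCALE))
    return weighted_sum / total


items = {
    "loyalty pay": [36, 74, 20, 15, 5],
    "performance-enhancement incentives": [99, 41, 10, 0, 0],
    "Christmas bonus": [30, 80, 40, 0, 0],
    "clothing allowance": [25, 37, 58, 20, 10],
}
results = {name: round(item_weighted_mean(f), 2) for name, f in items.items()}

for name, wm in results.items():
    print(name, wm)
```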
GROUPED DATA
1. Mean
a. Midpoint Method
Formula:
$$Mn = \frac{\sum fM}{N}$$
where f = frequency; M = midpoint;
and N = number of cases
b. Deviation Method:
$$Mn = X_o + \frac{\sum fd}{N}(i)$$
where: Xo = midpoint of the class of the assumed mean
f = frequency
d = deviation, in class steps, from the class of the assumed mean
N = number of cases
i = interval
2. Median
Formula:
$$Mdn = L_{Mdn} + \frac{N/2 - {<}F}{f_{Mdn}}(i)$$
where: LMdn = lower boundary of the median class
N/2 = half-sum
<F = less than cumulative frequency of the class below the
median class
fMdn = frequency of the median class
i = interval
Ex: Using the example on ages of selected employees of a
certain school in Lipa City.
The first d and fd columns take the assumed mean at 39.5;
the second pair takes it at 51.5.
CL/CI     f    M/X    fM/fX   d    fd    d    fd     CB            <F
49 - 54   8    51.5   412     2    16    0    0      48.5 - 54.5   68
43 - 48   6    45.5   273     1    6     -1   -6     42.5 - 48.5   60
37 - 42   14   39.5   553     0    0     -2   -28    36.5 - 42.5   54
31 - 36   9    33.5   301.5   -1   -9    -3   -27    30.5 - 36.5   40   Mdn
25 - 30   19   27.5   522.5   -2   -38   -4   -76    24.5 - 30.5   31   Mo
19 - 24   12   21.5   258     -3   -36   -5   -60    18.5 - 24.5   12
Total     68          2320         -61        -197
1. Mean
a. Midpoint Method
$$Mn = \frac{2320}{68} = 34.12 \text{ years}$$
b. Deviation Method
With the assumed mean at 51.5 (∑fd = -197):
$$Mn = 51.5 + \frac{-197}{68}(6) = 51.5 - 17.38 = 34.12 \text{ years}$$
With the assumed mean at 21.5 (∑fd = 143):
$$Mn = 21.5 + \frac{143}{68}(6) = 21.5 + 12.62 = 34.12 \text{ years}$$
2. Median:
Compute first the half-sum: 68/2 = 34th age
$$Mdn = 30.5 + \frac{34 - 31}{9}(6) = 30.5 + 2$$
Mdn = 32.5 years
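The grouped mean (both methods) and the grouped median above can be verified with a short Python sketch. The names are hypothetical, and the classes are listed from lowest to highest so that the running total plays the role of the less than cumulative frequency.

```python
# (lower limit, upper limit, frequency), lowest class first
classes = [(19, 24, 12), (25, 30, 19), (31, 36, 9),
           (37, 42, 14), (43, 48, 6), (49, 54, 8)]
width = 6
n = sum(f for _, _, f in classes)


def midpoint(lower, upper):
    return (lower + upper) / 2


# Midpoint method: sum of f x M divided by N
mean_midpoint = sum(f * midpoint(lo, hi) for lo, hi, f in classes) / n


def mean_deviation_method(assumed_index):
    """Xo + (sum of f x d / N) x i, with d counted in class steps."""
    x0 = midpoint(*classes[assumed_index][:2])
    sum_fd = sum(f * (i - assumed_index) for i, (_, _, f) in enumerate(classes))
    return x0 + sum_fd / n * width


def grouped_median():
    half = n / 2
    below = 0  # less than cumulative frequency under the current class
    for lo, hi, f in classes:
        if below + f >= half:
            return (lo - 0.5) + (half - below) / f * width
        below += f


print(round(mean_midpoint, 2))
print(round(mean_deviation_method(5), 2))  # assumed mean at 51.5
print(round(mean_deviation_method(0), 2))  # assumed mean at 21.5
print(grouped_median())
```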
3. Mode.
Formula:
$$Mo = L_{Mo} + \frac{f_{Mo} - f_1}{2f_{Mo} - f_1 - f_2}(i)$$
where: LMo = lower boundary of the modal class;
fMo = frequency of the modal class
f1 = second-highest frequency
f2 = third-highest frequency
i = interval
Solution:
$$Mo = 24.5 + \frac{19 - 14}{2(19) - 14 - 12}(6) = 24.5 + \frac{5}{12}(6)$$
Mo = 24.5 + 2.5 ; Mo = 27 years
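A minimal Python sketch of the mode computation follows. It takes f1 and f2 exactly as these notes define them (the second- and third-highest class frequencies); the function name is hypothetical, and the sketch assumes a single modal class and whole-number class limits.

```python
def grouped_mode(classes, width):
    """Mode of grouped data using the formula and definitions in these notes."""
    frequencies = sorted((f for _, _, f in classes), reverse=True)
    f_mo, f1, f2 = frequencies[0], frequencies[1], frequencies[2]
    # lower limit of the modal class (assumes only one class has f_mo)
    lower_limit = next(lo for lo, _, f in classes if f == f_mo)
    lower_boundary = lower_limit - 0.5
    return lower_boundary + (f_mo - f1) / (2 * f_mo - f1 - f2) * width


# (lower limit, upper limit, frequency) for the ages example
classes = [(19, 24, 12), (25, 30, 19), (31, 36, 9),
           (37, 42, 14), (43, 48, 6), (49, 54, 8)]
mode = grouped_mode(classes, 6)
print(mode)
```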

POINT MEASURES
Point Measures, also known as Quantiles or
Fractiles, are positional measures that divide the
distribution into the desired number of equal parts.
MOST COMMON POINT MEASURES
1. QUARTILES. These are positional measures that divide
the distribution into four equal parts. Q1, Q2, Q3, Q4.
2. DECILES. These are positional measures that divide
the distribution into ten equal parts. D1, D2, D3,..., D10.
3. PERCENTILES. These are positional measures that
divide the distribution into one hundred equal parts. P1, P2,
P3,... P100.
UNGROUPED DATA
EXAMPLES: Below are scores of selected students in a quiz in
Advance Statistics.
1. 12, 15, 13, 11, 10, 9, 10, 12, 8, 10, 15, 14.
Compute Q1, Q2, D8 and P60. In each formula, n is the number
of the desired quartile, decile or percentile and N is the
number of scores.
a. Array the scores:
8, 9, 10, 10, 10, 11, 12, 12, 13, 14, 15, 15
b. Q1 = nN/4; Q1 = 1(12)/4 ; Q1 = 3rd score; Q1 = 10
Q2 = nN/4; Q2 = 2(12)/4; Q2 = 6th score; Q2 = 11
c. D8 = nN/10; D8 = 8(12)/10 ; D8 = 9.6th score or 10th
score; D8 = 14
d. P60 = nN/100; P60 = 60(12)/100 ; P60 = 7.2th score or
7th score; P60 = 12
2. Solve the values of Q2, D2 and P20 of the following scores:
12, 12, 14, 15, 16, 17, 18, 19, 19, 20, 21, 22, 23, 24, 25
a. Q2 = nN/4; Q2 = 2(15)/4; Q2 = 7.5th score;
Q2 = (18 + 19)/2; Q2 = 18.5
b. D2 = nN/10; D2 = 2(15)/10 ; D2 = 3rd score; D2 = 14
c. P20 = nN/100; P20 = 20(15)/100 ; P20 = 3rd score;
P20 = 14
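The positional rule used in the two examples above can be sketched in Python. The handling of fractional positions is inferred from the worked answers and is an assumption: a whole-number position picks that score, a position ending in .5 averages the two neighbouring scores, and any other fraction is rounded to the nearest position. The function name is hypothetical, and positions below 1 are not handled.

```python
import math


def positional_measure(scores, k, parts):
    """k-th quantile for `parts` equal parts, at position k * N / parts."""
    data = sorted(scores)
    position = k * len(data) / parts
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:
        return data[lower - 1]                      # exact position
    if fraction == 0.5:
        return (data[lower - 1] + data[lower]) / 2  # halfway between two scores
    return data[round(position) - 1]                # nearest position (assumed)


quiz_1 = [12, 15, 13, 11, 10, 9, 10, 12, 8, 10, 15, 14]
quiz_2 = [12, 12, 14, 15, 16, 17, 18, 19, 19, 20, 21, 22, 23, 24, 25]

q1 = positional_measure(quiz_1, 1, 4)
d8 = positional_measure(quiz_1, 8, 10)
p60 = positional_measure(quiz_1, 60, 100)
q2 = positional_measure(quiz_2, 2, 4)

print(q1, d8, p60, q2)
```

Statistical packages usually interpolate between positions instead, so their results can differ slightly from this rule.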
GROUPED DATA
Quartiles:
$$Q_n = L_{Q_n} + \frac{nN/4 - {<}F}{f_{Q_n}}(i)$$
where: LQn = lower boundary of the desired quartile class
fQn = frequency of the desired quartile class
i = interval
Deciles:
$$D_n = L_{D_n} + \frac{nN/10 - {<}F}{f_{D_n}}(i)$$
where: LDn = lower boundary of the desired decile class
fDn = frequency of the desired decile class
i = interval
Percentiles:
$$P_n = L_{P_n} + \frac{nN/100 - {<}F}{f_{P_n}}(i)$$
where: LPn = lower boundary of the desired percentile class
fPn = frequency of the desired percentile class
i = interval
In all three formulas, <F is the less than cumulative frequency
of the class below the desired class.
Ex: Using the example on ages of selected employees of a certain
school in Lipa City. Find Q3, D6 and P85.
CL/CI     f    CB            <F
49 - 54   8    48.5 - 54.5   68
43 - 48   6    42.5 - 48.5   60   P85
37 - 42   14   36.5 - 42.5   54   Q3/D6
31 - 36   9    30.5 - 36.5   40
25 - 30   19   24.5 - 30.5   31
19 - 24   12   18.5 - 24.5   12
Total     68
Solution:
a. Q3
1st: Find the value of nN/4.
Q3 = 3(68)/4; Q3 = 51st age
2nd: Use the table to look where the 51st age belongs.
3rd: Use the formula to solve Q3
$$Q_3 = L_{Q_3} + \frac{nN/4 - {<}F}{f_{Q_3}}(i) = 36.5 + \frac{51 - 40}{14}(6) = 36.5 + 4.71$$
Q3 = 41.21 years
b. D6
1st: Find the value of nN/10.
D6 = 6(68)/10; D6 = 40.8th age
2nd: Use the table to look where the 40.8th age belongs.
3rd: Use the formula to solve D6
$$D_6 = L_{D_6} + \frac{nN/10 - {<}F}{f_{D_6}}(i) = 36.5 + \frac{40.8 - 40}{14}(6) = 36.5 + 0.34$$
D6 = 36.84 years
c. P85
1st: Find the value of nN/100.
P85 = 85(68)/100; P85 = 57.8th age
2nd: Use the table to look where the 57.8th age belongs.
3rd: Use the formula to solve P85
$$P_{85} = L_{P_{85}} + \frac{nN/100 - {<}F}{f_{P_{85}}}(i) = 42.5 + \frac{57.8 - 54}{6}(6) = 42.5 + 3.8$$
P85 = 46.30 years
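Since the quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), one function can cover all three. The Python below is a sketch under the same assumptions as the worked example: whole-number class limits, a common interval of 6, and classes listed from lowest to highest. The function name is hypothetical.

```python
def grouped_quantile(classes, width, k, parts):
    """L + ((k * N / parts - <F) / f) * i for grouped data."""
    n = sum(f for _, _, f in classes)
    target = k * n / parts
    below = 0  # less than cumulative frequency under the current class
    for lower, upper, f in classes:
        if below + f >= target:
            return (lower - 0.5) + (target - below) / f * width
        below += f
    raise ValueError("target position is beyond the last class")


# (lower limit, upper limit, frequency) for the ages example
classes = [(19, 24, 12), (25, 30, 19), (31, 36, 9),
           (37, 42, 14), (43, 48, 6), (49, 54, 8)]

q3 = grouped_quantile(classes, 6, 3, 4)
d6 = grouped_quantile(classes, 6, 6, 10)
p85 = grouped_quantile(classes, 6, 85, 100)

print(round(q3, 2), round(d6, 2), round(p85, 2))
```

Calling the same function with k = 1 and parts = 2 gives the grouped median.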
MEASURES OF VARIABILITY
In summarizing a given set of data, sometimes the MCT
are not enough to give useful information. They have to be
supplemented by other measures of description, such as
the measures of variability, which indicate the extent to
which values in a distribution are spread out around a
central tendency.
MOST COMMONLY USED MEASURES OF VARIABILITY
1. RANGE. It is the difference between the highest and the
lowest data.
Formula: R = Highest Data - Lowest Data.
2. QUARTILE DEVIATION. This may be used to minimize the
effect of extremely low and high values on the measure of
spread.
Formula:
$$QD = \frac{Q_3 - Q_1}{2}$$
3. MEAN ABSOLUTE DEVIATION (MAD). It is the
average of the absolute deviations of the values in a
distribution from the mean.
Absolute values are the values of the numbers
irrespective of signs.
Formula:
$$MAD = \frac{\sum |x - Mn|}{N}$$
where x = given data
Example
1. Ms. A and Ms. B are applying for a secretarial position in a
well-known company in Gumaca, Quezon. They are required
to undergo a test to determine their typing ability. The
results of the test were as follows:
Ms. A 20, 22, 25, 25, 25, 28, 30
Ms. B 21, 23, 25, 25, 25, 27, 29
Who performed better?
Using the MCT, the mean, median and mode of Ms. A and
Ms. B are the same.
Ms. A: Mn = 25; Md = 25 and Mo = 25
Ms. B: Mn = 25; Md = 25 and Mo = 25
Measures of Variability
1. Range
RA = 30 - 20    RB = 29 - 21
RA = 10         RB = 8
2. QD
3. MAD
Typing Score (Ms A)   |x - Mn|   Typing Score (Ms B)   |x - Mn|
20                    5          21                    4
22                    3          23                    2
25                    0          25                    0
25                    0          25                    0
25                    0          25                    0
28                    3          27                    2
30                    5          29                    4
∑|x - Mn| = 16                   ∑|x - Mn| = 12
Formula:
$$MAD = \frac{\sum |x - Mn|}{N}$$
MAD (Ms. A) = 16/7 = 2.29    MAD (Ms. B) = 12/7 = 1.71
Since both applicants have the same mean, median and mode,
Ms. B, whose scores have the smaller range and MAD,
performed more consistently.
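The range and the mean absolute deviation of the two sets of typing scores can be computed with a few lines of Python. The function names are hypothetical; the data are the scores of Ms. A and Ms. B from the example.

```python
def value_range(data):
    """Highest data minus lowest data."""
    return max(data) - min(data)


def mean_absolute_deviation(data):
    """Average of the absolute deviations from the mean."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)


ms_a = [20, 22, 25, 25, 25, 28, 30]
ms_b = [21, 23, 25, 25, 25, 27, 29]

range_a, range_b = value_range(ms_a), value_range(ms_b)
mad_a, mad_b = mean_absolute_deviation(ms_a), mean_absolute_deviation(ms_b)

print(range_a, range_b)
print(round(mad_a, 2), round(mad_b, 2))
```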
HOMEWORK:
Below are the daily wages of selected employees (in peso) of a
factory in Lucena City.
450, 400, 325, 375, 475, 455, 440, 460, 465, 450, 425, 385, 390, 405
405, 425, 430, 435, 440, 495, 505, 510, 505, 510, 495, 400, 395, 390
380, 350, 355, 350, 360, 365, 375, 385, 395, 345, 385, 425, 400, 415
365, 385, 385, 390, 400, 400, 400, 485, 390, 365, 405, 450, 455, 465
Required:
1. Construct a CFDT using 7 TNC in descending order.
2. Compute the MCT (use 2 methods for mean).
3. Compute Q1, D4 and P90.

4. Find the MV
