Handout-A-Preliminaries (Advance Statistics)
TELLERMO-Course Professor
Used to mean numerical facts presented in forms
such as tables or graphs.
a calculation on a collection of numerical values.
a methodology for arranging data in a format
useful for decision making.
Statistical procedures can be very analytical and
use theoretical information from probability
functions to make decisions where randomness
plays a part in observed outcomes.
experimental analysis in the traditional sciences and
social sciences.
quality control.
forecasting for the purpose of planning (business,
government, etc.)
statistical reports of business activities
estimating
testing
any procedure which relies on sampling
simulation and experimentation
useful for communicating information through
statistical data presentation techniques.
useful in understanding the techniques based on
sampling which are used by decision-makers in
your field of study, your workplace, and the
world around you and to apply them yourselves.
To be technically literate in a complex technical
world, a person should understand the meaning
of the statistical measures on which decisions
are based.
-is concerned with collecting, organizing, summarizing,
analyzing and interpreting data for the purpose of
making accurate decisions in the face of uncertainty.
Descriptive Statistics
-involves methods of organizing, picturing and
summarizing information from samples/population.
Inferential Statistics
-involves methods of using information from a
sample to draw conclusions regarding the population.
Population
A collection of all possible individuals, objects, or
measurements of interest.
Sample
A selection of some of the objects from the population.
Measurements/observations from part of the population.
Population parameter
A numerical measure that describes an aspect of a population.
Sample statistic
A numerical measure that describes an aspect of a sample.
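The distinction between a population parameter and a sample statistic can be sketched in a few lines of Python. The scores below are hypothetical values invented for illustration, and the fixed seed is only there to make the run repeatable; neither comes from the handout.

```python
import random
from statistics import mean

# Hypothetical population: test scores of 20 students (made-up values).
population = [72, 85, 90, 66, 78, 88, 95, 70, 81, 77,
              69, 84, 92, 73, 80, 86, 75, 79, 91, 68]

# Population parameter: a measure computed from every member of the population.
population_mean = mean(population)

# Sample: a selection of some members of the population.
rng = random.Random(0)  # seed chosen arbitrarily so the run is repeatable
sample = rng.sample(population, 5)

# Sample statistic: the same measure computed from the sample only.
sample_mean = mean(sample)

print("population mean (parameter):", population_mean)
print("sample mean (statistic):   ", sample_mean)
```

Changing the seed changes the sample, so the statistic varies from run to run while the parameter stays fixed. Inferential statistics uses the former to draw conclusions about the latter.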
-refers to the kinds of information researchers
obtain on the subjects of their research.
Qualitative Data (usually non-numerical labels or categories called attributes) -
data are referred to as qualitative when the observations are arrived at by
classifying according to category or description.
Example:
The religious denomination of community members is to be recorded and analyzed.
(Example Data: “Christians”, “Muslim”, ...)
The ranks of military personnel at an armed forces base are recorded. (Example Data:
“Corporal”, “Sergeant”, “Corporal”, ...)
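Because qualitative observations are categories rather than numbers, a natural first step is to count how often each category occurs. The sketch below tallies only the three rank values listed in the example above. It assumes nothing about the observations the example leaves out.

```python
from collections import Counter

# Only the three observations listed in the example are used here.
ranks = ["Corporal", "Sergeant", "Corporal"]

# Classify each observation by category and count the occurrences.
tally = Counter(ranks)

for category, count in tally.items():
    print(category, count)
```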
Disadvantages:
- The mean does not supply information about the
homogeneity of the group. The more heterogeneous the
set of observations or group of individuals is, the less
satisfactory the mean is as a measure of central
tendency.
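A small sketch of this disadvantage, using two hypothetical groups invented for illustration: both have the same mean, yet one is tightly clustered and the other is widely spread.

```python
from statistics import mean

# Hypothetical scores (made-up values, not from the handout).
homogeneous = [49, 50, 50, 50, 51]
heterogeneous = [10, 30, 50, 70, 90]

mean_a = mean(homogeneous)
mean_b = mean(heterogeneous)

# Spread measured as largest value minus smallest value.
spread_a = max(homogeneous) - min(homogeneous)
spread_b = max(heterogeneous) - min(heterogeneous)

print("means:  ", mean_a, mean_b)
print("spreads:", spread_a, spread_b)
```

The two means are identical, so the mean alone cannot tell the groups apart. In the second group it is also a less typical value for any one individual.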
• The median is determined by sorting the data set
from lowest to highest values and taking the data
point in the middle of the sequence. There is an
equal number of points above and below the median.
• The median is that value in the distribution such that
half of the observations are less than this value and
half are greater than this value.
• For example, in the data set {1,2,3,4,5} the median is
3; there are two data points greater than this value
and two data points less than this value. In this case,
the median is equal to the mean.
Determining the Position of the Median:
$\text{Position} = \frac{N + 1}{2}$
Example:
4, 6, 7, 8, 10, 12, 15, 21
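The position rule (N + 1) / 2 can be applied to the eight values above as a short sketch. The function names are illustrative. Averaging the two middle values when the position is not a whole number is the usual convention and is assumed here, since the handout does not state it.

```python
def median_position(n):
    """1-based position of the median in a sorted list of n values."""
    return (n + 1) / 2

def find_median(values):
    """Median via the position rule; averages the two middle values for even n."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)              # sort from lowest to highest
    position = median_position(len(ordered))
    lower = ordered[int(position) - 1]    # value at the whole-number part of the position
    if position == int(position):
        return lower                      # odd n: the position lands on one value
    upper = ordered[int(position)]        # even n: the next value up
    return (lower + upper) / 2

data = [4, 6, 7, 8, 10, 12, 15, 21]
print(median_position(len(data)))  # 4.5, halfway between the 4th and 5th values
print(find_median(data))           # 9.0, the average of 8 and 10
```

For the earlier data set {1,2,3,4,5} the position is 3, and the function returns the third value, 3.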
Advantages:
- It is the best measure of central tendency when the
distribution is irregular or skewed. It can be located in an
open-ended distribution or when the data are incomplete.
Disadvantages:
- It has a larger possible error than the mean. It
does not lend itself to algebraic treatment.
• The mode is that value of the variable that
occurs the most often. Since it occurs the most
often, it is the x value with the greatest
frequency on the frequency polygon.
• The mode is the most frequently occurring
value in the data set. For example, in the data
set {1,2,3,4,4}, the mode is equal to 4. A data
set can have more than a single mode, in
which case it is multimodal. In the data set
{1,1,2,3,3} there are two modes: 1 and 3.
• The mode may not be unique since a
distribution may have more than one mode.
• There is no calculation required to find the
mode since it is obtained by inspection of the
data.
• For data measured at the nominal level, it is
the only average that can be found.
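The points above can be sketched with a frequency count. The helper name `modes` is illustrative. It returns every value that shares the highest frequency, so multimodal data sets and nominal data are both handled.

```python
from collections import Counter

def modes(values):
    """Return all values that occur with the greatest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

print(modes([1, 2, 3, 4, 4]))   # [4]
print(modes([1, 1, 2, 3, 3]))   # [1, 3], a multimodal data set

# Nominal data: no mean or median exists, but a mode can still be found.
print(modes(["Corporal", "Sergeant", "Corporal"]))  # ['Corporal']
```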
Advantages:
- The mode is always an actual value from the data, and it is simple
to approximate by observation.
Disadvantages:
- The mode is not applicable to a small number of
cases in which no value is repeated.
The range is obtained by computing the
difference between the largest observed value
of the variable in a data set and the smallest
one.
Range = Max − Min
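As a brief sketch, the range can be computed directly from the largest and smallest values. The data reuse the eight values from the median example earlier in this handout, and the function name is illustrative.

```python
def value_range(values):
    """Range = largest observed value minus smallest observed value."""
    return max(values) - min(values)

data = [4, 6, 7, 8, 10, 12, 15, 21]
print(value_range(data))  # 21 - 4 = 17
```

Only the two extreme values enter the calculation, so a single unusually large or small observation changes the range.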
▪ Example 1.
$sd = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
Compute the standard deviation of the
scores of 10 randomly selected students in a
test: