CHP1 Mat161
1. WHAT IS STATISTICS?
Statistics is the science of planning studies and experiments for collecting, displaying, analyzing
and interpreting data, and making decisions based on the data. Statistics courses generally cover
all of these aspects.
3. TYPES OF MEASUREMENT
Measurement consists of rules for assigning numbers to attributes of objects. By definition, any
set of rules for assigning numbers to attributes of objects is measurement.
Variables differ in how well they can be measured, i.e., in how much information their
measurement scale can provide. A factor that determines the amount of information that can be
provided by a variable is the type of measurement scale used.
The level of measurement of the data helps us to decide the appropriate statistical procedure to
use. There are four levels in the measurement scale of variables:
(i) nominal, (ii) ordinal, (iii) interval and (iv) ratio.
Nominal Scale
Nominal variables allow only for qualitative classification, i.e. they can be measured only in
terms of whether the individual items belong to some distinct categories. We can neither quantify
nor rank-order the categories.
The nominal level of measurement is characterized by data that consist of names, labels or
categories only. Typical examples of nominal variables are gender, race, color and city. Take
the variable "race": we can only say that two individuals in different categories are of a
different race; we cannot say which category has more of the quality represented by the variable.
A nominal scale is a measurement system that does not possess the property of magnitude. It
lacks numerical significance and therefore should not be used for calculations.
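As an illustration of what can and cannot be done with nominal data, here is a minimal Python sketch. The list of cities is made up for the example and is not data from this course. Counting the observations in each category and finding the most frequent category are meaningful; arithmetic on arbitrary numeric codes is not.

```python
from collections import Counter

# Hypothetical nominal observations: the city each respondent lives in.
cities = ["Ipoh", "Kuching", "Ipoh", "Melaka", "Kuching", "Ipoh"]

# Counting how many observations fall into each category is meaningful.
counts = Counter(cities)

# The most frequent category (the mode) is also meaningful.
mode_city, mode_count = counts.most_common(1)[0]
print(mode_city, mode_count)

# Numeric codes for the categories are only labels. Any other assignment
# would do equally well, so averaging the codes has no interpretation.
codes = {"Ipoh": 1, "Kuching": 2, "Melaka": 3}
coded = [codes[city] for city in cities]
```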
Ordinal Scales
Data at the ordinal level of measurement can be arranged in some order, in terms of which has less
and which has more of the quality represented by the variable. However, we still cannot tell how
much more or how much less one category has than another.
An example of an ordinal variable is the socioeconomic status of families; we know that upper-
middle is higher than middle but we cannot say how much higher.
Rank ordering people in a classroom according to height and assigning the shortest person the
number "1", the next shortest person the number "2", etc. is another example of an ordinal scale.
Ordinal data allow us to make relative comparisons, but they do not provide the magnitude of the
differences. As with the nominal scale, computing most numerical statistics is not appropriate
when the scale type is ordinal.
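The height-ranking example can be sketched in Python as follows. The names and heights are hypothetical. The sketch shows that neighbouring ranks always differ by 1, while the underlying height gaps vary, which is exactly the information an ordinal scale discards.

```python
# Hypothetical heights in cm for five students.
heights = {"Ali": 172.0, "Bea": 158.5, "Chen": 181.0, "Dina": 160.0, "Eli": 165.5}

# Order the names from shortest to tallest, then assign ranks 1, 2, 3, ...
ordered = sorted(heights, key=heights.get)
ranks = {name: position for position, name in enumerate(ordered, start=1)}

# Actual height gaps between neighbours in the ordering.
# The ranks alone cannot recover these values.
gaps = [heights[b] - heights[a] for a, b in zip(ordered, ordered[1:])]

print(ranks)
print(gaps)
```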
Interval Scales
With interval data, we can rank-order the measured items and also quantify and compare the
sizes of the differences between them, i.e. the data possess the properties of magnitude and
equal intervals. Thus, it is appropriate to compute numerical statistics for data measured at
the interval level.
However, like the nominal and ordinal levels, interval data do not have a rational (natural) zero
starting point.
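Ratio Scales
Ratio data have all the properties of interval data and, in addition, a natural zero starting
point, so ratios of values are meaningful.
For example, A's height of 180 cm is 1.2 times B's height of 150 cm, and 0 cm represents no
height.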
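The height comparison above can be checked with a few lines of Python. As a contrast, the sketch also includes a temperature example, which is an assumption added for illustration and does not come from these notes. Because 0 cm is a true zero, the ratio of two heights is the same in any unit. Because 0 degrees Celsius is an arbitrary zero, the "ratio" of two temperatures changes when the unit changes.

```python
# Ratio scale: heights from the example (0 cm means no height).
height_a_cm = 180.0
height_b_cm = 150.0
ratio_cm = height_a_cm / height_b_cm

# Converting both heights to inches leaves the ratio unchanged.
CM_PER_INCH = 2.54
ratio_in = (height_a_cm / CM_PER_INCH) / (height_b_cm / CM_PER_INCH)


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


# Interval scale (assumed example): 20 C versus 10 C.
ratio_celsius = 20.0 / 10.0
ratio_fahrenheit = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

print(ratio_cm, ratio_in)                # same ratio in both units
print(ratio_celsius, ratio_fahrenheit)   # the ratio depends on the unit
```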
4. TYPES OF VARIABLE
A variable is a characteristic of interest about individuals in a population that takes on different
values for different individuals. The gender and weight of a person are examples of variables
because their values vary from one individual to another.
There are two broad types of variables: qualitative (categorical, attribute) and quantitative
(numerical).
Qualitative Variable
Values of a qualitative variable are data that arise from observations that are separated into
distinct categories. Such data are discrete in nature and there are a finite number of possible
categories into which each observation may fall.
Qualitative data are classified as nominal if there is no natural order between the categories
(e.g. eye colour), or ordinal if an ordering exists (e.g. grades of examination results,
socio-economic status).
Quantitative Variable
Quantitative or numerical data arise when the observations are counts or measurements. The data
are said to be:
(i) discrete if the measurements are integers (e.g. number of people in a household, number
of cigarettes smoked per day). When pictured on the number line, the set of all the
possible values consists only of isolated points.
(ii) continuous if the measurements can take on any value, usually within some range (e.g.
weight). When pictured on the number line, the set of all values consists of intervals.
5. BASIC TERMS
Population and Sample
A population is the entire collection of individuals or measurements about which information is
desired.
A sample is a subset of the population selected for study. A random sample of size n from a
population is a subset of n elements from that population, chosen in such a way that every
possible subset of size n has the same chance of being selected as any other.
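A simple random sample of this kind can be drawn with Python's standard library. This is a minimal sketch: the population of 20 numbered individuals, the sample size and the seed are all made up for the example. `random.sample` selects without replacement, so each subset of size n is equally likely to be chosen.

```python
import random

# Hypothetical population: 20 individuals labelled 1 to 20.
population = list(range(1, 21))
n = 5

# A fixed seed is used only so that the sketch is reproducible.
rng = random.Random(161)

# Draw n distinct individuals; every subset of size n has the same chance.
sample = rng.sample(population, n)
print(sample)
```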
A statistic is a number calculated to describe an important feature of sample data. Statistics are
used as estimates for their corresponding, unknown population parameters.
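The relationship between a statistic and a parameter can be illustrated with a small simulation. Everything in this sketch is hypothetical: the simulated population of 1000 weights, its mean and spread, and the sample size of 50. In practice the population mean is unknown, and only the sample mean can be computed.

```python
import random
import statistics

# A fixed seed is used only so that the sketch is reproducible.
rng = random.Random(161)

# Hypothetical population: 1000 simulated weights in kg.
population = [rng.gauss(65.0, 10.0) for _ in range(1000)]

# Parameter: the population mean (normally unknown in a real study).
population_mean = statistics.mean(population)

# Statistic: the mean of a random sample, used to estimate the parameter.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(round(population_mean, 2), round(sample_mean, 2))
```

Drawing a different sample would give a different sample mean, whereas the population mean stays fixed.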
The following are several parameters of importance in statistical analyses. Greek symbols are
usually used to represent parameters, and the symbols for the associated statistic are given on the
right.
In this course, we will concentrate on univariate and bivariate data sets only.