Module-1 1

MODULE 1.1: Introduction to
Statistics and Data Analysis
In the plural sense, STATISTICS means a set of numerical
facts and figures. We speak, for example, of “statistics on birth,
statistics on crime, statistics on unemployment, or statistics
on marriage.”

In the singular sense, STATISTICS is a body of knowledge
concerned with the collection, presentation, analysis, and
interpretation of data.
Descriptive Statistics: deals with the methods of
organizing, summarizing, and presenting a mass of data.

Inferential Statistics: concerned with making
generalizations about a body of data where only a part
of it is examined.
1. Universe - the set of all the individuals or entities under
consideration or study
2. Measurement - the assignment of numbers to objects,
events, or observations according to logically accepted rules
3. Random Variable - a characteristic of interest
measurable on each and every individual of the universe
4. Quantitative Variable - a variable which may take on
numerical values; observations vary in degree
5. Qualitative Variable - a variable which takes on non-
numerical values; observations vary in kind
6. Discrete Variable - observations take a finite or countable
number of values
7. Continuous Variable - observations have an infinite
number of values; a variable which can assume any
value in a given interval of values
8. Constant - observations do not vary
9. Population - the set of all possible values of a variable
10. Sample - a subset of the population
11. Parameter - a numerical characteristic of a population
12. Statistic - a quantity calculated from the observations
in a sample
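The difference between a parameter and a statistic can be illustrated with a short sketch. The scores below are made-up data generated only for illustration, and the names `population`, `sample`, `population_mean` and `sample_mean` are assumptions, not part of the module.

```python
import random
import statistics

# Hypothetical population: exam scores of 500 students (made-up data).
rng = random.Random(0)
population = [rng.randint(50, 100) for _ in range(500)]

# Parameter: a numerical characteristic of the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population (here, 50 values chosen at random).
sample = rng.sample(population, 50)

# Statistic: a quantity calculated from the observations in the sample.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two means will usually be close but not identical. Inferential statistics uses the sample value to generalize about the population value.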
Nominal: numbers are used as symbols to classify observations.
Mathematical Property: equivalence

Ordinal: observations can be ranked or ordered.
Mathematical Property: equivalence, > or <

Interval: the degree of difference between levels on the scale can be specified, although the
zero point is arbitrary.
Mathematical Property: equivalence, > or <, equality of distances

Ratio: ratios can be formed with levels of the scale because it has a true zero point.
Mathematical Property: equivalence, > or <, equality of distances, ratios can be formed
Sampling Procedures & Collection of Data
Sampling methods often use what are called
random numbers to select samples.
When the research problem and design were
identified, the process of collecting information is
There are two types of sampling techniques:
–Probability Sampling Technique
–Non-Probability Sampling Technique
Simple random sampling
In the simple random sampling technique, every
item in the population has an equal
chance of being selected in the sample.
Since item selection depends entirely on
chance, this method is known as the “Method
of Chance Selection”. When the sample size is
large and the items are chosen randomly, it is
also known as “Representative Sampling”.
Suppose we want to select a simple random
sample of 200 students from a school of 500. Here, we
can assign a number from 1 to 500 to every student in the school
database and use a random
number generator to select a sample of 200 numbers.
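The school example can be sketched in a few lines of Python. This is a minimal illustration, assuming students are identified by the numbers 1 to 500. The function name `simple_random_sample` and the `seed` argument are hypothetical and are used only to make the run repeatable.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct ID numbers from 1..population_size.

    Every ID has the same chance of being selected.
    """
    rng = random.Random(seed)
    ids = range(1, population_size + 1)
    # random.sample selects without replacement, so no ID repeats.
    return rng.sample(ids, sample_size)

selected = simple_random_sample(500, 200, seed=42)
print(len(selected))          # number of students selected
print(sorted(selected)[:10])  # a few of the selected ID numbers
```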
In the systematic sampling method, the items
are selected from the target population by
choosing a random starting point and then
selecting the other members at a fixed
sampling interval. The interval is calculated by dividing
the total population size by the desired
sample size.
Suppose the names of 300 students of a school are
sorted in reverse alphabetical order. To select a
systematic sample of 15 students, the sampling
interval is 300 / 15 = 20. We randomly select a
starting number, say 5, and from number 5 onwards
select every 20th person from the sorted list (5, 25, 45, and so on).
Finally, we end up with a sample of 15 students.
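A systematic sample can be sketched as follows. The sketch assumes a desired sample of 15 from the 300 names, so the interval is the population size divided by the desired sample size, 300 / 15 = 20. The function name `systematic_sample` and the placeholder names in `students` are hypothetical.

```python
def systematic_sample(items, sample_size, start):
    """Select every k-th item, where k = len(items) // sample_size.

    start is a 1-based position and must fall within the first interval.
    In practice it would be chosen at random, e.g. random.randint(1, k).
    """
    interval = len(items) // sample_size
    if not 1 <= start <= interval:
        raise ValueError("start must be between 1 and the sampling interval")
    # Convert the 1-based start to a 0-based index, then step by the interval.
    return items[start - 1::interval][:sample_size]

# Placeholder for the sorted list of 300 student names.
students = [f"student_{i}" for i in range(1, 301)]

chosen = systematic_sample(students, sample_size=15, start=5)
print(chosen[:3])   # positions 5, 25 and 45 in the sorted list
print(len(chosen))
```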
In a stratified sampling method, the total
population is divided into smaller groups (strata) to
complete the sampling process. The groups
are formed based on a few
characteristics in the population. After
separating the population into smaller
groups, the statisticians randomly select the
sample from each group.

For example, there are three bags (A, B and
C), each with different balls. Bag A has 50
balls, bag B has 100 balls, and bag C has 200
balls. We have to choose a sample of balls
from each bag proportionally: say 10% of each bag, that is, 5
balls from bag A, 10 balls from bag B and 20
balls from bag C.
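The bag example corresponds to proportional allocation: the same fraction is drawn at random from every group. Here is a minimal sketch, assuming a sampling fraction of 10%, which reproduces the 5, 10 and 20 balls above. The function name `proportional_stratified_sample` and the ball labels are hypothetical.

```python
import random

def proportional_stratified_sample(strata, fraction, seed=None):
    """Randomly draw the same fraction of items from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, items in strata.items():
        n = round(len(items) * fraction)      # stratum sample size
        sample[name] = rng.sample(items, n)   # random draw within the stratum
    return sample

# Three strata (bags) with 50, 100 and 200 labelled balls.
bags = {
    "A": [f"A{i}" for i in range(50)],
    "B": [f"B{i}" for i in range(100)],
    "C": [f"C{i}" for i in range(200)],
}

picked = proportional_stratified_sample(bags, fraction=0.10, seed=1)
sizes = {name: len(balls) for name, balls in picked.items()}
print(sizes)  # {'A': 5, 'B': 10, 'C': 20}
```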
In the cluster sampling method, clusters or
groups of people are formed from the population
set. The groups have similar significant
characteristics, and each has an equal chance of
being a part of the sample. This method applies
simple random sampling to the clusters of the
population.
An educational institution has ten branches across the country
with almost the same number of students in each. If we want to collect some
data regarding facilities and other things, we can’t travel to
every branch to collect the required data. Hence, we can use
random sampling to select three or four branches as clusters.
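The branch example can be sketched as a random draw of whole clusters rather than of individual students. This is an illustration only: the branch names, the 100 students per branch, and the function name `cluster_sample` are assumptions.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters; every member of a chosen cluster is kept."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen_names}

# Ten hypothetical branches, each with 100 students.
branches = {
    f"branch_{i}": [f"b{i}_student_{j}" for j in range(1, 101)]
    for i in range(1, 11)
}

selected = cluster_sample(branches, n_clusters=3, seed=7)
print(sorted(selected))                        # the three branches to visit
print(sum(len(s) for s in selected.values()))  # students covered by the visit
```

Unlike stratified sampling, which draws some members from every group, cluster sampling surveys only the selected groups.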
All these four methods can be understood in a better manner
with the help of the figure given below. The figure contains
various examples of how samples will be taken from the
population using different techniques.
In non-probability sampling, not every member of the
population has an equal chance of being selected. It can rely on the
subjective judgment of the researcher.
This method of sampling is resorted to when it is difficult to
estimate the population of the study because its members are mobile or
transient in a given location. This method is also useful in
exploratory or descriptive studies with a qualitative implication. Its
purpose is to characterize the direction of the study, rather than to
quantify it.

Thus, the respondents are chosen on the basis of specific
criteria formulated by the researcher, rather than randomly selected.

Convenience sampling is implemented by seeking out elements who
are readily available to respond to a question. In other words,
the first person who comes along who typifies the candidate
serves as the respondent of the study.
Purposive sampling is a useful methodology in qualitative
or exploratory studies, since the qualities of the key persons or
informants are identified by the researcher. The purpose is to get
information from respondents who are actually involved in
the situation.
Quota sampling entails grouping elements
according to certain characteristics and ensuring that
each group is represented. This is similar to stratified
sampling minus randomization.
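Quota sampling can be sketched as filling a fixed quota for each group with whoever comes first, with no random draw. The respondents, group labels, quotas and the function name `quota_sample` below are hypothetical and serve only to illustrate the idea.

```python
def quota_sample(respondents, key, quotas):
    """Fill each group's quota with the first respondents encountered.

    No randomization is involved: selection follows the order of arrival.
    """
    counts = {group: 0 for group in quotas}
    sample = []
    for person in respondents:
        group = key(person)
        if group in counts and counts[group] < quotas[group]:
            sample.append(person)
            counts[group] += 1
    return sample

# Hypothetical respondents, listed in order of arrival as (name, year level).
arrivals = [
    ("Ana", "freshman"), ("Ben", "senior"), ("Cai", "freshman"),
    ("Dee", "freshman"), ("Eli", "senior"), ("Fay", "senior"),
]

picked = quota_sample(arrivals, key=lambda p: p[1],
                      quotas={"freshman": 2, "senior": 2})
print([name for name, _ in picked])  # ['Ana', 'Ben', 'Cai', 'Eli']
```

Each group is represented, but late arrivals such as Dee and Fay have no chance of selection. That is what separates quota sampling from stratified sampling.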
Snowball sampling involves having
respondents refer other people who
are in a position to answer some of
the questions of the researcher.
This sampling technique is
useful for highly sensitive topics
where the identity of respondents is
difficult to divulge. Hence, for
topics that are highly confidential,
a referral system is appropriate.
