


Quantitative Data Analysis




Dr. Abdullah Fadl

Main Headlines & Sub Sections:


Univariate analysis
Bivariate analysis
Multivariate analysis
The range of statistical tests is enormous, so only some of the most often used are mentioned
here. An important factor to be taken into account when selecting suitable statistical tests is the
number of cases about which you have data. Generally, statistical tests are more reliable the
greater the number of cases. Usually, more than about twenty cases are required to make any
sense of the analysis, though some tests are designed to work with fewer.


In order to manipulate the data, they should be compiled in an easily read form. Although the
data will have been organized as part of the collection process, further compilation may be
needed before analysis is possible.

The use of rows and columns on a spreadsheet is the most common technique. A row is given
to each record or case and a column is given to each variable, so that each cell contains the
data for one case/variable combination.
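
The row-and-column layout can be illustrated with a short Python sketch. The data, the column names and the variable names below are all hypothetical; the point is only that each row is a case, each column is a variable, and a cell is addressed by the pair.

```python
import csv
import io

# Hypothetical survey data: one row per case, one column per variable.
raw = """case_id,age,gender,satisfaction
1,34,F,4
2,27,M,3
3,45,F,5
"""

# Each row becomes a dictionary keyed by variable name.
rows = list(csv.DictReader(io.StringIO(raw)))

# A single cell: the value of the variable "age" for the second case.
cell = rows[1]["age"]

# A whole column: every case's value for one variable, converted to numbers.
ages = [int(r["age"]) for r in rows]

print(cell, ages)
```

Values read from a file arrive as text, so numeric variables usually need converting (as with `ages`) before any analysis.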


The two major classes of statistics are parametric and non-parametric statistics. You need to
understand the meaning of a parameter in order to appreciate the difference between these two
types. A parameter of a population (i.e. the things or people you are surveying) is a constant
feature that it shares with other populations. The most common one is the ‘bell’ or ‘Gaussian’
curve of normal frequency distribution (see the figure below).
This parameter reveals that most populations display a large number of more or less ‘average’
cases, with extreme cases tailing off at each end, although the shape of this curve varies from
case to case.

• There are two classes of parametric statistical tests: descriptive and inferential.

• Descriptive tests will reveal the ‘shape’ of the data in the sense of how the values of a
variable are distributed.

• Inferential tests will suggest (i.e. infer) results from a sample in relation to a population.

• Univariate analysis – analyses the qualities of one variable at a time. Only descriptive
tests can be used in this type of analysis.

• Bivariate analysis – considers the properties of two variables in relation to each other.
Inferences can be drawn from this type of analysis.

• Multivariate analysis – looks at the relationships between more than two variables.
Again, inferences can be drawn from results.


A range of properties of one variable can be examined using the following measures :

Frequency Distribution :

Usually presented as a table, a frequency distribution simply shows how often each value
of a variable occurs, expressed as a number and as a percentage of the total number of cases.
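
As a rough illustration, a frequency table of this kind can be built with a few lines of Python. The responses and the helper name `frequency_distribution` are hypothetical and only serve the example.

```python
from collections import Counter

# Hypothetical responses for one nominal variable (transport mode, 10 cases).
responses = ["bus", "car", "car", "walk", "bus",
             "car", "bike", "car", "bus", "walk"]

def frequency_distribution(values):
    """Return {value: (count, percentage of all cases)}, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return {v: (n, 100 * n / total) for v, n in counts.most_common()}

table = frequency_distribution(responses)

# Print the table: value, count, percentage of total cases.
for value, (count, pct) in table.items():
    print(f"{value:<6}{count:>3}{pct:>7.1f}%")
```

Here "car" occurs 4 times out of 10 cases, so its row shows a count of 4 and 40.0%.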

Measure of Central Tendency :

A measure of central tendency is a single number that denotes an ‘average’ of the values
for a variable. There are several measures that can be used, such as the arithmetic mean
(average), the median (the middle value when the values are ranked in order)
and the mode (the most frequently occurring value). In a normal distribution, the
mean, median and mode are located at the same value.
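
The three measures can be compared on a small data set. This is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical and deliberately include one extreme case.

```python
import statistics

# Hypothetical scores for one variable; 30 is an extreme case.
scores = [2, 3, 3, 4, 5, 5, 5, 6, 30]

mean = statistics.mean(scores)      # arithmetic mean: sum divided by count
median = statistics.median(scores)  # middle value of the ranked scores
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)
```

For these scores the mean is 7 while the median and mode are both 5. The single extreme case pulls the mean upward, which is one sign that the data are not normally distributed.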

Measures of Dispersion (or Variability) :

Dispersion can be expressed in several ways: range (the distance between the highest and
lowest value), interquartile range (the distance between the first and third quartiles, i.e.
the spread of the middle half of the values) and other more mathematical measures such as
standard deviation and standard error.
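
The measures listed above can be computed as follows. This is a sketch under stated assumptions: the sample is hypothetical, quartiles use the default method of `statistics.quantiles` (other software may use slightly different quartile conventions), and the standard deviation is the sample version with n - 1 in the denominator.

```python
import math
import statistics

# Hypothetical sample of seven values for one variable.
values = [4, 7, 8, 10, 12, 13, 16]

# Range: distance between the highest and lowest value.
value_range = max(values) - min(values)

# Interquartile range: distance between the first and third quartiles.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Sample standard deviation (divides by n - 1).
sd = statistics.stdev(values)

# Standard error of the mean: standard deviation over the square root of n.
se = sd / math.sqrt(len(values))

print(value_range, iqr, round(sd, 3), round(se, 3))
```
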
Graphical options to show and compare the measures :

• Bar graph – shows the distribution of nominal and ordinal variables.

• Pie chart – shows the values of a variable as a section of the total cases (like slices of a pie).

• Standard deviation error bar – this shows the mean value as a point and a bar above
and below that indicates the extent of one standard deviation.


• Bivariate analysis considers the properties of two variables in relation to each other.

• An important aspect is the different measurement of these relationships, such as
assessing the direction and degree of association, statistically termed correlation.

• Scattergrams are a useful type of diagram that graphically shows the relationship
between two variables by plotting variable data cases on a two-dimensional matrix.

• The closer the points are to a perfect line, the stronger the association.

• A line that is drawn to trace this notional line is called the line of best fit or regression
line. This line can be used to predict the value of one variable on the basis of the other.

• Cross tabulation (contingency tables) is a simple way to display the relationship
between variables that have only a few categories.
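
The ideas of correlation and the line of best fit can be sketched in Python. This example assumes Pearson's r as the correlation measure and ordinary least squares for the line, neither of which is specified in the notes; the data and the function names `pearson_r` and `best_fit_line` are hypothetical.

```python
import math

# Hypothetical paired observations: hours studied (x) and exam score (y).
x = [1, 2, 3, 4, 5]
y = [52, 55, 61, 64, 68]

def _sums(xs, ys):
    """Return the means and the sums of squares and cross-products."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return mx, my, sxy, sxx, syy

def pearson_r(xs, ys):
    """Direction (sign) and degree (magnitude) of linear association."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def best_fit_line(xs, ys):
    """Least-squares slope and intercept of the regression line."""
    mx, my, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx

r = pearson_r(x, y)
slope, intercept = best_fit_line(x, y)

# Use the line to predict y for a new x value.
predicted = intercept + slope * 6
print(round(r, 3), round(slope, 2), round(intercept, 2), round(predicted, 2))
```

With these numbers r is about 0.99, so the points lie very close to a straight line, and the fitted line (slope about 4.1, intercept about 47.7) predicts a score of roughly 72 for six hours of study.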

Statistical Significance :

To estimate the likelihood that the results are relevant to the population as a whole, one
has to use statistical inference. The most common statistical tool for this is known as the
chi-square test. This measures the degree of association or linkage between two
variables by comparing the differences between the observed values and the values
expected if there were no association.
Multivariate analysis looks at the relationships between more than two variables.


Elaboration analysis tests the effect of a third variable on the relationship between two variables.


Multiple regression is a technique used to measure the effects of two or more independent
variables on a single dependent variable measured on interval or ratio scales.
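
Measuring the effects of two independent variables on one dependent variable can be sketched in pure Python. The example assumes ordinary least squares solved through the normal equations; the data are hypothetical (generated exactly from y = 1 + 2*x1 + 3*x2 so the result is easy to check), and `solve` and `multiple_regression` are illustrative helper names, not part of any library.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def multiple_regression(predictors, y):
    """Least-squares fit; returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(p) for p in predictors]  # leading 1.0 = intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical cases: (x1, x2) pairs and a dependent variable built from them.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]

coefs = multiple_regression(predictors, y)
print([round(c, 6) for c in coefs])
```

Each coefficient estimates the change in the dependent variable for a one-unit change in that independent variable while the other is held fixed. In practice a statistics package would normally be used, since it also reports standard errors and significance levels.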


Logistic regression is a development of multiple regression that has the added advantage of
holding certain variables constant in order to assess the independent influence of key
variables of interest.


Non-parametric statistical tests are used when:

• the sample size is very small

• few assumptions can be made about the data
• data are rank ordered or nominal
• samples are taken from several different populations.

The levels of measurement of the variables, the number of samples, and whether the samples
are related or independent are all factors that determine which tests are appropriate.
