
CHAPTER 11

QUANTITATIVE DATA ANALYSIS AND INTERPRETATION
Research Methodology: Tools, Methods and Techniques

Sundram, V.P.K., Chandran, V.G.R., Atikah, S.B., Rohani, M., Nazura, M.S., Akmal, A.O., & Krishnasamy, T.
Learning Objectives
After completing this chapter, you should be able to:
• Understand the importance of editing the collected raw data to detect errors and omissions
• Set up the coding key for the data set and code the data
• Categorize data and create data files
• Get a ‘feel’ for the data
• Test the goodness of data
• Understand the use of content analysis to interpret and summarize open questions
• Understand the problems and solutions for “don’t know” responses
• Understand the options for data entry and manipulation
• Interpret the computer results and prepare recommendations based on the quantitative data analysis
Table of Contents
11.1 DATA DIAGNOSIS AND TREATMENT
11.2 APPROPRIATE STATISTICAL ANALYSIS
11.3 INTERPRETING SELECTED DATA ANALYSIS


11.1 DATA DIAGNOSIS AND TREATMENT


11.1.1 Missing Data
• Missing data are a serious problem in multivariate analysis, and it is important to have some tools for dealing with them.
• A single missing value for a variable can cause either the variable or the case to be excluded.
• When dealing with missing data, you may leave the cell blank or assign missing value codes. If you choose the latter, a number of rules apply:
  • Missing value codes must be of the same data type as the data they represent.
  • Missing value codes cannot occur as valid data in the data set.
  • By convention, the code is usually built from the digit 9 (for example, 9 or 99).
Dealing With Missing Data
• Case substitution – A case with missing data is replaced by another case that is not yet in the sample, for example replacing a sampled country with another country not yet included in the sample.
• Mean substitution – Missing data points are replaced with the mean value of the variable, computed from the available cases.
• Cold deck substitution – The missing value is replaced by a constant value from an external source (for example, from a previous survey).
• Regression substitution – The missing value is predicted from its relationship with other variables. This is the best method if you have strong relationships and a moderate amount of missing data.
• Multiple methods – This is a composite estimation based on several methods. For example, if you have multiple linear relationships, you could estimate the variable value from regressions on many different variables and take the mean of the estimates.
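To make mean substitution concrete, here is a minimal Python sketch. The missing-value code 9, the function name `mean_substitute` and the sample responses are all illustrative assumptions, not taken from the chapter; the code only shows the general idea of filling coded gaps with the mean of the available cases.

```python
from statistics import mean

MISSING = 9  # hypothetical missing-value code; it must never be a valid response


def mean_substitute(values, missing_code=MISSING):
    """Replace every missing code with the mean of the available cases."""
    available = [v for v in values if v != missing_code]
    if not available:
        raise ValueError("no available cases to compute a mean from")
    fill = mean(available)
    return [fill if v == missing_code else v for v in values]


# Hypothetical 5-point Likert responses; 9 marks a missing answer.
responses = [4, 5, 9, 3, 4, 9, 2]
cleaned = mean_substitute(responses)
print(cleaned)
```

Note that mean substitution keeps the variable's mean unchanged but shrinks its variance, which is one reason the other methods listed above are sometimes preferred.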
11.1.2 Outliers
• An outlier is a value that lies outside the normal range of the data.
• In a box-and-whisker plot, the data values of outliers are plotted individually, and identifiers may be provided for interesting values.
• Box-and-whisker plots are particularly useful for comparing group categories (e.g., men versus women) or several variables (e.g., relative importance levels of product attributes).


Boxplot Components (figure): outside value or outlier; smallest observed value of lower hinge; whiskers; median; largest observed value of upper hinge; outside value or outlier.
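The boxplot idea can be sketched numerically. The example below flags values that fall more than 1.5 interquartile ranges beyond the quartiles. The 1.5 multiplier is the common Tukey convention and is an assumption here (the slides do not state a cut-off), quartiles are used as a stand-in for hinges, and the data are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(data, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


# Hypothetical scores with one suspicious value.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 45]
outliers = iqr_outliers(scores)
print(outliers)
```

Values flagged this way still need a judgment call: an outlier may be a data-entry error or a genuine but unusual case.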



11.1.3 Normality Tests
• The assumption of normality is a prerequisite for many inferential statistical techniques.
• There are a number of different ways to explore this assumption, graphically and numerically:
  • Histogram
  • Stem-and-leaf plot
  • Box plot
  • Normal probability plot
  • Detrended normal plot
  • Skewness
  • Kurtosis
  • Kolmogorov-Smirnov statistic, with a Lilliefors significance level, and the Shapiro-Wilk statistic
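Skewness and kurtosis from the list above can be computed directly from central moments. The sketch below uses the simple moment-based definitions (skewness = m3 / m2^1.5, excess kurtosis = m4 / m2^2 - 3); this choice of formula is an assumption, and statistical packages often report bias-adjusted versions, so their output may differ slightly. The data are hypothetical.

```python
from statistics import fmean


def central_moment(data, order):
    """Average of (x - mean) ** order over the data."""
    m = fmean(data)
    return fmean([(x - m) ** order for x in data])


def skewness(data):
    """Moment-based skewness; 0 for a perfectly symmetric distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


sample = [1, 2, 3, 4, 5]  # hypothetical, perfectly symmetric data
print(skewness(sample), excess_kurtosis(sample))
```

Values of both statistics close to zero are consistent with normality; large positive or negative values suggest that the distribution departs from it.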
11.1.4 Feel of Data
• Mean – The mean of a quantitative variable is defined as the sum of all entries divided by their number.
• Median – This is another measure of central tendency for quantitative variables. It is defined as the value that sits right in the middle of all data entries when they are listed in ascending order.
• Standard deviation – This is the most powerful measure of dispersion for quantitative data. It permits very sophisticated descriptions of various distributions.
• Variance – The square of the standard deviation.
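As a quick illustration of getting a feel for the data, the four measures can be computed with Python's standard `statistics` module. The data are hypothetical, and the sample versions of the standard deviation and variance (n - 1 denominator) are used here as an assumption, since the slides do not say which version is intended.

```python
from statistics import mean, median, stdev, variance

# Hypothetical scores for one quantitative variable.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

avg = mean(scores)      # sum of all entries divided by their number
mid = median(scores)    # middle value of the sorted entries
sd = stdev(scores)      # sample standard deviation
var = variance(scores)  # sample variance, the square of the standard deviation

print(f"mean={avg}, median={mid}, sd={sd:.3f}, variance={var:.3f}")
```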
11.1.5 Goodness of Fit
• Reliability – established by testing for both consistency and stability.
  • Consistency indicates how well the items measuring a concept hang together as a set. Another measure of consistency reliability used in specific situations is the split-half reliability coefficient.
  • The stability of a measure can be assessed through:
    • parallel-form reliability – established when a high correlation is obtained between two similar forms of a measure
    • test-retest reliability – established when a group of people (preferably 30 or more) complete the questionnaire twice, with a reasonable time period (e.g. a week) between the completions, and the two sets of scores are highly correlated.
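The split-half coefficient mentioned above can be sketched as follows. The odd/even item split and the Spearman-Brown step-up (2r / (1 + r)) are common choices assumed for this example rather than taken from the slides, and the item scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """Correlate odd-item and even-item totals, then apply Spearman-Brown."""
    half_a = [sum(row[0::2]) for row in item_scores]  # items 1, 3, ...
    half_b = [sum(row[1::2]) for row in item_scores]  # items 2, 4, ...
    r_half = pearson_r(half_a, half_b)
    return r_half, 2 * r_half / (1 + r_half)


# Hypothetical data: 6 respondents answering 4 items on a 5-point scale.
answers = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
]
r_half, r_full = split_half_reliability(answers)
print(f"half-test r = {r_half:.3f}, stepped-up reliability = {r_full:.3f}")
```

The same `pearson_r` helper could be applied to two administrations of a questionnaire to estimate test-retest reliability.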
• Validity
  • Factorial validity – established by submitting the data for factor analysis. The results of factor analysis (a multivariate technique) will confirm whether or not the theorized dimensions emerge.
  • Criterion-related validity – established by testing for the power of the measure to differentiate individuals who are known to be different.
  • Convergent validity – established when there is a high degree of correlation between two different sources responding to the same measure.
    Example: Both supervisors and subordinates respond in a similar way to a perceived reward system measure administered to them.
  • Discriminant validity – established when two distinctly different concepts are not correlated with each other.
    Example: Courage and honesty; leadership and motivation; attitudes and behaviour.


11.2 APPROPRIATE STATISTICAL ANALYSIS


11.2.1 Parametric
• A t-test is used to determine whether a set or sets of scores are from the same population.
• Three main types of t-test may be applied:
  • One sample
  • Independent groups
  • Repeated measures


11.2.2 Assumption Testing
• Each statistical test has certain assumptions that must be met prior to analysis.
• These assumptions need to be evaluated, because the accuracy of test interpretation depends on whether assumptions have been violated.
• The generic assumptions underlying all types of t-test are:
  1. Scale of measurement – the data should be at the interval or ratio level of measurement.
  2. Random sampling – the scores should be randomly sampled from the population of interest.
  3. Normality – the scores should be normally distributed in the population.

One-sample t-test
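A minimal sketch of the one-sample t statistic, t = (sample mean - hypothesized mean) / (s / sqrt(n)), is given below. The scores and the hypothesized mean of 5.0 are hypothetical. The code computes only the statistic and its degrees of freedom; the p-value would normally come from a statistics package (for example `scipy.stats.ttest_1samp`).

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    standard_error = stdev(scores) / sqrt(n)  # sample SD uses n - 1
    t = (mean(scores) - mu0) / standard_error
    return t, n - 1


# Hypothetical scores tested against a hypothesized population mean of 5.0.
scores = [5.1, 4.9, 5.6, 5.8, 5.3, 5.5]
t_stat, df = one_sample_t(scores, mu0=5.0)
print(f"t = {t_stat:.3f}, df = {df}")
```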


One-way ANOVA
• Independence of groups
• Homogeneity of variance
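For orientation, the F ratio of a one-way ANOVA can be computed by hand as the between-groups mean square divided by the within-groups mean square. The sketch below assumes independent groups, as listed above, and uses hypothetical scores; it does not test homogeneity of variance and does not produce a p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    k = len(groups)
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, len(all_scores) - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical scores for three independent groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```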



11.2.3 Non-parametric Test
• Wilcoxon – The test is used when you would otherwise use a repeated measures or paired t-test, that is, when the same participants perform under each level of the independent variable.
• Friedman – The test is used to compare two or more related samples, and is the non-parametric counterpart of the repeated measures or within-subjects ANOVA.
• Mann-Whitney – It tests the hypothesis that two independent samples come from populations having the same distribution. This test is the non-parametric counterpart of the independent groups t-test.
• Kruskal-Wallis – The test is the non-parametric counterpart of the one-way between-groups ANOVA and thus allows us to examine possible differences between two or more groups.
• Spearman rho – Spearman’s rho is a non-parametric alternative to the parametric bivariate correlation (Pearson’s r).
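Spearman's rho can be understood as Pearson's r computed on ranks instead of raw scores. The sketch below follows that definition, giving tied values their average rank; the helper names and the data are illustrative assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # 1-based average rank of the tied run
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y rises with x, but not in a straight line.
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, a relationship that is monotonic but curved still yields a rho of 1, which is why the measure suits ordinal or non-normal data.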



11.3 INTERPRETING SELECTED DATA ANALYSIS


11.3.1 Interpretation of Descriptive Analysis
• Descriptive statistics are used to describe, examine and summarize the main features of collected data quantitatively.
• Model case 1

The table below presents the descriptive statistics from a study of the relationship between economic fundamentals and the money supply.

Table 1: Results of Descriptive Analysis (Money Supply)

Variable                Mean      Std Dev  Max       Min
Money Supply            38180.5   1580     43235.5   33862.7
Inflation (CPI)         101.2     3.5      112.5     88.9
Government Debt         11264.58  1871     11578.86  9435.17
National Income (GDP)   20359.8   2002     21802.6   19189.3

(i) Mean
• The mean measures central tendency and is the arithmetic average of the scores. To compute the mean, all the values are added up and divided by the number of values.
• Money supply has a maximum of RM43,235.50 and a minimum of RM33,862.70, with a mean of RM38,180.50.
• Inflation (CPI) has a maximum score of 112.5% and a minimum score of 88.9%, with an average score of 101.2%.
• Government debt ranges from RM9,435.17 to RM11,578.86, with a mean of RM11,264.58.
• National income has a minimum of RM19,189.30 and a maximum of RM21,802.60, with a mean of RM20,359.80.
(ii) Standard deviation
• The standard deviation is the square root of the variance and provides an index of variability in the distribution of scores.
• The standard deviations for money supply, inflation, government debt, and national income are RM1,580, 3.5%, RM1,871, and RM2,002 respectively.
• For the inflation variable, the standard deviation is 3.46% of the mean (3.5/101.2), which can be considered small. For the government debt variable, on the other hand, the standard deviation is 16.61% of the mean (1871/11264.58), which is perceived as a large deviation.
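The ratio used above (standard deviation as a percentage of the mean, often called the coefficient of variation) can be reproduced for every variable in Table 1. The means and standard deviations below are copied from the table; the function name `relative_sd` is an illustrative assumption.

```python
# (mean, standard deviation) pairs copied from Table 1.
table1 = {
    "Money Supply": (38180.5, 1580),
    "Inflation (CPI)": (101.2, 3.5),
    "Government Debt": (11264.58, 1871),
    "National Income (GDP)": (20359.8, 2002),
}


def relative_sd(mean_value, sd):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * sd / mean_value


relative = {name: relative_sd(m, s) for name, (m, s) in table1.items()}
for name, pct in relative.items():
    print(f"{name}: {pct:.2f}% of the mean")
```

Expressing each deviation relative to its mean makes variables measured on different scales comparable, which is what allows the inflation and government debt deviations to be called small and large respectively.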



11.3.2 Interpretation of Correlation Analysis
• Correlation analysis determines whether, and to what degree, a relationship exists between two or more quantifiable variables.
• For example, it is used to measure the strength of the relationship between the dependent and independent variables.
• Model case 1
The table below shows the correlation coefficients for the variables average employee salary, total expenditure and number of employees.

Table 2: Results of Correlation Analysis (Firm’s Employees in SME Malaysia)

                                              Average Employee Salary  Total expenditure  Number of Employee
Average Employee Salary  Pearson Correlation  1                        0.539**            0.293*
                         Sig (2 tailed)                                0.000              0.034
Total expenditure        Pearson Correlation  0.539**                  1                  0.373**
                         Sig (2 tailed)       0.000                                       0.000
Number of Employee       Pearson Correlation  0.293*                   0.373**            1
                         Sig (2 tailed)       0.034                    0.000

** Correlation is significant at the 0.01 level (2-tailed).
* Correlation is significant at the 0.05 level (2-tailed).
(i) Definition of correlation coefficient
• Correlation is used to look at the ‘net strength’ of the relationship between two continuous variables (Sweet and Martin, 2008).
• A correlation coefficient shows the direction, strength, and significance of the bivariate relationships among all the variables that were measured at an interval or ratio level.
• There could be a perfect positive correlation between two variables, represented by 1.0 (plus 1), or a perfect negative correlation, which would be -1.0 (minus 1).
• It does not tell us which variable causes which, but it tells us that the two variables are associated with each other.

(ii) Explanation of the study’s correlation analysis
• There is a moderate positive correlation (r = 0.539), a substantial relationship, between average employee salary and total expenditure. This relationship is significant at the 0.01 level.
• Average employee salary has a low correlation (r = 0.293), a definite but small relationship, with number of employees. With p = 0.034, this relationship is significant at the 0.05 level but not at the 0.01 level.
• Total expenditure and number of employees also have a low correlation (r = 0.373), a definite but small relationship, which is significant at the 0.01 level.



Correlation Strength Based on Guilford’s Law

r             Strength of relationship
< 0.20        Almost negligible relationship
0.20 – 0.40   Low correlation; definite but small relationship
0.40 – 0.70   Moderate correlation; substantial relationship
0.70 – 0.90   High correlation; marked relationship
> 0.90        Very high correlation; very dependable relationship
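The table above can be turned into a small lookup, sketched below. Because the bands in the table share their end points, this example assumes that a boundary value of 0.40 or 0.70 belongs to the higher band and that exactly 0.90 counts as high; that rule and the function name are assumptions for illustration. The three coefficients are the ones reported in Table 2.

```python
def guilford_strength(r):
    """Classify a correlation coefficient by its absolute size."""
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if size < 0.20:
        return "Almost negligible relationship"
    if size < 0.40:
        return "Low correlation; definite but small relationship"
    if size < 0.70:
        return "Moderate correlation; substantial relationship"
    if size <= 0.90:
        return "High correlation; marked relationship"
    return "Very high correlation; very dependable relationship"


# Coefficients reported in Table 2.
for r in (0.539, 0.293, 0.373):
    print(r, "->", guilford_strength(r))
```

The sign of r is ignored on purpose: the table describes strength, while the sign describes direction.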



11.3.3 Interpretation of Regression Analysis
• Regression analysis is used to measure what percentage of the variance in the dependent variable can be explained by the independent variables.
• Model case 1
The table below shows the result of a regression analysis of four independent variables regressed against customer satisfaction.

Table 3: Results of Regression Analysis (Customer Satisfaction)

                   Unstandardized Coefficients  Standardized Coefficients
Model              B       Std. Error           Beta                       t      Sig
(Constant)         1.483   .290                                            5.114  .000
Product Quality    .235    .069                 .277                       3.432  .001
Customer Service   .024    .082                 .026                       .285   .776
Pricing            .198    .076                 .223                       2.620  .010
Promotion          .351    .080                 .161                       1.977  .025

F value       9.349
Sig           .000
Adjusted R²   .181
R²            .203
(i) Model fit / Coefficient of determination (R²)
• R² indicates the percentage of variance in the dependent variable that is explained by the variation in the independent variables.
• The R² of 0.203 implies that the independent variables together explain 20.3 percent of the variance in the dependent variable.
• The remaining 79.7 percent of the variance in the dependent variable is not explained by the independent variables in this study. This indicates that there are other independent variables which are not included in this study and which could further strengthen the regression equation.

(ii) Adjusted R²
• The adjusted R² is a modification of R² that penalizes the addition of independent variables (IVs) to the model.
• Here the adjusted R² is 0.181, slightly lower than the R² of 0.203. After accounting for the number of independent variables, the model explains about 18.1 percent of the variance in the dependent variable.
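The penalty can be made explicit with the usual formula, adjusted R² = 1 - (1 - R²)(n - 1) / (n - k - 1), where n is the sample size and k the number of independent variables. The sample size is not reported in the source, so n = 150 below is purely a hypothetical value chosen for illustration; R² = 0.203 and k = 4 come from Table 3.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n cases and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more cases than independent variables plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


R2 = 0.203             # from Table 3
K = 4                  # four independent variables in Table 3
N_HYPOTHETICAL = 150   # assumption: the sample size is NOT given in the source

adj = adjusted_r2(R2, N_HYPOTHETICAL, K)
print(f"adjusted R2 = {adj:.3f}")
```

The formula shows why the adjustment matters most for small samples with many predictors: as k approaches n, the penalty factor grows and the adjusted value falls well below R².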

(iii) Model significance
• The F-test is significant (F = 9.349, Sig = 0.000). Hence the independent variables, taken together, significantly explain the dependent variable.

(iv) Parameter significance (t-test)
• The p-value for the product quality variable is 0.001 (0.1%), which is below the 5% significance level. Therefore, the product quality variable is significant, and product quality is positively related to the dependent variable.
• The customer service variable is not significant, because its p-value is 0.776 (77.6%), which is above the 5% significance level. Hence, there is no evidence that customer service is related to the dependent variable.
• The pricing variable has a p-value of 0.010 (1%), which is below the 5% significance level. Therefore, the pricing variable is significant, and pricing is positively related to the dependent variable.
• The promotion variable has a p-value of 0.025 (2.5%), which is below the 5% significance level. Therefore, the promotion variable is significant, and promotion is positively related to the dependent variable.
(v) Unstandardized Coefficients (B)
• They are the values in the regression equation used to predict the dependent variable from the independent variables.
• The B column provides the values of the coefficients (β0, β1, β2, and so on) for this equation:
  Customer Satisfaction = 1.483 + 0.235 Product Quality + 0.024 Customer Service + 0.198 Pricing + 0.351 Promotion
• For each one-unit increase in product quality, customer satisfaction increases by 0.235 units, holding the other independent variables constant.
• For each one-unit increase in customer service, customer satisfaction increases by 0.024 units, holding the other independent variables constant, although this coefficient is not statistically significant.
• For each one-unit increase in pricing, customer satisfaction increases by 0.198 units, holding the other independent variables constant.
• For each one-unit increase in promotion, customer satisfaction increases by 0.351 units, holding the other independent variables constant.
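The regression equation can be written as a small function to show what "holding the other variables constant" means in practice. The intercept and coefficients come from Table 3; the function name and the respondent's scores are hypothetical.

```python
INTERCEPT = 1.483  # (Constant) from Table 3
COEFFICIENTS = {   # unstandardized B values from Table 3
    "Product Quality": 0.235,
    "Customer Service": 0.024,
    "Pricing": 0.198,
    "Promotion": 0.351,
}


def predict_satisfaction(scores):
    """Predicted customer satisfaction for one set of predictor scores."""
    return INTERCEPT + sum(b * scores[name] for name, b in COEFFICIENTS.items())


# Hypothetical respondent who rates every predictor 3.
baseline = {name: 3 for name in COEFFICIENTS}
better_quality = {**baseline, "Product Quality": 4}  # only one predictor changes

base_score = predict_satisfaction(baseline)
new_score = predict_satisfaction(better_quality)
print(f"{base_score:.3f} -> {new_score:.3f} (change {new_score - base_score:.3f})")
```

Raising product quality by one unit while leaving the other three scores untouched changes the prediction by exactly the product quality coefficient.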

(vi) Standardized Beta Coefficients
• The beta uses a standard unit that is the same for all variables in the equation.
• It conveys the same information as the unstandardized coefficient but is expressed in standard deviations.
• As product quality increases by one standard deviation, customer satisfaction increases by 0.277 of a standard deviation.
• As customer service increases by one standard deviation, customer satisfaction increases by 0.026 of a standard deviation.
• As pricing increases by one standard deviation, customer satisfaction increases by 0.223 of a standard deviation.
• As promotion increases by one standard deviation, customer satisfaction increases by 0.161 of a standard deviation.
• Therefore, the strongest predictor is the product quality variable, with a beta weight of 0.277. The second is the pricing variable, with a beta weight of 0.223. The weakest significant predictor is promotion, with a beta weight of 0.161. The customer service variable does not explain the variance in customer satisfaction significantly.

(vii) Recommendation
• The company should ensure that employees consistently produce high-quality products in order to maintain customer satisfaction.
• The company needs to offer competitive prices and effective promotion to attract customers.

(viii) Future research
• Future studies should use other variables that may contribute to customer satisfaction.
• Future studies could also examine moderating and mediating variables that would influence the relationship between the independent variables and the dependent variable.
