Quantitative Data Analysis - Meita L K PDF


Quantitative Data Analysis
Meita Lesmiaty K
2208555
What We Are Going to Discuss
• 9.1 Computerized Data Analysis and SPSS
• 9.2 Preparing the Data for the Analysis
• 9.3 Data Reduction and Reliability Analysis
• 9.4 Key Statistical Concepts
• 9.5 Descriptive Statistics
• 9.6 Comparing Two Groups: T-Test
• 9.7 Comparing More than Two Groups: Analysis of Variance (ANOVA)
• 9.8 Correlation
• 9.9 Non-Parametric Tests
• 9.10 Advanced Statistical Procedures
Introduction
The Types
• Simple descriptive statistics (e.g. calculating the mean)
• Complex multivariate procedures

The Process
1. Arranging the research questions
2. Collecting the quantitative data
3. Analyzing the data using a set of mathematical procedures (statistics)
4. Result: accepting or rejecting the hypotheses
General Characteristics
Quantitative data analysis produces relatively straightforward results
because there are well-defined procedures guided by universally accepted
canons, and the computer does most of the detailed mathematical work.
The Prerequisite Procedures
9.1. SELECT AND LEARN TO USE A STATISTICAL PROGRAM

• Microsoft Excel
• SPSS (Statistical Package for the Social Sciences)
• https://www.sas.com/en_us/software/stat.html
• https://www.qlik.com/us/data-analytics/data-analytics-tools
• https://www.techjockey.com/blog/free-statistical-software#datamelt
• https://www.predictiveanalyticstoday.com/top-statistical-software/

9.2. PREPARE THE DATA FOR THE ANALYSIS

There are several steps that researchers must take in preparing data for
analysis, moving “from a stack of disorganized hard copy to online data
that are trustworthy” (Davidson, 1996).
9.2.1 Coding Quantitative Data
Quantitative coding is the process of categorizing the collected
non-numerical information into groups and assigning numerical codes to
these groups. Numeric coding is shared by all statistical software and,
among other things, it facilitates data conversion and measurement
comparisons.

Example.

Gender                  Code
Male                    1
Female                  2

Questionnaire Result    Code
Strongly Agree          4
Agree                   3
Neutral                 2
Disagree                1
Strongly Disagree       0
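
The coding tables above can be expressed as simple lookup tables. The following is a minimal sketch, not a prescribed procedure: the names GENDER_CODES, LIKERT_CODES and code_responses, as well as the sample answers, are made up for illustration.

```python
# Code frames that mirror the example tables above.
GENDER_CODES = {"Male": 1, "Female": 2}
LIKERT_CODES = {
    "Strongly Agree": 4,
    "Agree": 3,
    "Neutral": 2,
    "Disagree": 1,
    "Strongly Disagree": 0,
}


def code_responses(responses, code_frame):
    """Replace each text answer with its numeric code; unknown answers become None."""
    return [code_frame.get(answer.strip()) for answer in responses]


# Made-up raw answers, for illustration only.
raw_gender = ["Male", "Female", "Female"]
raw_item = ["Agree", "Strongly Agree", "Disagree"]

coded_gender = code_responses(raw_gender, GENDER_CODES)
coded_item = code_responses(raw_item, LIKERT_CODES)
print(coded_gender, coded_item)
```

Returning None for an unrecognized answer, rather than guessing a code, keeps such entries visible for the later screening step.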
9.2.2 INPUTTING THE DATA

1. Creating the data file. Save the file in numbered sequence after each
revision so that the versions will not be mixed up.

2. Defining coding frames for variables. Specifying the variable and the
value will help the researcher remember the details when he comes back to
the data at a later stage.

3. Keying in the data. Find a partner to help with inputting the numbers
to avoid mistakes.

9.2.3 DATA SCREENING AND CLEANING

The initial data usually contain mistakes.

1. Correcting impossible data
2. Correcting incorrectly entered values
3. Correcting contradictory data
4. Dealing with outliers

Outliers are extreme values that differ from most other data points in a
dataset. They can have a big impact on your statistical analyses and skew
the results of any hypothesis tests.
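
Two of the screening steps above, spotting impossible values and flagging outliers, can be sketched in a few lines. This is only an illustration under stated assumptions: the 1.5 × IQR rule is one common rule of thumb for outliers (the slides do not name a specific rule), and the function names and data are made up.

```python
import statistics


def find_impossible(values, low, high):
    """Return (index, value) pairs that fall outside the valid coding range."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]


def find_outliers_iqr(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# A questionnaire item coded 0-4: the value 7 cannot be a valid code.
likert_item = [4, 3, 0, 7, 2, 1]
impossible = find_impossible(likert_item, low=0, high=4)

# Made-up test scores with one extreme value.
test_scores = [61, 64, 66, 67, 68, 70, 71, 73, 75, 120]
outliers = find_outliers_iqr(test_scores)
print(impossible, outliers)
```

Whether a flagged outlier is corrected, removed, or kept is a judgment call for the researcher; the code only locates candidates.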
9.2.4 Data Manipulation
Data manipulation means making changes in the data set prior to the
analysis in order to make it more appropriate for certain statistical
procedures; it does not involve biasing the result.

1. Handling missing data
2. Re-coding negatively worded items
3. Standardizing the data

While working with disparate data, researchers need to organize, clean,
and transform it to use it in their decision-making process. This is where
data manipulation fits in.

Data manipulation allows researchers to manage and integrate data,
helping drive actionable insights.
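
The three manipulation steps listed above can each be sketched as a small function. This is a minimal illustration, not the only way to do it: mean substitution is just one simple option for missing data, the 0-4 scale is taken from the coding example earlier in the slides, and all function names and data are made up.

```python
import statistics


def fill_missing_with_mean(values):
    """Replace None with the mean of the observed values (one simple option)."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    return [mean if v is None else v for v in values]


def reverse_code(values, low=0, high=4):
    """Flip a negatively worded item so a high score always means the same thing."""
    return [low + high - v for v in values]


def standardize(values):
    """Convert raw scores to z-scores using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


filled = fill_missing_with_mean([3, None, 4, 1])
reversed_item = reverse_code([4, 3, 0])
z_scores = standardize([10, 20, 30])
print(filled, reversed_item, z_scores)
```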
9.3 Data Reduction and Reliability Analysis

The purpose of data reduction can be two-fold: reduce the number of data
records by eliminating invalid data, or produce summary data and
statistics at different aggregation levels for various applications.
Although data reduction affects the result, the impact is kept minimal.

Reliability relates to the consistency of a measure.

Ex. A participant completing an instrument meant to measure motivation
should have approximately the same responses each time the test is
completed. Although it is not possible to give an exact calculation of
reliability, an estimate of reliability can be achieved through different
measures.
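
One widely used estimate of reliability is Cronbach's alpha, an internal-consistency coefficient. The slides do not say which measure was intended, so the sketch below is an assumption chosen for illustration, and the item scores are made up.

```python
import statistics


def cronbach_alpha(item_scores):
    """item_scores: one list per questionnaire item, one score per participant."""
    k = len(item_scores)
    item_variances = [statistics.variance(item) for item in item_scores]
    # Total score of each participant across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Three items answered by five participants (made-up data, coded 0-4).
items = [
    [4, 3, 2, 4, 1],
    [4, 2, 2, 3, 1],
    [3, 3, 1, 4, 0],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together, that is, that they measure the same thing consistently.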
9.4 Key Statistical Concepts
These concepts are needed in order to choose the right procedure and
interpret the result correctly.

9.4.1. Main Types of Quantitative Data

1. Nominal or categorical: Associated with variables that have no
numerical values. The codes are fully arbitrary and do not indicate any
difference in size or salience. Ex. Gender (Male 1, Female 2)

2. Ordinal: Involves ranked numbers that do not correspond to any regular
measurement. Ex. Every day [5], twice a week [3], once a week [1]

3. Interval: Provides a series of values which correspond to equal
differences in the degree/size of the variable measured. Ex. Test result
9.4.2 NORMAL DISTRIBUTION OF THE DATA

The normal distribution is a probability function that shows the spread
of a variable. This function is generally represented by a symmetric graph
called a bell curve. When the distribution is even, the curve peaks in the
middle and slopes down equally on either side. The peak of the curve is
the mean value.

9.4.3 DESCRIPTIVE VS. INFERENTIAL STATISTICS

Descriptive statistics describe a chunk of raw data using summary
statistics, graphs, and tables.

Inferential statistics use a small sample of data to draw inferences
about the larger population that the sample came from.
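
As a small illustration of the bell curve of the normal distribution, the sketch below uses Python's statistics.NormalDist to show how much of the data lies near the mean. The mean of 70 and standard deviation of 10 are hypothetical values chosen for the example.

```python
from statistics import NormalDist

# Hypothetical test scores: mean 70, standard deviation 10.
scores = NormalDist(mu=70, sigma=10)


def share_within(dist, n_sd):
    """Proportion of a normal distribution within n_sd standard deviations of the mean."""
    upper = dist.cdf(dist.mean + n_sd * dist.stdev)
    lower = dist.cdf(dist.mean - n_sd * dist.stdev)
    return upper - lower


within_1sd = share_within(scores, 1)  # roughly 68% of cases
within_2sd = share_within(scores, 2)  # roughly 95% of cases
print(round(within_1sd, 3), round(within_2sd, 3))
```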
9.5 Descriptive Statistics
Descriptive statistics help describe and understand the features of a
specific data set by giving short summaries about the sample and
measures of the data.
The most recognized types of descriptive statistics are measures of
center (the mean, median, and mode) and measures of spread (the range and
variance).
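
The measures named above can be computed directly with Python's standard statistics module. The scores below are made up for illustration.

```python
import statistics

# Made-up test scores for five students.
scores = [70, 75, 75, 80, 90]

summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "range": max(scores) - min(scores),
    "variance": statistics.variance(scores),  # sample variance (n - 1)
}
print(summary)
```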
9.6 Comparing Two Groups: T-Test
A parametric statistical test is a data testing technique that is useful
for testing hypotheses involving population parameters. Parametric
statistical tests should only be used when the data meet their
assumptions, such as homogeneity of variance.

The T test is a type of parametric statistical test that is commonly used
to test whether the mean of one sample group, or the difference between
the means of two sample groups, is significant.

The T Test is divided into 2 types:

1) The Independent Sample T Test

Used for 2 groups of data that are not related; the groups do not need to
be the same size. Ex. the first data group has 10 data points, while the
second data group only has 7.

2) The Paired Sample T Test

Used for 2 groups of data that are related and the same size. Ex.
analysis of differences in company sales in 2018 and 2019. If sales are
recorded monthly, each year has 12 months, so both data groups have 12
sales figures that can be paired month by month.
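
The two types of T test can be sketched as follows. This is a minimal illustration that only computes the t value and its degrees of freedom; judging significance still requires a t table or statistical software. The independent version assumes equal variances (pooled variance), and all data and function names are made up.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """t for two unrelated groups (pooled variance); returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se, df


def paired_t(first, second):
    """t for two related measurements of the same cases; returns (t, df)."""
    diffs = [s - f for f, s in zip(first, second)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se, len(diffs) - 1


# Unrelated groups may differ in size.
t_ind, df_ind = independent_t([5, 6, 7, 8, 9], [3, 4, 5])

# Paired data: the same four months measured in two different years.
sales_2018 = [10, 12, 9, 11]
sales_2019 = [12, 15, 10, 14]
t_pair, df_pair = paired_t(sales_2018, sales_2019)
print(round(t_ind, 3), df_ind, round(t_pair, 3), df_pair)
```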
9.7 Comparing More than Two Groups:
Analysis of Variance (ANOVA)
ANOVA is used as an analytical tool to test the research hypothesis that
there is a mean difference between more than two groups.

The final result of the ANOVA analysis is the value of the F test, or
calculated F. This calculated F value is then compared with the value in
the F table.

If the calculated F value is greater than the F table value, H0 is
rejected and H1 is accepted, which means that at least one group mean
differs significantly from the others.

Example.

A researcher wants to assess whether learning models A, B and C lead to
different learning outcomes in English subjects in grade 6. Each class
has between 40 and 50 students.

In this study, class 6A was given treatment A, class 6B was given
treatment B and class 6C was given treatment C.

After one semester of treatment, the learning outcomes of all classes (A,
B and C) were compared.
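
The calculated F value can be sketched as the ratio of between-group to within-group variance. This is an illustration only: the three tiny classes below are made up (the example in the slides has 40 to 50 students per class), and the F table value of 5.14 for 2 and 6 degrees of freedom at the 0.05 level is taken from a standard F table.

```python
import statistics


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up learning outcomes for three classes.
class_6a = [70, 72, 74]
class_6b = [75, 77, 79]
class_6c = [80, 82, 84]

f_value, df_between, df_within = one_way_anova_f(class_6a, class_6b, class_6c)
F_TABLE = 5.14  # F(2, 6) at alpha = 0.05, from a standard F table
reject_h0 = f_value > F_TABLE
print(f_value, df_between, df_within, reject_h0)
```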
Analysis of Covariance (ANCOVA)
The purpose of ANCOVA is to examine the effect of a treatment on a
response variable while controlling for other quantitative variables.

Ex. A study was conducted at a tertiary institution to find out whether
differences in teaching lecturers affect student grades in a course, for
example course A. At that college, 3 lecturers teach the same subject:
Lecturer I, Lecturer II, and Lecturer III.

In the case above, the variables used are:

1. The response variable (y): the score obtained by students for the
course

2. The treatment: the teaching lecturer (there are 3 categories)

In fact, there are other factors that also affect student scores, such as
IQ. Therefore, IQ is used as a control variable (covariate) to reduce the
error variance. For the purposes of this study, a sample of 12 students
was taken from each teaching lecturer. The data obtained are as follows.
9.8 Correlation
A correlation test is an analytical technique used to determine whether
there is a relationship between the 2 variables being tested.

The measure of closeness in this correlation test is usually called the
correlation coefficient or rho. The rho value ranges from -1 to 1.

If the rho value is close to -1 or 1, then the two variables have a
strong negative or positive correlation. Conversely, if the rho value is
close to 0, then the two variables tend to have a weak correlation or
even no correlation.

Four types of correlation are used in applied linguistics:

1) Pearson Product-Moment

Used to test or analyze the correlation or relationship of the
independent variable (X) with the dependent variable (Y), where the data
for both are on an interval or ratio scale.
Ex. "The correlation between reading habit and English vocabulary
mastery"

2) Point-biserial correlation and Phi coefficient correlation

Point-biserial correlation is used to understand the strength of the
relationship between two variables, where one variable is continuous and
the other is binary. The phi coefficient is used when both variables are
binary.

3) Partial correlation

Partial correlation measures the strength of the relationship between two
variables, while controlling for the effect of one or more of the other
variables.
Ex. "The Correlation between Grammar and Vocabulary Mastery toward
Speaking Skill among the Seventh Grade Students of MTSN 1 Kediri"

4) Multiple correlation
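
The Pearson product-moment coefficient can be sketched directly from its definition. The same function also gives the point-biserial correlation when one of the two variables is coded 0/1. The data and variable names below are made up for illustration.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Two interval variables: weekly reading hours and vocabulary scores.
reading_hours = [1, 2, 3, 4, 5]
vocab_scores = [52, 60, 61, 70, 77]
r = pearson_r(reading_hours, vocab_scores)

# Point-biserial: one binary variable (0/1) and one continuous variable.
took_course = [0, 0, 1, 1]
speaking_scores = [50, 60, 70, 80]
r_pb = pearson_r(took_course, speaking_scores)
print(round(r, 3), round(r_pb, 3))
```

Both results are close to 1, which would indicate a strong positive relationship in these made-up data.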
9.9 Non-Parametric Tests
There are two types of tests that can be used to determine whether an
observed difference between two groups or variables is significant.

1) Parametric tests, which make assumptions about the data and use
methods that build on those assumptions. These include t-tests and
analysis of variance (ANOVA), which differ in their assumptions about how
the data were generated.

2) Nonparametric tests, which make few assumptions about how the data
were generated and instead compare the observed values, often through
their ranks.

a) A non-parametric statistical test is a statistical test that does not
depend on the assumption that the underlying data are normally
distributed. In many cases, these tests use the rank-order information in
the data.

b) Non-parametric tests are especially useful when you have a sample of
data that doesn't meet the requirements for parametric testing.

c) When their assumptions are met, parametric tests are more sensitive
than nonparametric tests and can detect small differences between groups
or variables that nonparametric tests may miss.
Types of Non-Parametric Test
1. Chi-Square Test
The working principle of Chi-Square is to compare two variables whose
data scale is nominal. The chi-square test is appropriate only for
sufficiently large samples. This test is carried out by tabulating the
variables into categories and then calculating the Chi-Square statistic.
2. Spearman’s Rank Order Correlation
3. Mann-Whitney U Test
4. Wilcoxon Signed-Rank Test
5. Kruskal-Wallis Test
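
For the Chi-Square test in the list above, tabulating two nominal variables and calculating the statistic can be sketched as follows. The 2 x 2 table of counts is made up, and the critical value of 3.84 (1 degree of freedom, 0.05 level) is taken from a standard chi-square table.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, df


# Made-up counts: rows = gender, columns = preferred learning model.
table = [
    [30, 20],
    [20, 30],
]
statistic, df = chi_square(table)
CRITICAL_VALUE = 3.84  # df = 1, alpha = 0.05, from a standard chi-square table
significant = statistic > CRITICAL_VALUE
print(statistic, df, significant)
```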
9.10 Advanced Statistical Procedures
1. Two-Way ANOVA
2. Factor Analysis
3. Cluster Analysis
4. Structural Equation Modelling
5. Meta-Analysis
