
Descriptive Statistics

Used when we have data for the whole population (all of the society's data).
We only summarize the data, so the results involve no sampling error.

SCALE variables
DESCRIPTIVES: Mean, Sum, Range, Max, Min, Standard deviation, Skewness,
Kurtosis. Check outlier candidates using standardized values (z-scores).
EXPLORE: Mean, Sum, Range… Check normality. Check outlier candidates
using box plots.
CATEGORICAL variables (nominal/ordinal): Frequency, Percentage of values
FREQUENCIES: For each variable alone, display the percentage and count of
each value, with a bar chart, pie chart or histogram.
CROSSTABS: For 2 or more intersected variables, display percentages and counts.
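
The scale-variable summaries listed above can be sketched in a few lines of Python. This is only an illustration: the function name `describe`, the `z_cutoff` parameter and the data are assumptions, and the skewness shown is the simple moment-based version, which can differ slightly from the bias-adjusted figure a statistics package prints.

```python
import statistics

def describe(values, z_cutoff=3.0):
    """Summarize a scale variable and flag outlier candidates by z-score."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Simple moment-based skewness: third central moment / (second moment)^1.5
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skewness = m3 / m2 ** 1.5
    # Standardized value (z-score) of every observation
    z_scores = [(v - mean) / sd for v in values]
    outliers = [v for v, z in zip(values, z_scores) if abs(z) > z_cutoff]
    return {
        "mean": mean, "sum": sum(values),
        "min": min(values), "max": max(values),
        "range": max(values) - min(values),
        "stdev": sd, "skewness": skewness,
        "outlier_candidates": outliers,
    }

# Hypothetical data: the last value lies far from the rest
scores = [10, 12, 11, 13, 12, 11, 10, 12, 11, 48]
summary = describe(scores, z_cutoff=2.5)
print(summary)
```

A cutoff of 2.5 is used here because, in a sample of only 10 values, no observation can reach a standardized value of 3.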

RATIO STATISTICS
Describe the ratio between two scale variables.

Example of research question: Is there good uniformity in the ratio between
the appraisal price and sale price of homes in each of five counties?
Output: Median, mean, coefficient of dispersion (COD), median-centered
coefficient of variation, mean-centered coefficient of variation, minimum
and maximum values, and the concentration index computed for a user-specified
range or percentage around the median ratio.
We can determine, for example, which township's housing values have changed
the most:
Median ratios closer to 1 indicate the least change.
Larger COD values indicate greater variability.

The within-%-of-median coefficient of concentration (COC) also measures
variability: it simply reports the percentage of values that fall within a
certain percentage of the median. Larger values of this statistic indicate
less variability.
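
As a rough sketch of how the COD and COC are obtained, the snippet below computes both for a handful of made-up appraisal and sale prices. The function name `ratio_statistics`, the `within_pct` parameter and the numbers are assumptions for illustration. The COD is taken here as the average absolute deviation of the ratios from their median, expressed as a percentage of the median.

```python
import statistics

def ratio_statistics(appraised, sold, within_pct=20.0):
    """Median ratio, COD and COC for appraisal/sale price ratios."""
    ratios = [a / s for a, s in zip(appraised, sold)]
    median = statistics.median(ratios)
    # COD: mean absolute deviation from the median, as a % of the median
    abs_dev = [abs(r - median) for r in ratios]
    cod = 100.0 * statistics.fmean(abs_dev) / median
    # COC: percentage of ratios lying within +/- within_pct % of the median
    low = median * (1 - within_pct / 100.0)
    high = median * (1 + within_pct / 100.0)
    coc = 100.0 * sum(low <= r <= high for r in ratios) / len(ratios)
    return {"median": median, "cod": cod, "coc": coc}

# Hypothetical prices (in thousands)
appraised = [90, 100, 110, 150]
sold = [100, 100, 100, 100]
stats = ratio_statistics(appraised, sold, within_pct=10.0)
print(stats)
```

Here the median ratio is 1.05, and only two of the four ratios fall within 10% of it, so the COC is 50%.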

INSTRUCTOR: MOHAMMED ABDUL KHALIQ DWIKAT EMAIL:dwikatmo@gmail.com


TOPIC: TESTS' SUMMARY DATE: 7/25/20 PAGE: 1 OF 6
PRETESTS SUMMARY (NORMALITY, LINEARITY, HOMOSCEDASTICITY)
1. Testing Normality
Under H0, the skewness and (excess) kurtosis are assumed to equal zero.
H0: The population (for variable x) is normally distributed.
Ha: The population (for variable x) is NOT normally distributed.
If Sig <= 0.05 (reject H0): the data are not normally distributed.
If Sig > 0.05 (don’t reject H0): the data can be treated as normally distributed.
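
The notes do not name a specific normality test. As an illustration only, the sketch below uses the Jarque-Bera statistic, chosen here because it is built directly from the skewness and excess kurtosis mentioned above; statistical packages commonly report Shapiro-Wilk or Kolmogorov-Smirnov instead. The p-value relies on a large-sample chi-square approximation with 2 degrees of freedom, so it is unreliable for small samples. The function name and the data are assumptions.

```python
import math

def jarque_bera(values):
    """Jarque-Bera statistic and its asymptotic p-value (chi-square, 2 df)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # A chi-square variable with 2 df has survival function exp(-x / 2)
    sig = math.exp(-jb / 2.0)
    return jb, sig

# Hypothetical samples: one symmetric, one with a single extreme value
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed = [1] * 20 + [50]

jb_sym, sig_sym = jarque_bera(symmetric)
jb_skw, sig_skw = jarque_bera(skewed)
print(sig_sym > 0.05)   # H0 of normality is not rejected
print(sig_skw <= 0.05)  # H0 of normality is rejected
```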

2. Testing Linearity
Simple Linear Regression: y = aX + b
H0: a = 0 (the slope of the best-fit line = 0)
Ha: a ≠ 0 (the slope of the best-fit line ≠ 0)
If Sig <= 0.05 (reject H0): there is a significant linear relationship.
If Sig > 0.05 (don’t reject H0): there is no significant linear relationship.

The null hypothesis of correlation/linear regression is that the slope of the
best-fit line is equal to zero; in other words, as the X variable gets larger,
the associated Y variable gets neither higher nor lower.

3. Testing Homoscedasticity / Correlation
Example of research question: Is there a relationship between age and optimism
scores? Does optimism increase with age?
The null hypothesis (H0) and alternative hypothesis (Ha) of the significance test
for correlation can be expressed as follows:
H0: ρ = 0 (the population correlation coefficient is 0; there is no association)
Ha: ρ ≠ 0 (the population correlation coefficient is not 0; a nonzero correlation could exist)
If Sig <= 0.05 (reject H0):
there is a significant correlation between X and Y.
If Sig > 0.05 (don’t reject H0):
there is no significant correlation between X and Y.
The strength of the correlation coefficient is interpreted as follows:
Range          Explanation    Same for negatives
[0.0 – 0.3[    Negligible     0 to –0.3
[0.3 – 0.5[    Weak           –0.3 to –0.5
[0.5 – 0.7[    Intermediate   –0.5 to –0.7
[0.7 – 0.9[    Strong         –0.7 to –0.9
[0.9 – 1.0]    Very Strong    –0.9 to –1

If the correlation coefficient between X and Y is
Positive: an increase in the value of X is associated with an increase in the value of Y
Negative: an increase in the value of X is associated with a decrease in the value of Y
Use the Spearman correlation coefficient for 2 ordinal (ranked categorical) variables
Use the Pearson correlation coefficient for 2 scale variables
The correlation coefficient between a variable and itself (X and X) always = 1
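
A minimal sketch of the Pearson coefficient and the strength bands from the table above, using made-up age and optimism scores. The function names are assumptions, and the lowest band is labelled "Negligible" here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length scale variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Map |r| to the strength bands used in these notes."""
    size = abs(r)
    if size < 0.3:
        return "Negligible"
    if size < 0.5:
        return "Weak"
    if size < 0.7:
        return "Intermediate"
    if size < 0.9:
        return "Strong"
    return "Very Strong"

# Hypothetical data
age = [20, 30, 40, 50, 60]
optimism = [30, 28, 35, 40, 41]

r = pearson_r(age, optimism)
print(round(r, 3), strength(r))
print(pearson_r(age, age))  # a variable correlated with itself
```

The sign of `r` gives the direction of the relationship and its absolute value gives the strength, which is why the table applies equally to negative coefficients.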

Checking Reliability
Cronbach’s alpha measures internal consistency.
Variables used to calculate Cronbach’s alpha:
All variables related to our research
Exclude empty variables, one-value variables, serials, IDs and similar
Exclude scale measures, but don’t exclude Likert scale variables

For scales with few items, Cronbach’s alpha values can be quite small. In this
situation it may be better to calculate and report the mean inter-item
correlation for the items. Optimal mean inter-item correlation values range
from .2 to .4 (as recommended by Briggs & Cheek 1986).
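
For reference, Cronbach's alpha can be computed from the item variances and the variance of the total score: alpha = k/(k-1) * (1 - sum of item variances / variance of totals), where k is the number of items. The sketch below applies this to three hypothetical Likert items answered by five respondents; the function name and the data are assumptions.

```python
import statistics

def cronbach_alpha(items):
    """items: list of item columns, each a list of respondents' scores."""
    k = len(items)
    # Total score of each respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(statistics.variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / statistics.variance(totals))

# Hypothetical answers of 5 respondents to 3 Likert items
q1 = [2, 3, 4, 4, 5]
q2 = [1, 3, 3, 4, 5]
q3 = [2, 2, 4, 5, 5]

alpha = cronbach_alpha([q1, q2, q3])
print(round(alpha, 3))
```

Items that rise and fall together make the variance of the totals large relative to the item variances, which pushes alpha toward 1.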

Likert Scale
What is Likert Scale Data?
An evaluation on a 5-level scale, a 3-level scale, or a scale with any other number of levels.

Average Explanation – 5 Level Scale
Range          Meaning (–ve)        Meaning (+ve)
[1.0 – 1.8[    Strongly agree       Strongly disagree
[1.8 – 2.6[    Agree                Disagree
[2.6 – 3.4[    Neutral              Neutral
[3.4 – 4.2[    Disagree             Agree
[4.2 – 5.0]    Strongly disagree    Strongly agree

Average Explanation – 3 Level Scale
Range            Meaning
[1.00 – 1.66[    Agree
[1.66 – 2.33[    Neutral
[2.33 – 3.00]    Disagree
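
The ranges in both tables appear to come from one rule: the span from 1 to the number of levels is cut into equal intervals of width (levels - 1) / levels, which gives 0.8 for five levels and about 0.67 for three. A small sketch of that rule follows; the function name and the label lists are assumptions.

```python
def likert_label(mean_score, labels):
    """Return the label whose interval contains the mean score.

    labels must be ordered from the meaning of 1 up to the meaning of
    the highest level.
    """
    levels = len(labels)
    if not 1 <= mean_score <= levels:
        raise ValueError("mean score must lie between 1 and the number of levels")
    width = (levels - 1) / levels  # 0.8 for 5 levels
    # The top of the scale belongs to the last (closed) interval
    index = min(int((mean_score - 1) / width), levels - 1)
    return labels[index]

five_level = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
three_level = ["Agree", "Neutral", "Disagree"]

print(likert_label(3.9, five_level))
print(likert_label(2.0, three_level))
```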

Inferential Statistics
Used when we have a sample and want to generalize the results to a
population. Generalization carries a risk of error, called alpha. We state a
hypothesis that we want to either reject or retain.

Nonparametric Tests
Binomial Test (one categorical variable with only 2 values)
Example of research question: Is the proportion of female spiders equal to 0.75?
H0: proportion of female spiders = 0.75
Ha: proportion of female spiders ≠ 0.75
When performing the test, the H0 proportion applies to the first category, so the
tested value (here, female) should be the value of the first case.
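
An exact binomial test needs nothing beyond the binomial formula. The sketch below uses one common two-sided convention, assumed here: the Sig value is the total probability, under H0, of all outcomes that are no more likely than the one observed. The function name and the spider counts are hypothetical.

```python
from math import comb

def binomial_test_two_sided(successes, n, p0):
    """Exact two-sided p-value for H0: proportion = p0."""
    # Probability of every possible count 0..n under H0
    pmf = [comb(n, k) * p0 ** k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Sum the outcomes that are no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    sig = sum(p for p in pmf if p <= observed * (1 + 1e-7))
    return min(1.0, sig)

# Hypothetical samples of 20 spiders
sig_15 = binomial_test_two_sided(15, 20, 0.75)  # 15 females: exactly 75%
sig_8 = binomial_test_two_sided(8, 20, 0.75)    # 8 females: only 40%
print(sig_15 > 0.05)   # don't reject H0
print(sig_8 <= 0.05)   # reject H0
```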

Chi-Square Goodness-of-Fit Test (one categorical/discrete variable with
2 or more values)
Example of research question: Are students equally interested in the different fields?
H0: The proportions of MIS, CIS and CS students are equal
Ha: The proportions of MIS, CIS and CS students are NOT equal
The same hypotheses can be worded as:
H0: Students are interested in MIS, CIS and CS equally
Ha: Students are interested in MIS, CIS and CS unequally

The test can also be used with hypotheses such as:
H0: There is no significant difference between the current smartphone
proportions and the preferred smartphone proportions among the students.
Ha: There is a significant difference between the current smartphone
proportions and the preferred smartphone proportions among the students.
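
To show what the goodness-of-fit test computes, the sketch below sums (observed - expected)^2 / expected over the categories for made-up counts of MIS, CIS and CS students. The function name and the counts are assumptions. The Sig line uses a closed form that holds only for 2 degrees of freedom (three categories); other cases need a chi-square table or a statistics package.

```python
import math

def chi_square_gof(observed, expected=None):
    """Chi-square goodness-of-fit statistic and degrees of freedom."""
    total = sum(observed)
    if expected is None:
        # H0 of equal proportions: the same expected count in every category
        expected = [total / len(observed)] * len(observed)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1

# Hypothetical counts of MIS, CIS and CS students
counts = [30, 40, 50]
statistic, df = chi_square_gof(counts)

# Valid only because df == 2: the survival function is exp(-x / 2)
sig = math.exp(-statistic / 2.0)
print(statistic, df, round(sig, 4))
```

With these counts the statistic is 5.0 and Sig is about 0.082, so H0 of equal interest is not rejected at the 0.05 level.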

Other Categorical Tests/measures (used in Crosstabs)


Chi-Square Test of Independence (two categorical variables, each with 2 or
more values)
Examples of research questions:
Are older people more optimistic than younger people?
Is there an association between gender and smoking behavior?
Are males more likely to be smokers than females?
Is the proportion of males that smoke the same as the proportion of females?
H0: X is independent of Y
(There is no significant association between X and Y.)
Ha: X is NOT independent of Y
(There is a significant association between X and Y.)

H0: Obesity is independent of eating junk meals
Ha: Obesity is NOT independent of eating junk meals
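
A sketch of the independence test on a hypothetical 2x2 crosstab of junk-meal eating against obesity. Expected counts are row total * column total / grand total. The function name and the counts are assumptions, no continuity correction is applied, and the Sig line uses a closed form valid only for 1 degree of freedom (a 2x2 table).

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and df for a crosstab (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows = eats junk meals (yes, no),
# columns = obese (yes, no)
crosstab = [[30, 20],
            [10, 40]]
statistic, df = chi_square_independence(crosstab)

# Valid only because df == 1: P(chi-square > x) = erfc(sqrt(x / 2))
sig = math.erfc(math.sqrt(statistic / 2.0))
print(round(statistic, 3), df, sig <= 0.05)
```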

McNemar’s Test (two categorical variables, each with 2 values (Yes/No), that measure
the same feature at 2 different times to see the effect of an intervention)
Example of research question: Is there a change in the proportion of the sample
diagnosed with clinical depression prior to, and following, the intervention?

When you have matched or repeated-measures designs (e.g. pre-test/post-test),
you cannot use the usual chi-square test. Instead, you need to use
McNemar’s Test. In the health and medical area the variable might be the presence or
absence of some health condition (0=absent; 1=present), while in a political
context it might be the intention to vote for a particular candidate (0=no,
1=yes) before and after a campaign speech.
H0: There is NO significant change in the proportion of participants diagnosed as
clinically depressed following the program
Ha: There is a significant change in the proportion of participants diagnosed as
clinically depressed following the program
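
McNemar's test looks only at the participants whose answer changed between the two measurements. A minimal sketch, assuming the large-sample chi-square form without continuity correction (packages may instead report an exact binomial version for small samples); the function name and the counts are hypothetical.

```python
import math

def mcnemar(yes_to_no, no_to_yes):
    """McNemar chi-square statistic (1 df) from the two 'changed' cells."""
    statistic = (yes_to_no - no_to_yes) ** 2 / (yes_to_no + no_to_yes)
    # Chi-square with 1 df: P(chi-square > x) = erfc(sqrt(x / 2))
    sig = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, sig

# Hypothetical counts: 15 participants were depressed before the program
# but not after it, and 5 were not depressed before but were after it.
statistic, sig = mcnemar(yes_to_no=15, no_to_yes=5)
print(statistic, round(sig, 4))
```

Participants who gave the same answer both times do not enter the statistic at all.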

COCHRAN’S Q TEST
McNemar’s Test, described in the previous section, is suitable if you
have only two time points. If you have three or more time points, you will
need to use Cochran’s Q Test.

Example of research question: Is there a change in the proportion of
participants diagnosed with clinical depression across the three time
points: (a) prior to the program, (b) following the program and (c) three
months post-program?

Three categorical variables measuring the same characteristic (e.g.
presence or absence of the characteristic: 0=no, 1=yes), collected from each
participant at different time points.
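
Cochran's Q can be computed from the column totals (one per time point) and the row totals (one per participant): Q = (k-1) * (k * sum of squared column totals - N^2) / (k * N - sum of squared row totals), where k is the number of time points and N is the grand total of 1s. The sketch below uses made-up 0/1 data for six participants; the function name is an assumption, and the Sig line is valid only for three time points (2 degrees of freedom).

```python
import math

def cochran_q(rows):
    """Cochran's Q and df; rows = one list of 0/1 values per participant."""
    k = len(rows[0])                      # number of time points
    col_totals = [sum(col) for col in zip(*rows)]
    row_totals = [sum(row) for row in rows]
    n = sum(row_totals)                   # grand total of 1s
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - n * n)
    denominator = k * n - sum(r * r for r in row_totals)
    return numerator / denominator, k - 1

# Hypothetical data: depressed (1) or not (0) at the three time points
data = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
q, df = cochran_q(data)

# Valid only because df == 2: the survival function is exp(-x / 2)
sig = math.exp(-q / 2.0)
print(q, df, round(sig, 4))
```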

Risk (Odds Ratio) (two categorical variables, each with 2 values (Yes/No))
Quantifies how strongly the presence or absence of property A is associated
with the presence or absence of property B in a given population, where each
individual in the population either does or does not have property "A" and
either does or does not have property "B".

It gives us information such as:
If you have lung cancer, your odds of being a smoker are 81% higher than if you
didn’t have lung cancer.
If you smoke, your odds of having lung cancer are 81% higher than if you
didn’t smoke.
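
For a 2x2 table with cells a, b (first row) and c, d (second row), the odds ratio is (a * d) / (b * c) and the relative risk is [a / (a + b)] / [c / (c + d)]. The sketch below uses made-up smoking and lung cancer counts; the function names and the numbers are assumptions and are not the data behind the 81% figure above.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk in the first row divided by risk in the second row."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts:       cancer   no cancer
#   smokers                    40        60
#   non-smokers                20        80
or_smoking = odds_ratio(40, 60, 20, 80)
rr_smoking = relative_risk(40, 60, 20, 80)

# Swapping the roles of the two variables (transposing the table)
# leaves the odds ratio unchanged
or_transposed = odds_ratio(40, 20, 60, 80)

print(round(or_smoking, 3), round(rr_smoking, 3), round(or_transposed, 3))
```

The odds ratio is the same in both directions, which is why the two statements above share one figure. An odds ratio of 1 means no association.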

COMPARING MEANS FOR SCALE VARIABLES (FOR NORMALLY DISTRIBUTED DATA)
One-Sample T Test (one scale variable)
Example of research question: Is there a significant difference between the
average exam score and 70?
H0: Average weight of herring’s body = 400 grams
Ha: Average weight of herring’s body ≠ 400 grams

Independent-Samples T Test (two variables: one scale test variable and one
discrete grouping variable with only 2 values)
Example of research question: Is there a significant difference in the mean
self-esteem scores for males and females?
H0: Average amount spent for males = Average amount spent for females
Ha: Average amount spent for males ≠ Average amount spent for females

Paired-Samples T Test (two scale variables that measure the same feature, one
before and one after an action)
Example of research question: Does the medicine have a significant effect on
lowering the average blood sugar level?

H0: Average reaction time before drinking a beer = Average reaction time
after drinking a beer
Ha: Average reaction time before drinking a beer ≠ Average reaction time
after drinking a beer
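
The paired-samples t statistic is a one-sample t test applied to the differences: t = mean difference / (standard deviation of the differences / sqrt(n)), with n - 1 degrees of freedom. The sketch below computes only t and df for made-up reaction times; the Sig value would then be read from a t table or a statistics package. The function name and the data are assumptions.

```python
import math
import statistics

def paired_t(before, after):
    """t statistic and degrees of freedom for paired measurements."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    # Standard error of the mean difference (sample sd uses n - 1)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / standard_error, n - 1

# Hypothetical reaction times in milliseconds for 5 participants
before_beer = [200, 210, 190, 205, 195]
after_beer = [210, 225, 195, 220, 200]

t, df = paired_t(before_beer, after_beer)
print(round(t, 3), df)
```

Here t is about 4.47 with 4 degrees of freedom, which exceeds the two-sided 5% critical value of 2.776, so H0 would be rejected for these made-up data.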

One-Way ANOVA (one scale variable, one discrete variable with multiple values)
Example of research question: Is there a difference in optimism scores for young,
middle-aged and old participants?

H0: Average weight of parsley plants is equal among the fertilizers used
Ha: Average weight of parsley plants is not equal among the fertilizers used

Simple Linear Regression (two scale variables: one is independent (input) and
the other is dependent (output))
Example of research question: How much of the variance in life satisfaction scores
can be explained by self-esteem?
life satisfaction = a * self-esteem + b
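
A least-squares sketch of the line y = a * x + b and of R squared, the share of variance in y explained by x. The function name and the scores are assumptions for illustration.

```python
def simple_linear_regression(x, y):
    """Least-squares slope a, intercept b and R^2 for y = a * x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    a = sxy / sxx                  # slope
    b = mean_y - a * mean_x        # intercept
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared

# Hypothetical scores
self_esteem = [1, 2, 3, 4, 5]
life_satisfaction = [2, 4, 5, 4, 5]

a, b, r_squared = simple_linear_regression(self_esteem, life_satisfaction)
print(round(a, 3), round(b, 3), round(r_squared, 3))
```

With these numbers the fitted line is life satisfaction = 0.6 * self-esteem + 2.2 and R squared is 0.6, so 60% of the variance is explained.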
Multiple Linear Regression (3 or more scale variables: two or more are
independent (inputs) and one is dependent (output))
Example of research question:
How much of the variance in life satisfaction scores can be explained by the
following set of variables: self-esteem, optimism and perceived control?
Which of these variables is a better predictor of life satisfaction?
life satisfaction = a1 * self-esteem + a2 * optimism + a3 * perceived control + b

If the data are not normally distributed, we should use alternative
(nonparametric) methods such as the Wilcoxon Signed Rank Test, Kruskal-Wallis
Test, Friedman Test, Mann-Whitney U Test and others.
Parametric Technique               Nonparametric Technique
Independent-samples t-test         Mann-Whitney U Test
Paired-samples t-test              Wilcoxon Signed Rank Test
One-way between-groups ANOVA       Kruskal-Wallis Test
One-way repeated-measures ANOVA    Friedman Test
