

Psychological Statistics: Chapter 1 to 4

Psychological Statistics (Our Lady of Fatima University)

StuDocu is not sponsored or endorsed by any college or university


Downloaded by Maurice Villafranca (maurice.villafranca@ub.edu.ph)

PSYCHOLOGICAL STATISTICS I

Chapter 1 -- Statistics

In Statistics, we:

1. Organize Data

TEAM A               TEAM B
11                   4
13                   8
15                   12
17                   16
19                   20

2. Describe Data (describe the specific phenomenon)

TEAM A               TEAM B
(aggression level)   (pacifism level)
11                   4
13                   8
15                   12
17                   16
19                   20

3. Make Inferences based upon the Data

TEAM A               TEAM B
(aggression level)   (pacifism level)
11                   4
13                   8
16                   12
17                   16
19                   21

x̄ = 14.8             x̄ = 12.2

(compare the means; mean = average)

14.8 – 12.2 = 2.6 (difference between the averages)

TYPES OF RESEARCH

Qualitative Research
• understanding human behavior from the informant’s perspective
• data are collected through participant observation and interviews
• is interpretative and takes a naturalistic approach to its subject matter (Denzin and Lincoln, 1994)
• aims to understand the social reality of individuals, groups and cultures
• asks “how” and “why” a particular phenomenon happens

Quantitative Research
• discovering facts about social phenomena
• assumes a fixed and measurable reality
• data are collected through numerical comparisons
• data are put into categories, in rank order, or measured in units of measurement
• aims to establish general laws of behavior and phenomena across different settings/contexts
• research is used to test a theory and ultimately support it or reject it


TWO MAIN METHODS OF RESEARCH

1. Deduction – general theory to particular data
2. Induction – particular data to a general theory

THEORY AND HYPOTHESIS

• Theory - explanation or set of principles that is well substantiated by repeated testing and explains a broad phenomenon

e.g. People diagnosed with depression prefer to sleep more and stay at home than those who are not depressed

• Hypothesis - a proposed explanation for a fairly narrow phenomenon or set of observations

e.g. People diagnosed with depression prefer to sleep and stay at home often because they are easily drained of energy

VARIABLES IN RESEARCH

1. Independent Variable
- the cause of some effect
- manipulated by the experimenter
2. Dependent Variable
- affected by changes in an independent variable

LEVELS OF MEASUREMENT

1. Nominal - words, letters, and alpha-numeric symbols used as labels

e.g. Female = F Male = M Transgender = T

2. Ordinal – ranking

e.g. Student A with a grade of 98 = 1st Rank
Student B with a grade of 94 = 2nd Rank

3. Interval - equal distances between each interval on the scale

e.g. The measured stage fright levels of 20 and 22 for student A are the same distance apart as student B’s 40 and 42

4. Ratio - equal intervals with a true zero point

e.g. 6 people are randomly selected and asked how much money they have with them. The results are: ₱120, ₱240, ₱360, ₱480 and ₱600

RESEARCH DESIGN

• Correlational Research
- provides a very natural view of the question we’re researching because we do not interfere with the variables

• Experimental Research
- by John Stuart Mill
- an effect should be present when the cause is present as well
- an effect should be absent when the cause is also absent

RANDOMIZATION

1. Random Selection - any member of a population has an equal chance of being selected as a participant
2. Random Assignment - each participant in the experiment is randomly assigned to experimental treatments
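The difference between random selection and random assignment, as defined under RANDOMIZATION, can be sketched in a few lines. This is only an illustration: the participant IDs, the group names, and the helper names `random_selection` and `random_assignment` are made up for the example.

```python
import random


def random_selection(population, k, seed=None):
    """Draw k participants; every member has an equal chance of being picked."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def random_assignment(participants, conditions, seed=None):
    """Shuffle the participants, then deal them one by one into the conditions."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups


# Hypothetical population of 20 people labelled P01..P20
population = [f"P{i:02d}" for i in range(1, 21)]

sample = random_selection(population, k=8, seed=1)      # who takes part
groups = random_assignment(sample, ["treatment", "control"], seed=2)  # who gets what
print(groups)
```

Selection decides who enters the study; assignment decides which treatment each selected participant receives. Fixing the seed only makes the example repeatable.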


RANDOMIZATION IN REPEATED-MEASURES DESIGN

1. Practice effects - participants perform differently in the second condition because of familiarity with the experimental situation
2. Boredom effects - participants may perform differently in the second condition because they are tired or bored from having completed the first condition

Counterbalancing – is usually thought of as a method for controlling order effects in a repeated measures design

ANALYZING DATA

1. Descriptive Statistics
2. Frequency Distribution / Histogram
- a graph plotting values of observations on the horizontal axis, with a bar showing how many times each value occurred in the data set
3. Stem and Leaf Plots
- are similar to histograms but the frequency of occurrence of a particular score is represented by repeatedly writing the particular score itself rather than drawing a bar on a chart
4. Box and Whisker Plots
- enable us to easily identify extreme scores as well as seeing how the scores in a sample are distributed
5. Scattergram
- gives a graphical representation of the relationship between two variables
6. Normal Distribution
- a symmetric distribution where most of the observations cluster around the central peak
- two ways a distribution can deviate from normal:
a) lack of symmetry (skewness: positively and negatively skewed)
b) pointiness (kurtosis: leptokurtic + and platykurtic -)

CENTRAL TENDENCY

- the center of the distribution of scores

a. The Mode – the score that occurs most frequently in the data set (bimodal; multimodal)
b. The Median - the middle score
c. The Mean - the sum of all scores divided by the number of scores

EXAMPLE: 13, 18, 13, 14, 13, 16, 14, 21, 13

find the mean:

(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 135 ÷ 9 = 15

find the median:

= 13, 13, 13, 13, 14, 14, 16, 18, 21 (sorted)
= (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number
= 13, 13, 13, 13, 14, 14, 16, 18, 21 (the 5th number is 14, so the median is 14)

find the mode:

= most repeated number: 13

DISPERSION

- is the extent to which a distribution is stretched or squeezed

a) Quartiles - three values that split the sorted data into four equal parts


b) Lower Quartile - median of the lower half of the data
c) Upper Quartile - median of the upper half of the data
d) Interquartile Range - difference between the upper and lower quartile

e.g.
1) 5, 7, 4, 4, 6, 2, 8
= 2, 4, 4, 5, 6, 7, 8 (sorted)
Q1 (lower quartile) = 4
Q2 (median) = 5
Q3 (upper quartile) = 7
Interquartile range = Q3 – Q1 = 7 – 4 = 3

e. Mean Absolute Deviation (MAD) - is the average distance between each data value and the mean; helps us get a sense of how "spread out" the values in a data set are

e.g.
1) the Mean Deviation of 3, 6, 6, 7, 8, 11, 15, 16

Mean = (3 + 6 + 6 + 7 + 8 + 11 + 15 + 16) / 8 = 72 / 8 = 9

Value    Distance from 9
3        6
6        3
6        3
7        2
8        1
11       2
15       6
16       7

Mean deviation = (6 + 3 + 3 + 2 + 1 + 2 + 6 + 7) / 8 = 30 / 8 = 3.75

f. Variance - is defined as the sum of the squared deviations from the mean, divided by the number of scores (for a sample, divide by the number of scores minus 1, as in the example below).
g. Standard Deviation - results from taking the square root of the variance

e.g. Find the variance for the following set of data representing trees in California (heights in feet): 3, 21, 98, 203, 17, 9

Sum of scores = 3 + 21 + 98 + 203 + 17 + 9 = 351
(Sum of scores) squared = 351 × 351 = 123,201
(Sum of scores) squared / n = 123,201 / 6 = 20,533.5
Sum of squared scores = 3 × 3 + 21 × 21 + 98 × 98 + 203 × 203 + 17 × 17 + 9 × 9
= 9 + 441 + 9,604 + 41,209 + 289 + 81 = 51,633
Sum of squared deviations = 51,633 – 20,533.5 = 31,099.5
n – 1 = 6 – 1 = 5
Variance = 31,099.5 / 5 = 6,219.9
Standard deviation = √6,219.9 = 78.86634
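The worked examples for central tendency and dispersion can be re-checked with Python's standard `statistics` module. This is a minimal sketch, not part of the original notes: `quartiles` and `mean_absolute_deviation` are helper names chosen for this example, and `quartiles` assumes the "median of each half" method described above (the middle score is left out of both halves when the number of scores is odd).

```python
from statistics import mean, median, mode, stdev, variance


def quartiles(data):
    """Return (Q1, Q2, Q3) as medians of the lower half, all data, upper half."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Skip the middle score when n is odd so it belongs to neither half.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


def mean_absolute_deviation(data):
    """Average distance of each value from the mean."""
    centre = mean(data)
    return sum(abs(x - centre) for x in data) / len(data)


# Central tendency example
scores = [13, 18, 13, 14, 13, 16, 14, 21, 13]
print(mean(scores), median(scores), mode(scores))

# Quartile example
q1, q2, q3 = quartiles([5, 7, 4, 4, 6, 2, 8])
print(q1, q2, q3, "IQR =", q3 - q1)

# Mean absolute deviation example
mad = mean_absolute_deviation([3, 6, 6, 7, 8, 11, 15, 16])
print(mad)

# Tree heights: variance() and stdev() divide by n - 1 (sample formulas)
trees = [3, 21, 98, 203, 17, 9]
tree_variance = variance(trees)
tree_sd = stdev(trees)
print(round(tree_variance, 1), round(tree_sd, 5))
```

`statistics.variance` and `statistics.stdev` use the n – 1 divisor, matching the tree example; `pvariance` and `pstdev` would divide by n instead.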


CHAPTER 2 – PROBABILITY

Probability - likelihood of a particular event of interest occurring

Conditional Probability

- the probability of a particular event happening if another event (or set of conditions) has also happened.

Sampling

Population - a complete set of measurements (or individuals or objects) having some common observable characteristic

Sample - is a subset of a population

TYPES OF SAMPLING

a. Probability Sampling - subjects of the population get an equal opportunity to be selected as a representative sample.
b. Non-probability Sampling - it is not known which individual from the population will be selected as a sample.

TYPES OF NON-PROBABILITY SAMPLING

a. Judgmental Sampling - sample members are chosen only on the basis of the researcher’s knowledge and judgment.
b. Convenience Sampling - subjects are selected because of their proximity to the researcher.
c. Snowball Sampling - used by researchers to identify potential subjects in studies where subjects are hard to locate.
d. Quota Sampling - sample group represents certain characteristics of the population chosen by the researcher.

TYPES OF PROBABILITY SAMPLING

a. Simple Random Sampling - a completely random method of selecting subjects. e.g. Fish Bowl Technique
b. Systematic Sampling - you choose every “nth” participant from a complete list.
c. Stratified Random Sampling - involves splitting subjects into mutually exclusive groups and then sampling randomly from each group; similar to quota sampling
d. Cluster Sampling - a random sample of naturally occurring clusters is selected; there is no need to form groups because they already exist. e.g. Rainbow Village > Red Street

*Homogeneity – similarity
*Heterogeneity – difference
*Homogeneity – within groups (stratified sampling) and between them (cluster sampling)
*Heterogeneity – between groups (stratified sampling) and within them (cluster sampling)

DEGREES OF FREEDOM

e.g. 3 combinations of numbers that will have a mean of 3

4, 2, 3 = 9/3 = 3


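2, 2, 5 = 9/3 = 3

1, 3, 5 = 9/3 = 3

Once two of the three numbers are chosen, the third is fixed by the mean, so the degrees of freedom are 3 – 1 = 2.

• Used in the Standard Deviation formula (the n – 1 in the denominator)

Z-SCORE

- is expressed in standard deviation units
- tells us how many standard deviations above or below the mean our score is.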
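The z-score idea (how many standard deviations a score lies above or below the mean) can be sketched as follows. The test names, means, standard deviations, and scores are hypothetical, and `z_score` and `proportion_below` are helper names chosen for this example; `statistics.NormalDist` supplies the standard normal curve.

```python
from statistics import NormalDist


def z_score(score, mean, sd):
    """Number of standard deviations the score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


def proportion_below(z):
    """Proportion of a normally distributed population scoring below this z-score."""
    return NormalDist().cdf(z)


# Hypothetical: 65 on a test with mean 50 and SD 10, and 82 on a test with mean 70 and SD 12
z_test_a = z_score(65, mean=50, sd=10)
z_test_b = z_score(82, mean=70, sd=12)

below = proportion_below(z_test_a)
above = 1 - below

print(z_test_a, z_test_b)  # the larger z-score is the relatively better performance
print(round(below, 4), round(above, 4))
```

Because both scores are converted to the same standard deviation units, they can be compared directly even though the two tests have different means and spreads.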

STANDARD NORMAL DISTRIBUTION

• is a bell curve
• has a total area that is equal to 1
• is symmetrical
• has a mean that is equal to 0
• has a standard deviation that is equal to 1

- used to compare a score to another score
- you can only compare scores by transforming them to standard normal scores

(SCORE – MEAN) / STANDARD DEVIATION = STANDARD NORMAL SCORE (z-score)

*When the z-score becomes negative, “proportion above score” will become “proportion below score”

- helps to compare scores from different samples and compare different scores from the same samples
- helps calculate the proportion of the population who would score above or below your score

*In order to use the standard normal distribution for analyzing our data, we often transform the scores in our samples to standard normal scores

SAMPLING DISTRIBUTION

• When you plot sample statistics from all of your samples as a frequency histogram, you get something called the sampling distribution
• Thus, if you plotted the sample means of many samples from one particular population, you would have plotted the sampling distribution of the mean.
• GET THE MEAN OF ALL THE SAMPLE MEANS

Central Limit Theorem

- as the size of the samples we select increases,
- the nearer to the population mean will be the mean of these sample means
- and the closer to normal will be the distribution of the sample means.

Confidence Intervals

- are interval estimates of where the population mean may lie.
- indicate the range of values that’s likely to contain the true population parameter.

TYPES OF CONFIDENCE INTERVAL

a. Point Estimate - a single figure estimate of an unknown number

b. Interval Estimate - a range within which we think the unknown number will fall

*In psychology, we usually use the 95% confidence interval, which corresponds to a z-score of ±1.96.

SAMPLING ERROR

• the difference between population mean and sample mean
• there will always be a sampling error
• answers the question: “How well does our sample mean represent the population mean?”

- Whenever we select a sample from a population, there will be some degree of uncertainty about how representative the sample actually is of the population
- the degree to which such sample statistics differ from the equivalent population parameter
- the difference between the population parameter and sample statistic
- As you increase your sample size, you decrease the degree of sampling error.

STANDARD ERROR - a statistical term that measures the accuracy with which a sample represents a population.

STANDARD ERROR OF THE MEAN

- the measure of accuracy/precision.
- In statistics, a sample mean deviates from the actual mean of a population; this deviation is the standard error.
- once we are able to calculate the standard error, we can use this information to find out how good an estimate our sample mean is of the population mean

! POINTERS !

• The larger the sample size, the lower the sampling error.
• Large samples tend to give means that are better estimates of population means
• The measure of the degree of variation around the mean is the standard deviation.
• The standard deviation of sample means is called the standard error
• Thus, for large sample sizes the standard error will tend to be less than that for small sample sizes.

ERROR BAR CHARTS

- is a graphical representation of confidence intervals around the mean.
- An extremely useful means of presenting confidence intervals and standard error in your research reports is to generate error bar charts.
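The pointers on sample size and standard error can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is a made-up set of 10,000 normally distributed scores (mean 100, SD 15), the helper names are invented for the example, and the confidence interval uses the normal (z) approximation rather than a t-distribution.

```python
import random
from statistics import NormalDist, mean, stdev


def standard_error(sample):
    """Estimated standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(sample) / len(sample) ** 0.5


def confidence_interval(sample, level=0.95):
    """Normal-approximation interval around the sample mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    centre, se = mean(sample), standard_error(sample)
    return centre - z * se, centre + z * se


def sd_of_sample_means(population, n, draws, rng):
    """Draw many samples of size n and return the SD of their means."""
    means = [mean(rng.sample(population, n)) for _ in range(draws)]
    return stdev(means)


rng = random.Random(42)
population = [rng.gauss(100, 15) for _ in range(10_000)]  # hypothetical scores

# The spread of the sample means (the standard error) shrinks as n grows.
spread_small_n = sd_of_sample_means(population, n=5, draws=2000, rng=rng)
spread_large_n = sd_of_sample_means(population, n=50, draws=2000, rng=rng)
print(spread_small_n, spread_large_n)

# One sample: point estimate (the mean) and interval estimate (the 95% CI).
sample = rng.sample(population, 50)
low, high = confidence_interval(sample)
print(mean(sample), (low, high))
```

The two ends of the interval are what an error bar chart would draw above and below each mean.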


CHAPTER 4 – HYPOTHESIS TESTING AND STATISTICAL SIGNIFICANCE

Hypothesis

- A prediction of how our variables might be related to each other or how they may affect each other.

Inferential Statistics

- This is where we are able to conclude whether our variables are related to one another or not, or whether our variables affect one another.
- Keyword: CONCLUSION

Probability Value

- P-value
- The likelihood of us obtaining our pattern of results due to sampling error if there is no relationship between our variables

Null Hypothesis

- Always states that there is no effect in the underlying population

Research Hypothesis

- is the prediction of how two variables might be related to each other.

Logic of Null Hypothesis Testing

• to formulate a hypothesis
• to measure the variables involved and examine the relationship between them

P-value

- determines whether there is evidence to reject the null hypothesis

*95% - probability of retaining the null hypothesis when it is true (1 – α)
*5% - statistical significance level / α

THE SIGNIFICANCE LEVEL

• Alpha (α)
- the criterion for statistical significance that we set for our analyses.
- if the probability of… is less than 5%, then the findings are said to be significant.
- if the probability of… is greater than 5%, then the findings are said to be non-significant.

THE CORRECT INTERPRETATION OF P-VALUE
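The p-value as defined in these notes (the likelihood of obtaining our pattern of results through sampling error alone when there is no real relationship) can be made concrete with an exact permutation test. This technique is brought in purely for illustration and is not a method from the notes; `permutation_p_value` is a name chosen for this example, and the two groups reuse the Team A and Team B scores from Chapter 1 only as sample numbers.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Share of all possible regroupings whose mean difference is at least
    as large as the one actually observed (two-sided)."""
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    observed = abs(sum(group_a) / size_a - sum(group_b) / len(group_b))

    extreme = 0
    total = 0
    for picked in combinations(range(len(pooled)), size_a):
        picked_set = set(picked)
        a = [pooled[i] for i in picked]
        b = [pooled[i] for i in range(len(pooled)) if i not in picked_set]
        difference = abs(sum(a) / len(a) - sum(b) / len(b))
        if difference >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        total += 1
    return extreme / total


team_a = [11, 13, 15, 17, 19]
team_b = [4, 8, 12, 16, 20]

alpha = 0.05
p_value = permutation_p_value(team_a, team_b)
print(p_value, "significant" if p_value < alpha else "non-significant")
```

If the group labels made no difference (the null hypothesis), every regrouping of the ten scores would be equally likely, so the fraction of regroupings that look at least as extreme as the real data is the p-value. It is then compared with alpha to decide whether to reject the null hypothesis.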


PROBABILITY OF ERRORS

Type I error

- is where you decide to reject the null hypothesis when it is in fact true in the underlying population.
- REJECTING THE NULL HYPOTHESIS WHEN IT IS TRUE
- Alpha ( α )

Type II error

- is where you do not reject the null hypothesis when in fact you should because in the underlying population the null hypothesis is not true.
- FAILING TO REJECT THE NULL HYPOTHESIS WHEN IT IS FALSE
- Beta ( β )

α VS β

α – null is true, but was rejected

β – null is false, but was not rejected

WHY SET ALPHA AT 0.05?

• If alpha > .05, then we are more prone to type I errors
• If alpha < .05, then we are more prone to type II errors

ONE-TAILED AND TWO-TAILED TESTS
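The trade-off behind the choice of alpha (a larger alpha invites Type I errors, a smaller alpha invites Type II errors) can be seen in a simulation using a two-tailed z-test. Everything here is an assumption made for the sketch: the population SD is taken as known and equal to 1, samples have 25 scores, and the "real effect" is a hypothetical true mean of 0.4.

```python
import random
from statistics import NormalDist, mean

STANDARD_NORMAL = NormalDist()


def two_tailed_p(sample, null_mean=0.0, sigma=1.0):
    """Two-tailed p-value of a z-test with a known population SD."""
    z = (mean(sample) - null_mean) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


def simulate_p_values(true_mean, n, trials, rng):
    """Run the same study many times and collect the p-value of each run."""
    return [
        two_tailed_p([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(trials)
    ]


rng = random.Random(0)
null_true_p = simulate_p_values(true_mean=0.0, n=25, trials=2000, rng=rng)
null_false_p = simulate_p_values(true_mean=0.4, n=25, trials=2000, rng=rng)


def error_rates(alpha):
    """Return (Type I rate, Type II rate) for a given significance level."""
    type_1 = sum(p < alpha for p in null_true_p) / len(null_true_p)     # rejected a true null
    type_2 = sum(p >= alpha for p in null_false_p) / len(null_false_p)  # kept a false null
    return type_1, type_2


for alpha in (0.01, 0.05, 0.10):
    print(alpha, error_rates(alpha))
```

When the null hypothesis is true, the Type I error rate sits close to whatever alpha is chosen; raising alpha lowers the Type II rate at the cost of more Type I errors, and lowering it does the opposite.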


PARAMETRIC TESTS

- use/consider the parameters (sample mean, population mean, etc.) of the population
• data must be on an interval or ratio scale (not nominal or ordinal)
• data must be normally distributed
• homogeneity of variance
• there should be no extreme scores/outliers

NON-PARAMETRIC TESTS

- opposite of parametric tests: they do not assume normally distributed data and can be used with ordinal data
