
CHAPTER 11: WITHIN-SUBJECTS DESIGN
- Repeated measures design (each subject serves in more than one condition of the experiment and is measured on the DV after each tx)
- Compare subjects' performance on the DV across conditions to determine whether there is a tx effect

INDEPENDENT VARIABLES:
- Presumed or possible cause
- Can be controlled
- Can be changed

DEPENDENT VARIABLES:
- Presumed effect or result
- Can be observed and measured

POWER: ability of the experiment to detect the IV's effect on the DV

• WITHIN-SUBJECT FACTORIAL DESIGN
- With 2 or more independent variables

• MIXED DESIGN
- 1 between-subjects variable and
- 1 within-subjects variable

ORDER EFFECTS
- Subject's response differs due to the position, order, or series of tx
- PRACTICE EFFECTS (+)
o Lead to IMPROVEMENT as the experiment goes on; subjects get better
- FATIGUE EFFECTS (-)
o Cause DECLINE in performance as the experiment goes on; subjects get tired
** BOTH changes = PROGRESSIVE ERROR
- Includes any changes in the subject's responses caused by testing in multiple tx conditions
- As the experiment progresses, results are distorted
- Ex: Cola experiment

CONTROLLING FOR ORDER EFFECTS
1. COUNTERBALANCING
- Function: distribute progressive error across the different tx conditions of the experiment
- Ensures that order effects that alter results in one condition will be offset, or counterbalanced, over the experiment
- Guarantees order effects won't be the reason for changes in the DV

STRATEGIES:

A. SUBJECT-BY-SUBJECT COUNTERBALANCING
- Technique for controlling progressive error for each individual subject by presenting all tx conditions more than once

2 techniques:

§ REVERSE COUNTERBALANCING
- Present all tx conditions TWICE
o First in one order, then in reverse order
- Makes progressive error the same for each condition
Ex:
New cola = brand A
Old cola = brand B
= give subjects cola in the order ABBA

§ BLOCK RANDOMIZATION
- Each tx must be presented several times
- BDCA ● DBAC ● ACDB ● CABD ● BADC
- Typically used when tx conditions are short

B. ACROSS-SUBJECTS COUNTERBALANCING
- Used to distribute the effects of progressive error so that if we average across subjects, the effects will be the same for all conditions in the experiment.

2 techniques:

§ COMPLETE COUNTERBALANCING
- Controls for progressive error by using ALL possible sequences of the conditions and using every sequence the same number of times; gets harder when more conditions are added

§ PARTIAL COUNTERBALANCING
- Controls progressive error by using SOME subset of the available order sequences; these sequences are chosen through special procedures.

- RANDOMIZED COUNTERBALANCING:
o Simplest partial counterbalancing
o Randomly select as many sequences as there are subjects for the experiment

Ex.: 120 possible sequences (5 treatment conditions) and only 30 subjects; randomly select 30 sequences and assign them randomly to the subjects.
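
The counterbalancing sequences described in these notes can be generated mechanically. The following is a minimal Python sketch, assuming treatments are labeled with single letters; the function names and the seed are illustrative choices, and drawing the 30 sequences without repeats is an assumption about what "randomly select" means here.

```python
import itertools
import math
import random


def reverse_counterbalance(conditions):
    """Subject-by-subject: present every condition twice, forward then reversed."""
    return list(conditions) + list(reversed(conditions))


def complete_counterbalance(conditions):
    """Across-subjects: every possible ordering of the conditions."""
    return list(itertools.permutations(conditions))


def randomized_counterbalance(conditions, n_subjects, seed=None):
    """Partial: randomly pick one distinct sequence per subject (assumed: no repeats)."""
    all_sequences = complete_counterbalance(conditions)
    rng = random.Random(seed)
    return rng.sample(all_sequences, n_subjects)


print("".join(reverse_counterbalance("AB")))      # ABBA
all_orders = complete_counterbalance("ABCDE")
print(len(all_orders), math.factorial(5))          # 120 120
chosen = randomized_counterbalance("ABCDE", n_subjects=30, seed=1)
print(len(chosen), len(set(chosen)))               # 30 30
```

The count of possible sequences is the factorial of the number of conditions, which is why complete counterbalancing gets harder as conditions are added.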
- LATIN SQUARE COUNTERBALANCING
o Uses a square or matrix where each tx appears only ONCE in any order position in the sequences
- Provides protection against order effects but cannot control for other systematic interference between 2 treatment conditions (ABCD, BADC, CDAB, DCBA -> A comes immediately before B twice)

Counterbalance for each subject when expecting large differences in the pattern of progressive error from subject to subject (for example, in a weight training experiment, subjects will fatigue at different rates).

📌 Counterbalance across subjects when those differences won't be large.

CARRYOVER EFFECTS
- Effects of some tx will persist (carry over) after the txs are removed

Ex: Identifying three scents: lilac, gasoline, and perfume. After smelling gasoline, the subject cannot correctly identify the perfume because the smell of gas is still there.

ORDER EFFECT vs CARRYOVER EFFECT
- ORDER EFFECT: emerges as a result of the position of a tx in a sequence
- CARRYOVER EFFECT: a function of the tx itself

Order as a design factor

You can include treatment order as an additional factor in the design if you are concerned that a partial counterbalancing technique might not be controlling adequately for progressive error or carryover effects.

For example, with only two treatment conditions, half the subjects receive the sequence AB, and the others receive BA (2x2 mixed-factorial design).

Treatment order is always a between-subjects factor.

Subject-by-subject counterbalancing and complete counterbalancing will usually control carryover effects adequately by balancing them over the entire experiment.

o Balanced Latin squares: each treatment condition appears only once in each position in the order sequences and precedes and follows every other condition an equal number of times

A B D C
B C A D
C D B A
D A C B
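
Both Latin square properties can be checked by counting. The sketch below is a minimal Python illustration with hypothetical helper names; it assumes "precedes and follows" refers to immediately adjacent positions, which is the usual reading for carryover control. It compares the plain square from the Latin square notes (ABCD, BADC, CDAB, DCBA) with the balanced square above.

```python
from collections import Counter


def is_latin_square(rows):
    """True if each treatment appears exactly once in every sequence and position."""
    treatments = set(rows[0])
    rows_ok = all(set(row) == treatments and len(row) == len(treatments) for row in rows)
    positions_ok = all(set(col) == treatments for col in zip(*rows))
    return rows_ok and positions_ok and len(rows) == len(treatments)


def immediate_pairs(rows):
    """Count how often each treatment immediately precedes each other treatment."""
    return Counter((row[i], row[i + 1]) for row in rows for i in range(len(row) - 1))


def is_balanced(rows):
    """True if it is a Latin square and every ordered pair occurs equally often."""
    k = len(set(rows[0]))
    counts = immediate_pairs(rows)
    all_pairs_present = len(counts) == k * (k - 1)
    return is_latin_square(rows) and all_pairs_present and len(set(counts.values())) == 1


plain = ["ABCD", "BADC", "CDAB", "DCBA"]
balanced = ["ABDC", "BCAD", "CDBA", "DACB"]

print(is_latin_square(plain), is_balanced(plain))        # True False
print(immediate_pairs(plain)[("A", "B")])                # 2
print(is_latin_square(balanced), is_balanced(balanced))  # True True
```

In the plain square A immediately precedes B twice and never immediately precedes C, so any carryover from A is not spread evenly over the other conditions.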
Choosing among counterbalancing procedures

Every experiment with a within-subjects condition will need some form of counterbalancing.

• 1 IV or factorial design -> counterbalance all conditions.
• Within-subjects factorial -> multiply the levels of each factor together to get the total number of conditions.

For example, a 4 x 2 within-subjects design has 8 conditions to be counterbalanced.

• In a mixed design, only the within-subjects factors have to be counterbalanced.
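
The 4 x 2 arithmetic can be spelled out by enumerating the conditions. This is a small Python sketch; the level names are hypothetical placeholders, not taken from any particular experiment.

```python
import itertools
import math

# Hypothetical level labels for a 4 x 2 within-subjects factorial design.
factor_a = ["a1", "a2", "a3", "a4"]
factor_b = ["b1", "b2"]

# Each condition is one combination of a level of A with a level of B.
conditions = list(itertools.product(factor_a, factor_b))
print(len(conditions))                  # 8

# Number of sequences that complete counterbalancing would require.
print(math.factorial(len(conditions)))  # 40320
```

With 8 conditions, complete counterbalancing would need 40,320 sequences, which is why a partial procedure is the practical choice for designs of this size.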
CHAPTER 12: WITHIN-SUBJECTS DESIGN: SMALL N

N: stands for the # of subjects needed in an experiment

LARGE N DESIGNS
- compare the performance of groups of subjects.
- most commonly used research design

SMALL N DESIGNS
- test only one or very few subjects
- behavior is studied more intensely in this design
- uses:
o labs and field studies
o human and animal behavior
o practical reasons (e.g., clinical psychopathology and psychophysics: how we sense and perceive stimuli)
o most extensively used in operant conditioning experimentation
§ B.F. SKINNER: studied positive and negative reinforcement
§ He believed it is better to use careful, cautious measurements rather than statistical tests

BASELINE
- Measure of behavior as it normally occurs without experimental manipulation; the first phase of a small N design

Ex. we want to examine the notion that talking to a plant makes it grow better.

We begin with the control condition of the experiment (condition A).

For three months, we do not talk to the plant and just measure it every Monday; this establishes the baseline.

In the second phase of the experiment, we introduce the experimental manipulation (condition B): talking to the plant for 2 hours each day and measuring it every Monday for three months.

ABA DESIGNS
- Refers to the order of the conditions of the experiment
- A (baseline condition) is followed by B (experimental condition), then returns back to A
- Uses:
o May be used only if the tx conditions are reversible
o Can also be used in large N designs
- AKA reversal designs

VARIATIONS OF ABA DESIGN

ABABA
- The treatment is reintroduced, followed by another return to baseline.

ABACADA
- B, C, and D represent 3 different treatment conditions. Used to extend the conditions in a small N experiment.

A-B-BC-A-B-BC
- It is used to separate the effects of each component of a compound independent variable. The components are presented individually and in combinations to determine their effects. Any number of components can be examined.

AB
- No return to baseline. Sometimes researchers sacrifice some precision for ethical reasons.

MULTIPLE BASELINE DESIGN
- A series of baselines and treatments are compared within the same person, but once a treatment is established, it is not withdrawn
- Uses:
o used when it is not desirable to reverse treatment conditions.
o used when the researcher wants to test a treatment across multiple settings, or when the researcher wants to assess the effects of a treatment on several behaviors.

STATISTICS
- not usually used in small N designs
- ROLE: to infer things about the population from sample data, because making generalizations based on one subject isn't reasonable.

CHANGING-CRITERION DESIGN
- An experimental approach in which behavior will be modified in increments and the criterion for success will intentionally be changed as the behavior is modified.
- Used when the behavior being modified cannot be changed all at once; DV would be your criterion; reward changes with criterion.

DISCRETE TRIAL DESIGNS
- Do not rely on baselines
- Instead, they rely on presenting and averaging across many applications of different treatment conditions and comparing performance on the DV across the treatment conditions.
- Repeated presentation over many trials can give a reliable picture of the effects of the IV.

ADVANTAGES OF SMALL N DESIGNS:

o Useful when you are studying a clinical subject (ex: a disturbed child).
o Useful when very few subjects are available.
o You get a more accurate picture of results or effects (because you measure the effects multiple times and observe them more closely).

DISADVANTAGES OF SMALL N DESIGNS:

o Low in external validity. It may be hard to generalize to the entire population based on the results of a few subjects.
o History threats are a problem with small N designs, so it is important to replicate findings before generalizing them.

WHEN WOULD WE PREFER A LARGE N DESIGN?
• A large N design would be desirable when we have sufficient subjects and want to increase generalizability.
• The generalizability of a large N study depends on how we select our sample, since a seriously biased sample will not represent the population.

Why doesn't a large N study always have greater generality than a small N study?

If a large N study's sample is biased, we will be unable to generalize its findings to a larger population. Also, if it is poorly controlled, there will be no valid findings to generalize.
In contrast, a well-controlled small N experiment using a single subject might be successfully replicated across sufficient subjects to generalize its results to the population from which they were drawn.
CHAPTER 13: WHY WE NEED STATISTICS?

STATISTICS
• quantitative measurements of samples
• to evaluate the data objectively, we carry out statistical tests to see if the IV PROBABLY caused changes in the DV

STATISTICAL INFERENCE
• Making a statement about the population and all its samples based on what we see in the samples we have.
• Allows us to draw conclusions about a parent population from a sample
• Answers the question, "Are the differences between groups significantly greater than what we would expect between any samples drawn from the population?"

VARIABILITY
• For a set of dependent variable measurements, there is variability when the scores are different.
• Variability "spreads out" a sample of scores drawn from a population.

Results are statistically significant when the difference between our treatment groups exceeds the normal variability of scores on the dependent variable.

STATISTICAL SIGNIFICANCE
• means that there is a treatment effect at an alpha level we have preselected, like .01 or .05.

NULL HYPOTHESIS (H0)
We don't test the research hypothesis directly. Instead, we formulate and test the null hypothesis (H0).
• states that there is no effect or relationship between variables
• performance of the tx groups is so similar that the scores could have been sampled from the same population.
• States that the results are so similar that the differences mean nothing.
• The null is assumed true until the data show it is probably wrong
• We use statistical tests to tell us whether to reject the null hypothesis or not.
• If we reject the null hypothesis, we are confirming a change between the groups that occurred as a result of the experiment: we say that our results are statistically significant.

ALTERNATIVE HYPOTHESIS (H1)
- States that the effect or relationship exists
- There is no way to directly test the alternative hypothesis (H1), the research hypothesis, which states that the data came from different populations. Therefore, we can never really prove that our research hypothesis is correct.
- The best we can do is show that it is unlikely that the pattern occurred from chance variation within the population we sampled: we can only show that the null hypothesis is probably wrong.

The more variability, the harder it will be to reject the null hypothesis.

DIRECTIONAL HYPOTHESIS
- A hypothesis that predicts the way the difference between groups will go.
Ex.: time passes quickly when you are having fun

NON-DIRECTIONAL HYPOTHESIS
- States that there will be a difference between the two groups/conditions but no direction is specified
Ex.: teacher-student relationship influences learning

NORMAL CURVE
- a symmetrical, bell-shaped curve that represents the theoretical distribution of scores in the population.
- Most scores fall close to the center

CHOOSING A SIGNIFICANCE LEVEL

SIGNIFICANCE LEVEL (α)
- A criterion for deciding whether to reject the null or not.

In psychology, we generally reject the null hypothesis if the probability of obtaining this pattern of data by chance alone is less than 5%. Then we say the significance level is p < .05.

When we choose a significance level of .05, we are saying that we will reject the null hypothesis if we get a pattern of data so unlikely that it could have occurred by chance less than 5 times out of 100.

In some experiments, we may want a stricter criterion. We could choose a significance level of p < .01.
An even stricter criterion, such as p < .001 (less than 1 in one thousand), might be chosen for some medical research or other projects in which being wrong about a treatment effect could have disastrous human consequences.

📌 To make a valid test of a hypothesis, we must think ahead and decide what the significance level will be BEFORE running the experiment.

ONE-TAILED and TWO-TAILED TESTS

ONE-TAILED TEST (DIRECTIONAL)
- Used when we have a directional hypothesis.
- The 5% critical region is not split between both sides; it lies entirely in the predicted tail.

TWO-TAILED TEST (NONDIRECTIONAL)
- Used when we have a nondirectional hypothesis.
- The 5% critical region is divided between the two tails of the distribution (i.e., 2.5% in each tail).

TYPE 1 and TYPE 2 ERROR (DECISION ERROR)
TYPE 1
- "False positives"
- Reject the null hypothesis when the null is true
- Probability represented by α (alpha)
TYPE 2
- "False negatives"
- Fail to reject a false null hypothesis
- Probability represented by β (beta)
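
The link between the significance level and the Type 1 error rate can be seen in a small simulation. This is a sketch under stated assumptions, not a procedure from the notes: it uses a two-tailed z test with a known population SD of 1, and the sample size, number of simulated experiments, and seed are arbitrary choices.

```python
import random
from statistics import NormalDist


def false_positive_rate(alpha=0.05, n=25, n_experiments=2000, seed=0):
    """Share of simulated experiments that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical z
    rejections = 0
    for _ in range(n_experiments):
        # Every sample comes from the SAME population (mean 0, SD 1), so H0 is true.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5  # sample mean divided by its standard error
        if abs(z) > cutoff:
            rejections += 1  # rejecting a true null: a Type 1 error
    return rejections / n_experiments


rate = false_positive_rate()
print(rate)  # should land near alpha
```

Because the null hypothesis is true in every simulated experiment, each rejection is a false positive, and the long-run share of rejections should sit close to the chosen alpha.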

CRITICAL REGIONS
- A.K.A. rejection region
- A set of values for the test statistic for which the null hypothesis is rejected. If the observed test statistic is in the critical region then we reject the null hypothesis and accept the alternative hypothesis.

TEST STATISTICS
- Inferential statistics are statistics that can be used as indicators of what is going on in the population.
- They can be used to evaluate results.
- A test statistic is a numerical summary of what is going on in our data.
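
The critical regions for one-tailed and two-tailed tests can be made concrete with numbers. The sketch below assumes the test statistic is a z score that follows the standard normal curve when the null hypothesis is true; the function name `reject_null` is hypothetical, and the one-tailed branch assumes the predicted direction is positive.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal: mean 0, standard deviation 1

one_tailed_cutoff = z.inv_cdf(1 - alpha)      # all 5% in one tail
two_tailed_cutoff = z.inv_cdf(1 - alpha / 2)  # 2.5% in each tail

print(round(one_tailed_cutoff, 3))  # 1.645
print(round(two_tailed_cutoff, 3))  # 1.96


def reject_null(z_observed, alpha=0.05, two_tailed=True):
    """True if the observed z falls in the critical (rejection) region."""
    if two_tailed:
        return abs(z_observed) > NormalDist().inv_cdf(1 - alpha / 2)
    # One-tailed: assumes the hypothesis predicts a positive difference.
    return z_observed > NormalDist().inv_cdf(1 - alpha)


print(reject_null(1.8, two_tailed=False))  # True
print(reject_null(1.8, two_tailed=True))   # False
```

The same observed value can be significant under a one-tailed test and not under a two-tailed test, which is one reason the choice has to be made before the data are collected.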

SUMMARIZING DATA: USING DESCRIPTIVE STATISTICS

RAW DATA
- The data we record as we run an
experiment

SUMMARY DATA
- Whenever we report the results of an
experiment, we report summary data
rather than raw data

DESCRIPTIVE STATISTICS
- When we have group data, we
summarize them with descriptive
statistics, shorthand ways of describing
the data.
- We summarize and describe data by
using measures of central tendency and
measures of variability.

MEASURES OF CENTRAL TENDENCY

- Summary statistics that describe what is typical of a distribution of scores.
MEAN
- Arithmetic average
- Add all scores together and divide by
the total # of scores

MEDIAN
- the score that divides the distribution in
half so that half the scores in the
distribution fall above the median, half
below.

MODE
- The score that occurs most often

Find (a) the mean, (b) the median, and (c) the mode of this set of data.
5, 6, 2, 4, 7, 8, 3, 5, 6, 6

(a) The mean is (5+6+2+4+7+8+3+5+6+6)/10 = 52/10 = 5.2

(b) Median: place all the numbers in order: 2, 3, 4, 5, 5, 6, 6, 6, 7, 8
With 10 scores, the median is the average of the two middle scores: (5+6)/2 = 11/2 = 5.5

(c) 6 appears more often than any other number, so the mode = 6
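
The worked answers can be checked with Python's standard `statistics` module. This is only a verification sketch using the ten scores from the exercise.

```python
from statistics import mean, median, mode

scores = [5, 6, 2, 4, 7, 8, 3, 5, 6, 6]

print(mean(scores))    # 5.2
print(median(scores))  # 5.5
print(mode(scores))    # 6
```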

MEASURES OF VARIABILITY

RANGE
- The simplest measure of variability. It is
the difference between the largest and
smallest scores in a set of data.

VARIANCE (s²)
- The average squared deviation of scores from their mean. The variance tells us something about how much scores are spread out, or dispersed, around the mean of the data.

STANDARD DEVIATION (s)
- The square root of the variance. It reflects the average deviation of scores about the mean.
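
The three measures of variability can be computed for the same ten scores used in the central tendency exercise. The notes do not say whether the "average" squared deviation divides by N or by N - 1, so this sketch shows both as an assumption-free comparison: `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by N - 1.

```python
from statistics import pstdev, pvariance, stdev, variance

scores = [5, 6, 2, 4, 7, 8, 3, 5, 6, 6]

score_range = max(scores) - min(scores)
print(score_range)                  # 6

# Sum of squared deviations from the mean (5.2) is 29.6.
print(round(pvariance(scores), 2))  # 2.96  (29.6 / 10)
print(round(variance(scores), 2))   # 3.29  (29.6 / 9)

# Standard deviation is the square root of the matching variance.
print(round(pstdev(scores), 2))     # 1.72
print(round(stdev(scores), 2))      # 1.81
```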
