06 Research Methods Topic Companion Digital Download
Contents
Topic
Experimental Methods 3
Observational Techniques 8
Self-Report Techniques 15
Correlations 20
Case Studies [A-Level Only] 23
Aims, Hypotheses, IVs And DVs 25
Sampling 28
Pilot Studies and Experimental Design 34
Control, Demand Characteristics, and Investigator Effects 38
Ethical Guidelines, Peer Review and The Economy 41
Types of Data 47
Descriptive Statistics 50
Presentation and Display of Quantitative Data 57
Distributions: Normal and Skewed Distributions 59
Content Analysis [A-Level Only] 60
Features of Science [A-Level Only] 63
Reliability [A-Level Only] 66
Validity [A-Level Only] 69
Reporting Psychological Investigations [A-Level Only] 72
The Sign Test [As And A-Level] 77
Levels of Measurement [A-Level Only] 80
Probability and Significance [A-Level Only] 83
Statistical Tests [A-Level Only] 85
Appendices 90
EXPERIMENTAL METHODS
Specification: Experimental method. Types of experiment: laboratory and field
experiments; natural and quasi experiments.
WHAT YOU NEED TO KNOW
Outline and evaluate laboratory experiments.
Outline and evaluate field experiments.
Outline and evaluate natural experiments.
Outline and evaluate quasi experiments.
Introduction
Experimental methods all have one thing in common: they are attempting to find a cause and effect
relationship between an independent variable (IV) and dependent variable (DV), and to measure the
extent of this effect. There are four different types of experiment:
1. Laboratory experiment
2. Field experiment
3. Natural experiment
4. Quasi experiment
Summary of Experimental Methods
LABORATORY – Setting: controlled conditions. IV: manipulated by the researcher. DV: measured.
FIELD – Setting: natural conditions. IV: manipulated by the researcher. DV: measured.
NATURAL – Setting: natural conditions. IV: naturally occurring (e.g. unemployment or an earthquake). DV: measured.
QUASI – Setting: controlled/natural conditions. IV: a naturally occurring difference between people (e.g. gender or age). DV: measured.
Laboratory Experiments
Laboratory experiments are conducted under specified controlled conditions in which the researcher
manipulates the independent variable (IV) to measure the effect on the dependent variable (DV). The
conditions are heavily controlled in order to minimise the effect of any extraneous variables, to prevent
them from becoming a confounding variable which might adversely affect the DV. Participants will be
aware that they are taking part in an investigation due to the contrived nature of the situation, which
may feel unlike real life.
Copyright tutor2u Limited / School Network License / Photocopying Permitted www.tutor2u.net/psychology
Page 4 AQA A LEVEL Psychology topic companion: RESEARCH METHODS
Evaluating Laboratory Experiments
A strength of laboratory experiments is the high degree of control over extraneous variables which can
be achieved. A researcher is therefore able, in most cases, to prevent extraneous variables from
becoming confounding variables which negatively affect the DV. This provides a high degree of internal
validity allowing for conclusions about cause and effect to be drawn between the IV and DV.
A limitation of laboratory experiments is that they can lack external validity. The artificial nature of the
environment in which the investigation is taking place means that the study can lack ecological validity.
This means that the findings of the study cannot always be generalised to settings beyond the
laboratory as the tasks often lack mundane realism and would not be everyday life occurrences. Since
participants know they are being investigated their behaviour can also change in an unnatural manner
resulting in demand characteristics being seen.
Field Experiments
Field experiments are carried out in natural conditions, in which the researcher manipulates the
independent variable (IV) to measure the effect on the dependent variable (DV). The ‘field’ is considered
any location which is not a laboratory. Participants in a field experiment typically do not know that they are
taking part in an investigation with a view to observing more natural behaviour.
Evaluating Field Experiments
The natural setting means that field experiments often have a higher level of ecological validity, in
comparison to laboratory studies. This means that the results are more likely to be representative of
behaviour witnessed in everyday life. However, because the setting is more natural, there is less
control over extraneous variables. These can then become confounding variables and distort the
findings meaning a firm cause and effect relationship cannot be drawn since other factors could have
had an impact on the DV, other than the IV.
There are important ethical issues associated with field experiments. Since participants are often
unaware that they are in fact participants in a psychological investigation, they cannot give informed
consent to take part. As such, the research may involve a breach of their privacy rights and a cost‐
benefit analysis will need to be conducted before proceeding with any study to ensure the perceived
outcomes from the research will outweigh any personal costs to those involved.
Natural Experiments
In a natural experiment, the researcher does not manipulate the IV and instead examines the effect of an
existing IV on the dependent variable (DV). This IV is naturally occurring, such as a flood or earthquake,
and the behaviour of people affected is either compared to their own behaviour beforehand, when
possible, or with a control group who have not encountered the IV. It is important to note that it is the IV
which is natural in this type of experiment, and not necessarily the context in which the investigation is
taking place since participants could be tested in a laboratory as part of the study.
Evaluating Natural Experiments
The naturally occurring IV means that natural experiments often have a higher level of external validity
compared to laboratory and field experiments. These types of investigations are considered high in
ecological validity given the real‐life issues that are being studied rather than manipulated artificially.
However, natural experiments have no control over the environment and subsequent extraneous
variables, which means that it is difficult for the research to accurately assess the effects of the IV on
the DV. It may be that a confounding variable has affected the results so a cause and effect
relationship must be drawn with extreme caution, if at all.
OBSERVATIONAL TECHNIQUES
Specification: Observational techniques. Types of observation: naturalistic and controlled
observation; covert and overt observation; participant and non‐participant observation.
Observational design: behavioural categories; event sampling; time sampling.
WHAT YOU NEED TO KNOW
Outline and evaluate observational techniques, including:
o Covert and overt
o Participant and non‐participant
o Naturalistic and controlled
o Structured and unstructured
Outline and evaluate the factors involved in the design of observational research, including:
o Behavioural categories
o Event sampling and time sampling
Introduction to Observational Techniques
When conducting an observation, the researcher has the choice between:
Covert and overt
Participant and non‐participant
Naturalistic and controlled
Structured and unstructured
It is important to note that these techniques are not mutually exclusive: it is quite possible for an
observation to be naturalistic, unstructured, participant and covert all at the same time, as these terms
refer to different aspects of the methods.
Covert Observations
A covert observation is also known as ‘undisclosed’ observation and consists of observing people without
their knowledge; for example, using a one‐way mirror (covert non‐participant) or joining a group as a
member (covert participant). Participants may be informed of their involvement in the study after the
observation has taken place.
Evaluating Covert Observations
A strength of covert observation compared to overt observation is that investigator effects are less
likely. Since the investigator is hidden in this type of observation, there is less chance that their direct
or indirect behaviour will have an impact on the performance of the participants. As a result, there is
less chance of demand characteristics occurring, whereby the participant tries to guess the aim of the
investigation and acts accordingly, since they are unaware that they are being observed. This means
that the participants’ behaviour will be more natural and representative of their everyday behaviour.
There are ethical issues associated with the covert method of observation, inherent within its design.
As participants are not aware they are taking part in an investigation, they cannot give their informed
consent to take part.
Possible Exam Questions
1. Explain how observational research can be enhanced through the use of operationalised behavioural
categories. (2 marks)
2. Explain what is meant by ‘overt observation’. (2 marks)
3. Describe what is meant by ‘participant observation’. (2 marks)
4. Explain what is meant by ‘event sampling’ in relation to observational research in psychology. (2 marks)
5. Controlled observation techniques have been used in the Strange Situation to investigate cultural
variations in attachment. Suggest one advantage of using controlled observation in psychological
research. (2 marks)
Exam Hint: It is important for students to only express one advantage of using a controlled observational
method for this question using the ‘name and explain’ method of elaboration.
6. Briefly explain how a psychologist could improve her research by conducting observations in a
controlled environment. (4 marks)
7. Explain the difference between a participant observation and a non‐participant observation. You may
use an example to support your point. (4 marks)
Exam Hint: Note that this question is asking for a difference and not a definition of each observational
method, so the response must be tailored accordingly.
8. Identify and explain one strength and one limitation of conducting naturalistic observations. (4 marks)
9. A developmental psychologist was interested in investigating the effects of early and late adoption on
future aggressive behaviour in children. She compared the behaviour of children who had been
adopted before the age of two with children who had been adopted after the age of two. The children
were observed in their primary school playground when they were seven years old.
Suggest two operationalised behavioural categories that the developmental psychologist could use in
her observation of aggressive behaviour in children and explain how the psychologist could have
carried out this observation. (4 marks)
Exam Hint: This is a context based question so explicit reference to the scenario is required to gain full
credit. Note that behavioural categories such as ‘verbal aggression’ and ‘physical aggression’ would not
be awarded any marks as they are not operationalised. Suggestions such as ‘kicking’ or ‘swearing’ would
be creditworthy as they are specific and measurable. To answer the second component of this question,
most students will refer to the use of a tally chart for recording behaviours observed on an event
sampling basis.
SELF‐REPORT TECHNIQUES
Specification: Self‐report techniques. Questionnaires; interviews, structured and
unstructured. Questionnaire construction, including use of open and closed questions;
design of interviews.
WHAT YOU NEED TO KNOW
Outline and evaluate the use of questionnaires, including: open and closed questions.
Outline and evaluate interviews, including:
o Structured
o Unstructured
Introduction to Questionnaires
Questionnaires are a type of ‘self‐report’ technique, where participants provide information relating to
their thoughts, feelings and behaviours. They can be designed in different ways, and can comprise open
questions, closed questions or a mixture of both.
Open Questions
Open questions allow participants to answer however they wish, and thus generate qualitative data since
there is no fixed number of responses to select from. Responses to these types of questions provide rich
and detailed data which can provide insight into the unique human condition.
Evaluation of Open Question Questionnaires
A strength of using open questions is that there is less chance of researcher bias. This is especially true
if the questionnaire is also anonymous, since the participant can answer the questions in their own
words, without input from the researcher providing a set number of responses. Consequently, there is
less chance of the responses being influenced by the researcher’s expectations.
However, there are limitations of using questionnaires in psychological research. Participants may
answer in a socially desirable way, where they try to portray themselves in the best possible light to
the researcher. This means that the open response may lack validity as it is not their natural response.
Closed Questions
Closed questions restrict the participant to a predetermined set of responses and generate quantitative
data. There are different types of closed questions, including: checklist, Likert response scale and
ranking scale.
Checklist: This is a type of question where participants tick the answer(s) that apply to them. For
example: What is the highest academic qualification you hold?
CLOSED QUESTIONS – Strengths: easy to analyse quantitative data, discover trends and replicate
research. Limitations: the predetermined list of responses limits the ability to explore interesting
answers; risk of response bias.
Introduction to Interviews
Interviews are another type of self‐report technique which predominantly take place on a face‐to‐face
basis, although they can also happen over the telephone.
There are three different interview designs: interviews can take the form of participants just answering a
predetermined list of questions (structured interviews); they can be more like a relaxed conversation
between friends (unstructured interviews); and, although it is not on the specification, it is also important
to recognise that many fall between the two (semi‐structured interviews).
Responses are usually recorded, with the use of an interview schedule that the interviewer completes
and/or audio or video recording, with the informed consent of the interviewee(s).
Structured Interviews
Structured interviews have the questions decided on in advance and they are asked in exactly the same
order for each interviewee taking part. The interviewer uses an interview schedule and will often record
the answers to each question by taking notes/ticking boxes on their schedule.
Evaluation of Structured Interviews
An advantage of using structured interviews in psychological research is that the quantitative
(numerical) data is easier to statistically analyse. This is useful because direct comparisons can be made
between groups of individuals meaning that the researcher can look for patterns and trends in the
data. Additionally, because the questions are standardised and asked in the same sequence every time
to all participants, the interview is easily replicable to test for reliability.
There are disadvantages of using the structured interview method. It is possible that over the course of
running several interviews following the same schedule with different participants, that investigator
effects may play a role. This is where the interviewer may, unconsciously, bias any responses given to
the questions they ask by their tone of voice, intonations, body language and so on. Likewise,
investigator effects can also occur between researchers where there is more than one researcher
conducting the interviews.
Unstructured Interviews
Unstructured interviews are conducted more like a conversation, with the interviewer only facilitating the
discussion rather than asking set questions. Very little is decided in advance (only the topic and questions
needed to identify the interviewee). Therefore, this type of interview typically produces a large amount of
rich qualitative data. Answers will usually be audio or video recorded, as to write them all down as quickly
as they were spoken would be impossible for the interviewer, and would also spoil the relaxed atmosphere
of the unstructured interview.
Evaluation of Unstructured Interviews
The use of unstructured interviews can increase the validity of findings by significantly reducing the
possibility of investigator effects. The open question schedule in unstructured interviews means that
the investigator does not control the direction of the conversation to meet their own preconceived
agenda. Participants can justify their answers in their own words with opinions rather than trying to
guess the aim of the study through any clues given. This is useful because it reduces the possibility of
participants displaying demand characteristics in their interview responses.
Unstructured interviews generate large quantities of rich and interesting qualitative data. This allows
the interviewer to clarify the meaning of responses and gain further information from the participant if
required.
CORRELATIONS
Specification: Correlations. Analysis of the relationship between co‐variables. The
difference between correlations and experiments. Analysis and interpretation of
correlation, including correlation coefficients.
WHAT YOU NEED TO KNOW
Outline and evaluate correlational techniques.
Outline the difference between correlations and experiments.
Interpret correlation coefficients.
Introduction to Correlational Techniques
Correlational techniques are non‐experimental methods used to measure how strong the relationship is
between two (or more) variables. In an experiment, the effect of an independent variable upon the
dependent variable is measured; however, in correlational studies the movement and direction of co‐
variables in response to each other is measured. There is no claim of a cause and effect relationship,
although after a correlational study has been conducted, further research will often be conducted to
determine if one variable is in fact affecting the other.
A real‐world example of this is seen with cigarette‐smoking and lung cancer: first it was noticed that there
was a positive correlation between the number of cigarettes smoked and the likelihood of developing lung
cancer. Later, this research was extended and a cause and effect relationship was discovered between
cigarette‐smoking and lung cancer.
There are different types of correlation:
Positive correlation: As one variable increases the other variable increases. For example – height and
shoe size.
Negative correlation: As one variable increases the other variable decreases. For example – the GCSE
grades of students and the amount of time they are absent from school.
Zero correlation: occurs when a correlational study finds no relationship between variables. For
example – the amount of rainfall in Wales and the number of people who have read the Lord of the
Rings trilogy.
Correlation Coefficient
A correlation coefficient is used to measure the strength and nature (positive or negative) of the
relationship between two co‐variables. The correlation coefficient number represents the strength of the
relationship and can range between ‐1.0 and +1.0. The nearer the number is to +1 or ‐1 the stronger the
correlation. A perfect positive correlation has a correlation coefficient of +1 and for a perfect negative
correlation it is ‐1.
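The arithmetic behind a correlation coefficient can be sketched in a few lines of Python. This is a minimal, hand-rolled version of Pearson's r; the height and shoe-size figures are invented purely for illustration of the positive-correlation example above.

```python
# A worked sketch of the correlation coefficient (Pearson's r), computed by
# hand. The height/shoe-size figures below are invented for illustration.

def pearson_r(xs, ys):
    """Return the correlation coefficient, ranging between -1.0 and +1.0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how far the co-variables move together around their means.
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: the combined spread of each co-variable on its own.
    den = (sum((x - mean_x) ** 2 for x in xs)
           * sum((y - mean_y) ** 2 for y in ys)) ** 0.5
    return num / den

heights = [150, 160, 170, 180, 190]  # hypothetical heights (cm)
shoe_sizes = [4, 6, 7, 9, 11]        # hypothetical shoe sizes

print(round(pearson_r(heights, shoe_sizes), 2))  # 0.99 - strong positive
```

A coefficient of +1.0 would indicate a perfect positive correlation, -1.0 a perfect negative correlation, and 0 no correlation at all.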
Scattergram
A scattergram (sometimes called a scattergraph) is a graph that shows the correlation between two sets of
data (co‐variables) by plotting points to represent each pair of scores. It indicates the degree and direction
of the correlation between the co‐variables, one of which is indicated on the X‐axis and the other on the Y‐
axis.
Evaluation of Correlational Techniques
Correlational studies are an ideal place to begin preliminary research investigations. Since they
measure the strength of a relationship between two (or more) variables, this can provide valuable
insight for future research. This type of analysis can be used when a laboratory experiment would be
unethical as the variables are not manipulated, merely correlated. In addition, secondary data can also
be used in correlational studies which alleviates the concern over informed consent as the information
is already in the public domain, e.g. government reports.
There are limitations associated with using the correlational method. It is not possible to establish a
cause and effect relationship through correlating co‐variables. This means a researcher cannot
conclude that one variable caused the other variable to increase/decrease as there could be other
factors which influenced the relationship – referred to as the third variable problem. Moreover,
correlations only identify linear relationships and not curvilinear. For example, the relationship
between temperature and aggression is curvilinear, that is the relationship is positive to a point;
however, at very high temperatures aggression declines.
CORRELATIONAL TECHNIQUES – Strengths: variables are not manipulated, only correlated; measures
the strength of a relationship between variables, allowing for further research to be conducted.
Limitations: can only identify relationships, not cause and effect, through conducting a correlation; can
only identify linear relationships and not curvilinear.
Possible Exam Questions
1. A psychological study recorded the number of hours that children spent in a day care setting from birth
to three years old, and asked each child’s primary care giver to rate their child for aggression. The study
found that, as the number of hours spent in day care went up, the parents’ rating of aggression also
went up. What type of correlation is this research indicating? (1 mark)
Exam Hint: Most students will be able to identify correctly that this is in fact a positive correlation
because as one variable increases, so does the other.
2. Discuss why it might be more appropriate for a researcher to use a correlation study rather than an
experiment. (3 marks)
Exam Hint: Answers which simply state that a correlational study looks for a relationship and
experiments investigate differences will only gain one mark here. In order to access further marks,
students must be able to explain issues around manipulation of variables and ethics.
3. Outline one strength and one weakness of using correlational methods in psychological research. (4
marks)
Exam Hint: Many students can consider the main weakness of correlations but struggle to correctly
outline a strength. Strengths include the ability to study the relationship between variables that occur
naturally, or to measure things that cannot be manipulated experimentally.
CASE STUDIES [A‐LEVEL ONLY]
Specification: Case studies.
WHAT YOU NEED TO KNOW
Outline and evaluate the use of case studies in psychology.
Case Studies
The purpose of a case study is to provide a detailed analysis of an individual, establishment or real‐life
event. A case study does not refer to the way in which the research was conducted, as case studies can use
experimental or non‐experimental methods to collect data. For example, a researcher may want to
interview the participants, provide a questionnaire to their family or friends and even conduct a memory
test under controlled conditions to provide a rich and detailed overview of human behaviour.
Case studies are often used where there is a rare behaviour being investigated which does not arise often
enough to warrant a larger study being conducted. A case study allows data to be collected and analysed
on something that psychologists have very little understanding of, and can therefore be the starting point
for further, more in‐depth research.
Examples of famous case studies in psychology include: HM, Phineas Gage, Little Albert and Little Hans.
Likewise, psychologists have studied important world events such as the 9/11 terrorist attack in America
and the riots which began in London and spread throughout the UK in 2011.
Evaluation of Case Studies
There are methodological issues associated with the use of case studies. By only studying one
individual, an isolated event or a small group of people it is very difficult to generalise any findings to
the wider population since results are likely to be so unique. This therefore creates issues with external
validity as psychologists are unable to conclude with confidence that anyone beyond the ‘case’ will
behave in the same way under similar circumstances, thus lowering population validity.
An issue in case studies, particularly where qualitative methods are used, is that the researcher’s own
subjectivity may pose a problem. In the case study of Little Hans, for example, Freud developed an
entire theory based around what he observed. There was no scientific or experimental evidence to
support his suggestions from his case study. This means that a major problem with his research is that
we cannot be sure that he objectively reported his findings. Consequently, a major limitation with case
studies is that research bias and subjectivity can interfere with the validity of the findings/conclusions.
A strength of the case study approach is that it offers the opportunity to unveil rich, detailed
information about a situation. These unique insights can often be overlooked in situations where there
is only the manipulation of one variable in order to measure its effect on another. Further to this, case
studies can be used in circumstances which would not be ethical to examine experimentally. For
example, the case study of Genie (Rymer, 1993) allowed researchers to understand the long‐term
effects of a failure to form an attachment.
AIMS, HYPOTHESES, IVS & DVS
Specification: Aims: stating aims, the difference between aims and hypotheses.
Hypotheses: directional and non‐directional. Variables: manipulation and control of
variables, including independent and dependent.
WHAT YOU NEED TO KNOW
Outline the aim for a psychological investigation and differentiate between an aim and a hypothesis.
Identify the independent and dependent variables in a psychological investigation.
Outline operationalised hypotheses for psychological investigations, including:
o Directional
o Non‐directional hypotheses
Writing Aims for Investigations
Before a researcher considers the aim of the experiment, there is always a research question they are
trying to answer. For example: ‘Does hunger affect memory for food‐related words?’ Thereafter, the
researcher creates their aim: To examine the effect of hunger on memory of food‐related words.
Exam Hint: Always start the wording of an aim with ‘To examine the effect of…’
Identifying Independent and Dependent Variables
The independent and dependent variables are the vital components of any experiment. Their presence is
how you identify whether a study is following an experimental methodology or not. If there is no
independent variable or dependent variable then the study is non‐experimental.
Independent Variable (IV) – The variable that the researcher manipulates and which is assumed to
have a direct effect on the dependent variable (DV).
Dependent Variable (DV) – The variable that the research measures. The variable that is affected by
changes in the independent variable (IV).
Exam Hint: It is unlikely that you will be asked to define what is meant by the terms IV and DV. You are
more likely to be asked to identify the IV and DV within a scenario.
For example: A psychologist showed participants 50 different cards, one at a time. Each card had two
unrelated words printed on it, e.g. balloon or rabbit. Participants in one group were instructed to form a
mental image to link the words. Participants in the other group were instructed to simply memorise the
words. After all the word pairs had been presented, each participant was shown a card with the first word
of each pair printed on it and asked to recall the second word.
What is the independent variable (IV) in this study? (2 marks)
Answer: Whether participants were instructed to form a mental image to link the unrelated word pairs
or simply instructed to memorise the word pairs without a memory strategy.
What is the dependent variable (DV) in this study? (2 marks)
Answer: The number of word pairs correctly recalled by the participants in each condition.
Exam Hint: Before answering any question where you are required to identify the IV and DV, read the
extract carefully and underline the IV and DV. Once you have identified what you think the IV is, ask
yourself the following question: “Is it possible for the experimenter to manipulate this variable?”. If your
answer is ‘yes’ then this is likely to be your IV. If your answer is ‘no’ then this is unlikely to be the correct
IV.
SAMPLING
Specification: Sampling: the difference between population and sample; sampling
techniques including: random, systematic, stratified, opportunity and volunteer;
implications of sampling techniques, including bias and generalisation.
WHAT YOU NEED TO KNOW
Identify the difference between a population and a sample.
Outline and evaluate the following sampling techniques:
o Random
o Systematic
o Stratified
o Opportunity
o Volunteer
Evaluate the sampling techniques in relation to bias and generalisation.
Introduction to Sampling Techniques
Sampling involves selecting participants from a target population. The target population is the particular
subgroup to be studied, and to which the research findings will be generalised. A target population is
usually too large to study in its entirety, so sampling techniques are used to choose a representative
sample.
For example, a sample could be 20 A‐level students from a school that has 500 A‐level students in total.
Sampling Techniques
Psychologists use sampling techniques to choose people to represent the target population. If the sample
is representative then psychologists can generalise the results to the target population with more
credibility.
There are five common types of sampling:
Random
Systematic
Stratified
Opportunity
Volunteer
Random Sampling
With random sampling, every member of the target population has an equal chance of being selected.
This involves identifying everyone in the target population and then selecting the number of
participants you need in a way which gives everyone an equal chance of being selected, such as pulling
names from a hat, or using a computer software package which generates names/numbers randomly
and without bias.
If a researcher was trying to achieve a random sample of 20 from the 500 A‐level students in a school,
they would place the name of each student on roll into a hat/computer name generator and randomly
draw out 20 names.
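The "names from a hat" procedure can be sketched with Python's built-in random module; the 500 student names below are hypothetical stand-ins for a school roll.

```python
# A minimal sketch of random sampling: every member of the target population
# has an equal chance of being selected. The student names are hypothetical.
import random

target_population = [f"Student {i}" for i in range(1, 501)]  # 500 A-level students

random.seed(1)  # seeded only so the sketch repeats; a real draw would be unseeded
sample = random.sample(target_population, 20)  # 20 names "pulled from the hat"

print(len(sample))  # 20
```

`random.sample` selects without replacement, so no student can be chosen twice and each has an equal chance of inclusion.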
Systematic Sampling
In systematic sampling, every Nth member of the target population is selected. For example, every
fifth person is chosen, and the same interval is then consistently applied to the whole of the target
population, such as the 10th, 15th, 20th person and so on.
If a systematic sample of 20 of the 500 A‐level students in a school was required, a researcher would
list every student on roll against a number, perhaps in alphabetical order, and then choose every 25th
person to achieve a sample of 20 participants for their study (e.g. person 25, 50, 75, etc.)
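The every-Nth-person rule can be sketched as a simple list slice; again the roll of 500 names is hypothetical. Taking every 25th student from 500 yields exactly 20 participants.

```python
# A minimal sketch of systematic sampling: take every Nth person from an
# ordered list. The roll of 500 names is hypothetical.
target_population = [f"Student {i}" for i in range(1, 501)]  # alphabetical roll

interval = 25                                        # every 25th of 500 = 20 names
sample = target_population[interval - 1::interval]   # person 25, 50, 75, ...

print(len(sample))  # 20
```

Unlike random sampling, the selection here is fully determined once the list order and interval are fixed, which is why an unlucky ordering can still bias the sample.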
Evaluation of Systematic Sampling
An advantage of using a systematic sampling system is that it is free from researcher bias. Since the
researcher is not selecting participants by choice, but by following a predetermined system, this
reduces any potential influence that the investigator may have over obtaining the sample.
However, the systematic sampling method may not be truly unbiased. It might be that every Nth person
has a particular characteristic in common, for example being right‐handed. Although it would be fairly
unlikely and unlucky to get a sample who were all similar on a particular trait, it remains a possibility
with using this technique. Therefore, the sample generated may not be representative meaning
generalisation to the target population would be more difficult.
Stratified Sampling
In stratified sampling, subgroups within a population are identified. Participants are obtained from each
stratum (‘layer’ or category) in proportion to their occurrence within the population.
For example, if an A‐level psychology class had 20 students (18 males and 2 females) and a researcher wanted a sample of 10 to participate in their study, the sample would consist of 9 males and 1 female, to reflect the gender proportions of the class.
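The proportional calculation can be sketched as a short Python helper (the function name is our own, using the 18-male/2-female class described above):

```python
def stratified_counts(strata, sample_size, population_size):
    """How many to draw from each stratum, in proportion to its occurrence."""
    return {name: round(count / population_size * sample_size)
            for name, count in strata.items()}

# The class above: 18 males and 2 females, sample of 10 wanted
counts = stratified_counts({"male": 18, "female": 2}, 10, 20)
```

Each stratum contributes to the sample in the same proportion it holds in the population (here 90% male, 10% female).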
Summary of Sampling Methods

SYSTEMATIC
Definition: Every Nth person (e.g. 10th, 15th, 25th…) is selected from a register, phonebook, etc. A number is assigned to each person on a given list and they are then selected according to their Nth position, without any personal preference from the researcher.
Limitation: Every Nth person selected may have a similar trait in common, and therefore the sample will not be representative of the wider population.

STRATIFIED
Definition: A subgroup within the population is identified (e.g. gender or ethnic origin). Participants are obtained in proportion to their occurrence within the population.
Strength: Likely to be representative, as each subsection of the target population is proportionally represented, so results can be generalised to the wider population with more confidence that they apply.
Limitations: Difficult and time consuming to identify subgroups. People that are selected may be unwilling to take part.

OPPORTUNITY
Definition: Selecting participants who are available and willing to take part.
Strength: Quicker and easier to obtain, in comparison to other methods.
Limitation: High chance that the sample will be biased, e.g. often uses available university students, who are not representative of the target population.
PILOT STUDIES AND EXPERIMENTAL DESIGN
Specification: Pilot studies and the aims of piloting. Experimental design: repeated
measures, independent groups, matched pairs.
WHAT YOU NEED TO KNOW
Outline the aim and purpose of pilot studies and piloting.
Outline and evaluate types of experimental design:
o Repeated measures
o Independent groups
o Matched pairs
Pilot Studies
Pilot studies are small‐scale prototypes of a study that are carried out in advance of the full research to
find out if there are any problems with the following:
Experimental design – do the participants have enough time to complete the tasks?
Instructions for participants – are the instructions clear?
Measuring instruments – including the behavioural categories in observational research and
questions when using questionnaires. They allow for categories and questions to be checked and
modified where necessary.
Carrying out a pilot study beforehand is a way to ensure time, effort and money are not wasted on a
flawed methodology. It is important that a pilot study uses a sample that (although smaller) is
representative of the target population that will be used in the main research.
Experimental Design
The three main types of experimental design are:
Repeated measures
Independent groups
Matched pairs
Repeated Measures
Repeated measures is a design where the same participants take part in each condition of the experiment.
The data obtained from both conditions is then
compared for each participant to see if there
was a difference.
Evaluation of Repeated Measures
There are strengths associated with using
the repeated measures design. Since the
same participants are taking part in all
conditions of the experiment, fewer
participants are required. This makes the
design less costly and time consuming, as
fewer participants need to be recruited. In
addition, the use of the same participants
across conditions reduces the possibility of
participant variables, such as individual differences, adversely affecting the results.
REPEATED MEASURES
Definition: The same participants take part in each condition of the experiment.
Strength: Reduces participant variables, as the same participants take part in both/all conditions.
Limitation: Issues with order effects, such as practice effects or fatigue, as participants take part in both conditions.

INDEPENDENT GROUPS
Definition: Two separate groups of participants take part, with each group completing only one condition of the experiment.
Strength: No order effects, as each participant experiences only one condition.
Limitations: Participant variables may differ between the groups. More participants are needed, which is costly and time consuming.
Possible Exam Questions
1. Explain the purpose of a pilot study. (2 marks)
2. A psychologist, Dr Lees, was interested in studying the behaviour of infants once they had been
reunited with their mothers following a stay in hospital. Dr Lees decided to study the behaviour of the
infants who had experienced a disruption to their attachment in this manner using a naturalistic
method. It was decided to video record the caregiver–infant interactions in their own home for a three‐
hour period both before and after the hospital admission.
Explain why Dr Lees might want to conduct a pilot study before the main observation is carried out. (2
marks)
Exam Hint: Often students fail to provide sufficient elaboration to achieve both marks on questions such
as this. One possible reason that could be presented is that Dr Lees will carry out a pilot study prior to the
main observation to make sure that the video cameras were suitably located to document the
observations. A second creditworthy response would be to explain that the pilot study would enable Dr
Lees to check how appropriate the behavioural categories were to examine the caregiver–infant
interactions accurately.
3. Explain what is meant by matched pairs design. (2 marks)
4. Other than repeated measures design, identify and explain one other research design. (2 marks)
5. Suggest one advantage of an independent measures design. (2 marks)
6. Explain one limitation of a repeated measures design and how a researcher may attempt to address
this issue within their research. (3 marks)
7. Evaluate the use of the independent measures design in psychological research. You may refer to
strengths and/or limitations in your response. (4 marks)
CONTROL, DEMAND CHARACTERISTICS AND INVESTIGATOR
EFFECTS
Specification: Control: random allocation and counterbalancing, randomisation and
standardisation. Demand characteristics and investigator effects.
WHAT YOU NEED TO KNOW
Outline the process and importance of control within psychological investigations, including:
o Random allocation
o Counterbalancing
o Randomisation
o Standardisation
Outline the issue of demand characteristics and investigator effects within psychological
investigations and identify how to avoid these issues.
Control
It is important that researchers know how to control/eliminate extraneous variables through the following
measures: random allocation, counterbalancing, randomisation and standardisation.
Extraneous variables are any variable other than the IV that might affect the DV and thus affect the
results.
Where extraneous variables are important enough to cause a change in the DV, they become confounding
variables. There are many different types of extraneous variables that psychologists need to take account
of when designing their investigations:
Situational variables – variables connected with the research situation. For example, the temperature,
time of day, lighting, materials, etc. They are controlled through standardisation, ensuring that the only
thing which differs between the two groups is the IV. For example, making sure that the temperature is
the same for both groups, the time of day is the same, etc.
Participant variables – variables connected with the research participants. For example, age,
intelligence, gender, etc. They are controlled through the experimental design, such as matched pairs
design, or by randomly allocating participants to conditions, which helps to reduce bias.
Random Allocation
Random allocation of participants to their groups, for example in an independent measures design, is an
extremely important process in psychological research. Random allocation greatly decreases the possibility
that participant variables in the form of individual differences, such as mathematical ability, will adversely
affect the results.
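As a sketch, random allocation to two groups amounts to shuffling the participant list and splitting it in half. The participant IDs and seed below are hypothetical:

```python
import random

def randomly_allocate(participants, seed=None):
    """Shuffle participants, then split them into two equal-sized groups."""
    rng = random.Random(seed)   # seeding makes the allocation repeatable
    shuffled = list(participants)  # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Twenty hypothetical participants allocated to two conditions
group_a, group_b = randomly_allocate(range(1, 21), seed=42)
```

Because chance alone decides the groups, individual differences such as mathematical ability should be spread roughly evenly between conditions.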
Counterbalancing
To combat the problem of order effects with repeated measures design, researchers can counterbalance
the order of the conditions. The sample is split in half with one half completing the two conditions in one
order and the other half completing the conditions in the reverse order. Any order effects should be
balanced out by the opposing half of participants.
For example, the first ten participants would complete condition A followed by condition B, while the second ten participants would complete condition B followed by condition A.
Exam Hint: It is important to note that counterbalancing does not ‘remove’ or ‘get rid of’ order effects; it
works to nullify the order effects, as the participants take part in different conditions in different orders.
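The split-half ordering described above can be sketched in Python. The participant IDs and the condition labels "A"/"B" are placeholders:

```python
def counterbalance(participants):
    """First half get order A-then-B; second half get B-then-A."""
    half = len(participants) // 2
    orders = {}
    for i, p in enumerate(participants):
        orders[p] = ("A", "B") if i < half else ("B", "A")
    return orders

# Twenty hypothetical participants
orders = counterbalance(list(range(1, 21)))
```

Any practice or fatigue effect in one half of the sample is balanced by the opposite ordering in the other half.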
Randomisation
This is when trials are presented in a random order to avoid any bias that the order of the trials might
present.
Standardisation
This is the process in which all situational variables of a procedure used in research are kept identical, so
that methods are sensitive to any change in performance. Under these circumstances changes in data can
be attributed to the IV. In addition, it is far more likely that results will be replicated on subsequent
occasions when research is standardised.
Demand Characteristics and Investigator Effects
Both demand characteristics and investigator effects can act as confounding variables which affect the
results of the research. Therefore, the researcher needs to exercise as much control over them as possible
to maintain internal validity.
Demand characteristics occur when the participants try to make sense of the research and change their
behaviour accordingly to support what they believe are the aims of the investigation. Demand
characteristics are a problem as the participants act in a way to support the hypothesis rather than
displaying natural behaviour, making the results lack validity. Conversely, the participant may deliberately
try to disrupt the results, a phenomenon known as the ‘screw‐you’ effect.
Demand characteristics are controlled by not allowing the participants to guess the aim of the research or
the identity of the IV which can be achieved by using a single‐blind experimental technique. This is when
only the researcher knows the true aim of the experiment, and a measure of deception has been used so
that the participants cannot easily guess the aim. Therefore, they are unable to try to either support or
undermine the research on purpose. An example of this is in medical tests when comparing the effects of a
therapeutic drug with a placebo, where only the researcher knows which is which.
Investigator effects are where a researcher (consciously or unconsciously) acts in a way to support their
prediction. This can be a problem when observing events that can be interpreted in more than one way.
For example, one researcher might interpret children fighting as an act of violence, while another might
observe this as rough and tumble play.
ETHICAL GUIDELINES, PEER REVIEW & THE ECONOMY
Specification: Ethics, including the role of the British Psychological Society’s code of ethics;
ethical issues in the design and conduct of psychological studies; dealing with ethical
issues in research. The role of peer review in the scientific process. The implications of
psychological research for the economy.
WHAT YOU NEED TO KNOW
Outline the purpose of ethical guidelines within psychological research.
Outline how to deal with ethical issues in psychological research.
Outline the role of peer review in psychology.
Outline the implications of psychological research for the economy.
Introduction
Ethical issues are considerations that researchers need to address before, during and after the research is
conducted. Ethical issues take into consideration the welfare of the participants, the integrity of the
research and the use of the data.
The British Psychological Society (BPS) code of ethics sets out a series of guidelines that researchers need
to consider when undertaking psychological research. Six of the main ethical guidelines include:
Deception
Right to withdraw
Informed consent
Privacy and confidentiality
Protection from harm
Exam Hint: An easy way to remember six of the main ethical guidelines is to use the acronym DRIPP.
DECEPTION
Overview: When information is deliberately withheld from participants or they are knowingly misled.
Why is it unethical (if broken)? It prevents participants from giving fully informed consent, which means that they might be taking part in research that goes against their views or beliefs.
How to deal with the issue: At the end of the study the participants should be fully debriefed and told the true aim and nature of the research. At this point the participant should be given the right to withdraw the publication of their results. The contact details of the experimenter should be given in case participants have any further questions or queries.
INFORMED CONSENT
How to deal with the issue:
Prior general consent: involves participants agreeing in advance to take part in a range of psychological investigations, which may or may not involve deception. This, in effect, means that they will have given consent for being deceived.
Retrospective consent: involves participants giving consent for their participation after already taking part, for instance, if they were not aware that they were the subject of an investigation.
Children as participants: involves gaining the consent of the parent(s) in writing for children under the age of 16 to participate in any psychological research.
PRIVACY
Why is it unethical (if broken)? The participant may later feel ashamed or embarrassed.
How to deal with the issue: Participants should be told the ways in which their information will be protected and kept confidential, e.g. no names will be published in the final report and any written information or video information will be destroyed.

CONFIDENTIALITY
Overview: Confidentiality is where a participant's personal details and data are kept private and not disclosed.
Why is it unethical (if broken)? A person's details or data may otherwise be identifiable from the published research.
How to deal with the issue: Participants are provided with a fake name or participant number so that their identity cannot be traced.
The Role of Peer Review in the Scientific Process
Peer review is an independent assessment process that takes place before a research study is published
and is undertaken by other experts in the same field of psychology. All psychologists must be prepared for
their work to be scrutinised in this way; the review is usually conducted anonymously. There are several aims of the
peer review process:
To provide recommendations about whether the research should be published in the public domain or
not, or whether it needs revision.
To check the validity of the research to ensure it is of a high quality.
To assess the appropriateness of the procedure and methodology.
To judge the significance of the research in the wider context of human behaviour.
To assess the work for originality and ensure that other relevant research is sufficiently detailed.
To inform allocation of future research funding to worthy investigative processes.
Exam Hint: An easy way to remember the five key points of peer review is to use the following phrase:
PEER – Provide recommendations about whether the research should be published or not, or whether
it needs revision.
TYPES OF DATA
Specification: Quantitative and qualitative data; the distinction between qualitative and
quantitative data collection techniques. Primary and secondary data, including meta‐
analysis.
WHAT YOU NEED TO KNOW
Outline and evaluate the use of qualitative and quantitative data.
Outline and evaluate the use of primary and secondary data, including the use of meta‐analyses.
Quantitative Data
Quantitative data is numerical data that can be statistically analysed and converted easily into a graphical
format. Experiments, structured observations, correlations and closed/rating‐scale questions from
questionnaires all produce quantitative data.
Evaluation of Quantitative Data
A strength of quantitative data is that it is easy to analyse statistically. When large amounts of
numerical data are generated it is relatively easy to conduct descriptive statistics or inferential tests of
significance which allow for comparisons and trends to be identified between groups. Since established
mathematical procedures are in place for this type of analysis it makes quantitative data more
objective.
A disadvantage of quantitative data is its lack of representativeness. Since this type of data is often
generated from closed questions, the responses gained are narrow in their scope towards explaining
complex human behaviour. This means that, in comparison to qualitative data, the numerical findings
can often lack meaning and context. As such, it may not be a true representation of real life and thus
lacks validity.
Qualitative Data
Qualitative data is non‐numerical, language‐based data expressed in words which is collected through
semi‐structured or unstructured interviews and open questions in a questionnaire. It allows researchers to
develop an insight into the unique nature of human experiences, opinions and feelings.
Evaluation of Qualitative Data
A strength of obtaining qualitative
data is the rich detail obtained by the
researcher. Since participants can
develop their responses freely this
provides the investigator with
meaningful insights into the human
condition. Because of this, the
external validity of findings is
enhanced as they are more likely to
represent an accurate real‐world
view.
A limitation of qualitative data is that it can be subjective. Due to the rich, detailed nature of the responses, interpretation relies on the researcher's own judgement, meaning conclusions may be open to bias and difficult to replicate.
Evaluation of Meta‐Analysis
There are advantages of adopting a meta‐analysis methodology. Since the results are combined from
many studies, rather than just one, the conclusions drawn will be based on a larger sample which
provides greater confidence for generalisation. This, therefore, serves to increase the validity of the
patterns and trends identified.
There are issues of bias associated with meta‐analyses. Since the researcher is selecting data from
research which has already taken place, they may choose to omit certain findings from their
investigation. This could be particularly true if the previous findings showed no significant results or
were inconclusive. As a result, the findings and conclusions from the meta‐analysis will be biased as
they do not accurately represent all of the relevant data on the topic.
Possible Exam Questions
1. Define what is meant by quantitative data. (1 mark)
2. Explain one difference between primary and secondary data. (2 marks)
3. Suggest two reasons why behaviourists do not collect qualitative data in their investigations. (2 marks)
Exam Hint: To gain full marks for this question students can refer to the nomothetic/scientific nature of
the behaviourist approach and disadvantages of qualitative data, e.g. subjectivity / open to
interpretation; cannot be replicated; not open to quantification and statistical analysis; specific so not
amenable to generalisation. Generic evaluation of qualitative data not linked to investigations carried
out by behaviourists will only gain a maximum of one mark.
4. Dr Khanom and Dr Begum are researchers interested in investigating attitudes including racial
prejudice. They decided that a self‐report questionnaire, comprising 20 items with a number of fixed
responses for each, would be the best methodology for their study.
Identify the type of data Dr Khanom and Dr Begum will be collecting. Provide justification for your
choice. (2 marks)
Exam Hint: It is possible for full credit to be achieved for this question by referring to either primary data
(because Dr Khanom and Dr Begum conducted the questionnaire themselves) or quantitative data (since
the questionnaire contains closed questions).
5. Psychologists sometimes collect quantitative data. Outline one study in which a psychologist collected
quantitative data. In your answer, explain how the data collection technique was quantitative. (3
marks)
Exam Hint: A huge variety of studies may appear in response to this question, but responses must crucially
involve the collection of quantitative data.
DESCRIPTIVE STATISTICS
Specification: Descriptive statistics: measures of central tendency – mean, median, mode;
calculation of mean, median and mode; measures of dispersion – range and standard
deviation; calculation of range; calculation of percentages.
WHAT YOU NEED TO KNOW
Outline, calculate and evaluate the use of different measures of central tendency:
o Mean
o Median
o Mode
Outline, calculate and evaluate the use of different measures of dispersion:
o Range
o Standard deviation*
Calculate percentages.
*Students are NOT required to calculate standard deviation; however, they are required to understand
why standard deviation is used and what it shows.
Descriptive Statistics
Once quantitative data has been collected, it is important to summarise this data numerically. This
quantitative summary is called descriptive statistics, and allows researchers to view the data as a whole. It
also helps the reader to get an understanding of the data and saves them from needing to navigate
through lots of results to get a basic understanding of the data. Descriptive statistics typically include a
measure of central tendency and a measure of dispersion (which will have been selected based on the type
of data collected), and can also include percentages.
Measures of Central Tendency
Measures of central tendency tell us about the central, most typical, value in a data set and are calculated
in different ways.
Mean
Perhaps the most widely used measure of central tendency is the mean. The mean is what most people are referring to when they say ‘average’: it is the arithmetic average of a set of data. It is the most sensitive
of all the measures of central tendency as it takes into consideration all values in the dataset. Whilst this is
a strength as it means that all the data is being taken into consideration, the sensitivity of the mean is
something that must be considered when deciding which measure of central tendency to use. It can be
very misrepresentative of the data set if there are extreme scores present.
The mean is calculated by adding all of the data together, and dividing the sum by how many values there
are in total. The value that is then given should be a value that lies somewhere between the maximum and
minimum values of that dataset. If it isn’t, then there is a human error with the calculations!
Example: a student sits five mock A‐level psychology exams, and gets 65%, 72%, 71%, 67% and 79%. To
calculate their mean score, you would add all the scores together (65+72+71+67+79 = 354) and then
divide by the number of scores there are (354÷5 = 70.8). This gives a mean score of 70.8%. Looking at the data set, a mean of 70.8% looks quite accurate, as all of the scores are quite close to this value.
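The calculation above can be reproduced in a few lines of Python (the function name is our own):

```python
def mean(scores):
    """Arithmetic average: sum of all values divided by how many there are."""
    return sum(scores) / len(scores)

# The five mock-exam percentages from the example above
mock_scores = [65, 72, 71, 67, 79]
result = mean(mock_scores)  # 354 / 5 = 70.8

# Sanity check from the text: the mean must lie between the min and max
assert min(mock_scores) <= result <= max(mock_scores)
```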
MEAN (‘AVERAGE’)
How is it calculated? By adding up all the scores in the data set and then dividing by the number of scores.
Strength: Most representative of all the measures of central tendency because it is comprised of the whole data set.
Limitations: Most sensitive measure, as outliers (extreme scores) can distort the mean. Can only be used with ordinal and interval data.

MEDIAN (‘MIDDLE SCORE’)
How is it calculated? By putting all scores in rank order from smallest to largest, then selecting the middle number from the data set.
Strength: Not distorted by extreme scores.
Limitation: Does not reflect all scores in the data set.

MODE (‘MOST OFTEN’)
How is it calculated? By identifying the most frequently occurring score within the data set.
Strengths: Not distorted by extreme scores. The only method which can be used with nominal data.
Limitation: There can be more than one mode, so it is not always a useful measure of central tendency.
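The median and mode procedures can be sketched in Python (using the data set from exam question 3 later in this section; the function names are our own):

```python
from collections import Counter

def median(scores):
    """Middle value once scores are ranked; average the middle two if even."""
    ranked = sorted(scores)
    mid = len(ranked) // 2
    if len(ranked) % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2

def modes(scores):
    """All most-frequently-occurring scores (there can be more than one)."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(s for s, c in counts.items() if c == top)

# Data set from exam question 3 below
data = [10, 2, 7, 6, 9, 10, 11, 13, 12, 6, 28, 10]
```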
Measures of Dispersion
Measures of dispersion are descriptive statistics that define the spread of data around a central value
(mean or median). There are two measures of dispersion: range and standard deviation (SD).
Range
The range is calculated by subtracting the lowest score in the data set from the highest score in the data
set and (usually) adding 1. The addition of 1 to the calculation is a mathematical correction which allows
for the fact that some of the scores in the data set will have been rounded up or down.
Referring to the earlier example, the lowest value was 65 and the highest was 79, resulting in a range of 15 (79−65+1=15). This value is very straightforward to calculate, which is a clear strength of using
the range. However, it is important to recognise that a data set with a strong negative skew can have a
similar range to a data set with a strong positive skew, in which case it may be providing a very limited
insight into the data set. Equally, it is only taking into consideration the two extreme scores, which may not
be an accurate representation of the data set as a whole.
Students often ask “Why do you add 1 to the range?” and the answer is a simple one which is best
illustrated with an example: If the lowest score is 5 and the highest score is 9, the possible scores are 5, 6,
7, 8 and 9. There are five possible scores, but 9 – 5 = 4. The simple calculation ignores the fact that you
have to include the lowest score in the range, so you add 1.
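The "add 1" rule can be expressed as a one-line Python function (the name is our own):

```python
def inclusive_range(scores):
    """Highest minus lowest, plus 1 -- the correction described above."""
    return max(scores) - min(scores) + 1

# The 5-to-9 example: five possible scores (5, 6, 7, 8, 9)
r = inclusive_range([5, 9, 7, 6])
```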
Standard Deviation
A much more informative measure of dispersion is the standard deviation. However, the increased level of
detail comes at the cost of a slightly more complicated calculation in comparison to the range. The
standard deviation looks at how far the scores deviate from the mean. If the standard deviation is large, this suggests that the data is very dispersed around the mean and, for example, that the participants scored very differently from one another. If the standard deviation is small, the scores are clustered closely around the mean.
The standard deviation score takes into consideration all of the values within the data set, and is a very
precise measurement. However, in the same way as the mean, the fact that it takes into account every
value means that it can be easily distorted by an extreme value, which could in turn mean that it
misrepresents the data.
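Although the specification does not require you to calculate standard deviation, a short sketch using Python's `statistics.pstdev` shows what a large versus small value indicates. The two data sets below are invented for illustration; both have a mean of 10:

```python
from statistics import pstdev  # population standard deviation

# Two hypothetical data sets with the same mean (10) but different spread
clustered = [9, 10, 10, 11, 10]
dispersed = [2, 18, 5, 15, 10]

spread_small = pstdev(clustered)  # scores close to the mean -> small SD
spread_large = pstdev(dispersed)  # scores far from the mean -> large SD
```

Both sets share the same mean, so only the standard deviation reveals how differently the "participants" scored.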
Exam Hint: Questions regarding interpretation of standard deviation values are often worth several
marks, so it is important to make sure you link your answer back to the question, rather than just
pointing out how they are different. Make sure you tell the examiner what these scores actually tell you
about the data!
RANGE
Strength: Easy to calculate mathematically without use of a calculator.
Limitation: Does not indicate the distribution pattern across the whole data set.

STANDARD DEVIATION
Strength: A precise measurement of dispersion because all values in the data set are included in the calculation.
Limitation: Extreme values can distort the measurement.
Calculation of Percentages
Providing percentages in the summary of a dataset can help the reader get a feel for the data at a glance,
without needing to read all of the results. For example, if there are two conditions comparing the effects of
revision vs. no revision on test scores, a psychologist could provide the percentage of participants who
performed better having revised, to give a rough idea of the findings of the study. Let’s imagine that out of
a total of 45 participants, 37 improved their score by revising.
In order to calculate a percentage, the following calculation would be used:

percentage = (number of participants who improved ÷ total number of participants) × 100

The bottom number in the formula should always be the total number in question (such as total number of participants, or total possible score), with the top number being the number that meets the specific criteria (such as participants who improved, or a particular score achieved). This answer is then multiplied by 100 to provide the percentage. In the example above, (37 ÷ 45) × 100 = 82.2% (to 1 decimal place) of participants improved by revising.
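The formula can be sketched as a small Python function (the names are our own, using the 37-out-of-45 example):

```python
def percentage(part, total):
    """(number meeting the criterion / total number) x 100."""
    return part / total * 100

# 37 of the 45 participants improved their score by revising
improved = percentage(37, 45)  # roughly 82.2%
```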
Possible Exam Questions
1. Name one measure of central tendency. (1 mark)
Exam Hint: As this question is simply asking for a measure of central tendency to be named, no further
elaboration is required to gain the mark here.
2. Which of the following is a measure of dispersion? (1 mark)
a) Mean
b) Median
c) Mode
d) Range
3. Calculate the mode for the following data set. (1 mark)
10,2,7,6,9,10,11,13,12,6,28,10
4. Calculate the mean from the following data set. Show your workings. (2 marks)
2, 8, 10, 5, 9, 11, 15, 4, 16, 20
Mean = ________
5. Explain the meaning of standard deviation as a measure of dispersion. (2 marks)
6. Other than the mean, name one measure of central tendency and explain how you would apply this to
a data set. (3 marks)
Exam Hint: It is vitally important that time is taken to read the question fully to ensure that a description
of how to calculate the mean is also presented.
7. Explain why the mode is sometimes a more appropriate measure of central tendency in comparison to
the mean. (3 marks)
8. Explain one strength and one limitation of the range as a measure of dispersion. (4 marks)
9. Evaluate the use of the mean as a measure of central tendency. You may refer to strengths and/or
limitations in your response. (4 marks)
10. A researcher was interested in investigating the number of minor errors that both male and female
learner drivers made on their driving test. In total, ten males and ten females agreed for the
performance on their driving test to be submitted to the researchers. The table below depicts the
findings:
Number of minor errors made by
1,2,0,3,4,2,0,7,6,2
female drivers
Number of minor errors made by male
2,3,1,5,6,2,3,0,1,4
drivers
The mean number of minor errors made during the driving tests for both groups (males and females)
combined is 2.7.
Calculate the percentage of the male drivers who scored above the mean score and the percentage of
the female drivers who scored above the mean score, showing your calculations. (4 marks)
Participant   Group 1 score   Group 2 score
1             3               6
2             4               3
3             8               9
4             1               11
5             20              14
6             13              27
7             17              12
8             5               4
Median        –               –
Complete the table above by calculating the median for both groups. Explain why the developmental
psychologist chose the median as a measure of central tendency rather than the mean. (4 marks)
PRESENTATION AND DISPLAY OF QUANTITATIVE DATA
Specification: Presentation and display of quantitative data: graphs, tables, scattergrams,
bar charts, histograms.
WHAT YOU NEED TO KNOW
Present quantitative data in the following formats:
o Tables
o Scattergrams
o Bar charts
o Histograms [A‐Level only]
Graphical techniques and tables are used to summarise data in a clear and visually accessible way.
Tables
Perhaps the most straightforward way of presenting data is in tables, which will summarise the key
descriptive statistics for a data set, for example, the mean values and standard deviation values for each
condition within a psychological investigation. Presenting data in this way will allow the reader to easily
compare the most important values, without needing to interpret the data. For example, the following
table outlines the mean scores and standard deviation for Godden and Baddeley’s (1975) study.
MEAN NUMBER OF WORDS RECALLED AS A FUNCTION OF LEARNING AND RECALL ENVIRONMENT
LAND UNDERWATER Total
Learning Environment Mean Recall Score SD Mean Recall Score SD
Land 13.5 5.8 8.6 3.0 22.1
Underwater 8.4 3.3 11.4 5.0 19.8
Total 21.9 ‐ 20.0 ‐ ‐
Scattergram
A scattergram (sometimes called a
scattergraph) is a graph that shows the
correlation between two sets of data (co‐
variables) by plotting points to represent
each pair of scores. It indicates the degree
and direction of the correlation between
the co‐variables, one of which is indicated
on the X‐axis and the other on the Y‐axis.
A positive correlation shows an upward trend where, as one variable increases, so does the other.
A negative correlation shows a trend going in the opposite direction where, as one variable increases, the other decreases.
With a zero correlation, there is no distinct relationship shown between the two variables: the plotted points appear randomly scattered on the scattergram.
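The degree and direction of the relationship shown on a scattergram can be quantified with a correlation coefficient; the sketch below computes Pearson's r from first principles (the co-variable data are hypothetical):

```python
def pearson_r(xs, ys):
    """Pearson's r: positive -> upward trend, negative -> downward, near 0 -> none."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical co-variables: hours revised vs. test score
hours_revised = [1, 2, 3, 4, 5]
test_score = [40, 50, 55, 65, 70]
r = pearson_r(hours_revised, test_score)  # close to +1: strong positive correlation
```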
DISTRIBUTIONS: NORMAL AND SKEWED DISTRIBUTIONS
Specification: Distributions: normal and skewed distributions; characteristics of normal
and skewed distributions.
WHAT YOU NEED TO KNOW
Identify normal and skewed distributions.
Identify Normal and Skewed Distributions
Data that is normally distributed produces a symmetrical bell‐shaped curve when plotted, indicating that
most scores are close to the mean, with progressively fewer scores located at the extremes of
either tail of the distribution. In this instance, the median and mode also occupy the same centre point of
the curve as the mean does.
For any data set to be considered normally distributed, 68.26% of scores will lie within one standard deviation of the mean (34.13% either side) and 95.44% of scores will lie within two standard deviations of the mean. As a result, only 4.56% of scores will lie beyond two standard deviations from the mean (2.28% above or below).
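These percentages can be checked with Python's `statistics.NormalDist` for the standard normal curve; the computed figures come out at roughly 68.27% and 95.45%, matching the quoted values to rounding:

```python
from statistics import NormalDist

nd = NormalDist()                    # standard normal curve (mean 0, SD 1)
within_1sd = nd.cdf(1) - nd.cdf(-1)  # proportion within one SD of the mean
within_2sd = nd.cdf(2) - nd.cdf(-2)  # proportion within two SDs of the mean
```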
However, sometimes data does not follow this symmetrical pattern, which can result in a large proportion of scores falling below the mean (positively skewed) or above the mean (negatively skewed).
In both instances, the mode remains at the highest point on the graph, since it is not affected by extreme
scores.
CONTENT ANALYSIS [A‐LEVEL ONLY]
Specification: Content analysis: Content analysis and coding. Thematic analysis.
WHAT YOU NEED TO KNOW
Outline the process of content analysis, including the use of coding.
Outline the process of thematic analysis.
Evaluate the use of content and thematic analysis in psychological research.
Content Analysis
Content analysis is a type of observational technique which involves studying people indirectly, through
qualitative data. Qualitative data collected in a range of formats can be used, such as video or audio
recordings (or transcripts of these), written responses (such as those provided to an open question in
a questionnaire), or even children’s drawings. Content analysis helps to classify responses in a way that is
systematic, which can then allow clear conclusions to be drawn.
It is important for researchers using content analysis to have their research questions formulated, so that
they know exactly what their content analysis will focus on. Researchers must familiarise themselves with
the data before conducting any analysis, so that they are confident that their coding system is appropriate
for the task ahead.
Content analysis is particularly helpful when conducting research that would otherwise be considered
unethical. Any data that has already been released into the public domain is available for analysis, such as
newspaper articles, meaning that explicit consent is not required. For material that is of a sensitive nature,
such as experience of domestic violence, content analysis can also prove useful, as participants can write a
report of their experience which can be used in analysis. This allows high quality data to be collected, even
in difficult circumstances.
Coding
Coding is an important step in conducting content analysis and involves the researcher developing
categories for the data to be classified. Qualitative data can be extensive in its nature, for example
interview transcripts, and so coding can be helpful in reaching succinct conclusions about the data. These
categories provide a framework to convert the qualitative material into quantitative data, which can then
be used for further (statistical) analysis.
For example: A researcher is interested in investigating prejudice and discrimination in the media towards refugees. In order to do this, they will follow these procedures:
1. The researcher will select a newspaper article relating to refugees.
2. They will read through the text, highlighting important points of reference and annotating the margins
with comments.
3. Using the comments made in the margins, the researcher will categorise each excerpt according to
what it contains, e.g. evidence of prejudice, discriminatory language or positive regard towards
refugees.
4. This process will be repeated for each newspaper article of interest identified by the researcher at the
outset.
5. Once all the steps above are completed for each newspaper article, the categories which emerged
through the process of analysing the content are reviewed to decide if any need refining, merging or
subdividing.
6. With the well‐defined (operationalised) behavioural categories, the researcher returns to the original
articles and tallies the occurrence of each ‘behaviour’ accordingly.
7. The qualitative data has now undergone analysis to produce quantitative data which can undergo
further analysis such as statistical testing, descriptive statistics and producing graphs or tables.
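Step 6 (tallying the occurrence of each category) is straightforward to sketch in code. The category codes below are invented for illustration, echoing the example above ('PREJ' for prejudice, 'DISC' for discriminatory language, 'POS' for positive regard):

```python
from collections import Counter

# Category codes assigned to article excerpts during coding (invented data)
coded_excerpts = ["PREJ", "DISC", "POS", "PREJ", "DISC", "PREJ"]

tallies = Counter(coded_excerpts)
print(tallies)  # e.g. Counter({'PREJ': 3, 'DISC': 2, 'POS': 1})
```

The resulting frequency counts are the quantitative data referred to in step 7, and can feed into descriptive statistics, graphs or a statistical test.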
Thematic Analysis
Thematic analysis is a technique that helps identify themes throughout qualitative data. A theme is an idea
or a notion, and can be explicit (such as stating that you feel depressed) or implicit (for example, using the
metaphor of a black cloud for feeling depressed). Thematic analysis will produce further qualitative data,
but this will be much more refined.
If we revisit the example above with the researcher reviewing the articles for evidence of prejudice or
discrimination against refugees, the following procedures would be followed:
1. Carry out steps 1–3 as if conducting a content analysis (see above).
2. Thereafter, the researcher must decide if any of the categories identified can be linked in any way, such
as ‘stereotypical views’, ‘economic prejudice’ or perhaps ‘positive experiences for refugees’.
3. Once the themes are successfully identified, they can then be used in shorthand to identify all aspects
of the data that fit with each theme. For example, every time the researcher identifies an example
within the data of a positive experience for the refugee, they might write ‘PER’ (positive experience for
refugees) alongside it, so that they are able to quickly re‐identify this theme in subsequent analysis of
the data.
FEATURES OF SCIENCE [A‐LEVEL ONLY]
Specification: Features of science: objectivity and the empirical method; replicability and
falsifiability; theory construction and hypothesis testing; paradigms and paradigm shifts.
WHAT YOU NEED TO KNOW
Outline the following features of a science:
o Objectivity and empirical method
o Replicability and falsifiability
o Theory construction and hypothesis testing
o Paradigms and paradigm shifts
Features of a Science
An ongoing debate in the field of psychology is whether psychology can be considered a science. It is
important to look at the different aspects of what makes a science and how, if at all, psychology fulfils
these expectations.
Objectivity and Empirical Method
A key feature of science is the ability for researchers to remain objective, meaning that they must not let
their personal opinions, judgements or biases interfere with the data. Laboratory experiments are the
most objective method within the psychology discipline because of the high level of control that is exerted
over the variables. On the other hand, a natural experiment, by its very nature, cannot exert control over
the manipulation of independent variables and is often viewed as less objective. Similarly, the
observational and content analysis methods can fall victim to objectivity issues since the behavioural
categories assigned are at the personal discretion of the investigator.
Empirical methods refer to the idea that knowledge is gained from direct experiences in an objective,
systematic and controlled manner to produce quantitative data. It suggests that we cannot create
knowledge based on belief alone, and therefore any theory will need to be empirically tested and verified
in order to be considered scientific. Adopting an empirical approach reduces the opportunity for
researchers to make unfounded claims about phenomena based on subjective opinion.
Replicability and Falsifiability
Replicability is a key feature of a science, and refers to the ability to conduct research again and achieve
consistent results. If the findings can truly be generalised, and thus be truly valid, psychologists would
expect that any replication of a study using the same standardised procedures would produce similar
findings and reach the same conclusions.
For research to be considered scientific it should also be falsifiable. Falsifiability (Popper, 1934) refers to the idea that a research hypothesis could be proved wrong. Scientific research can never be 'proven' true; it can only be subjected to attempts to prove it false. For this reason, all investigations have a null hypothesis which suggests that any difference or relationship found is due to chance.
An example within psychology which causes conflict in the scientific community for its lack of falsifiability is
the Freudian psychodynamic approach. A central principle of this approach is the notion of the Oedipus
complex, which occurs for boys during childhood whereby they must resolve an unconscious sexual desire
for the opposite‐sex parent in order to develop the final element of their psyche: the superego. If a male
individual refutes the idea that he went through this stage of psychosexual development in his youth, psychodynamic theorists would counter this with the supposition that he is in denial (a defence mechanism), which is another facet of the theory. Herein a circular argument is created which prevents the theory from ever being falsified.

Theory Construction and Hypothesis Testing

One approach to theory construction is the deductive process, which works from the more general ideas
to the more specific and is informally referred to as a ‘top‐down’ approach. Here, the psychologist may
begin with a theory relating to a topic of interest. This will then be narrowed down into a more specific
hypothesis which can be tested empirically. Any data gathered from testing the hypothesis in this way will
then be used to adjust the predictions.
Paradigms and Paradigm Shifts
A paradigm is a set of shared assumptions and methods within a particular discipline. Kuhn (1962)
suggested that it was this that separated a scientific discipline from non‐scientific disciplines. Under this
assumption, he suggested that psychology was perhaps best seen as a pre‐science, separate from the likes
of physics or biology. He suggested that psychology had too much disagreement at its core between the
various approaches (e.g. behaviourist versus cognitive psychologists), and was unable to agree on one
unifying approach to consider itself a science.
The way in which a field of study moves forward is through a scientific revolution. It can start with a
handful of scientists challenging an existing, accepted paradigm. Over time, this challenge becomes
popular with other scientists also beginning to challenge it, adding more research to contradict the existing
assumptions. When this happens, it is called a paradigm shift.
A classic example of a paradigm shift is how scientists historically believed the world to be flat when now it
is widely accepted that the earth is, in fact, round. In psychology, there have been numerous paradigm
shifts over the decades. From the late nineteenth century psychoanalytic theory was at the forefront of
psychological thinking, with the role of the unconscious mind in governing behaviour being the dominant
approach. However, between 1927 and 1938 the work of Pavlov and Skinner emerged; they adopted the behaviourist position that all behaviour is learned from the environment and experience. Shortly
thereafter, in the 1960s, another paradigm shift occurred with the cognitive approach taking precedence in
psychology with the development of the electronic computer. Here, the shift from behaviourist thinking
moved towards the role of cognitions in explaining human behaviour although elements of the
behaviourist approach remained in use and were combined in cognitive behavioural therapy (CBT).
Possible Exam Questions
1. Outline what is meant by falsifiability in psychological research. (2 marks)
2. Define what is meant by the term paradigm. (2 marks)
3. Explain replicability as a feature of science. (2 marks)
4. Explain what is meant by a paradigm shift. You may use a suitable example to illustrate your point. (2
marks)
5. Outline objectivity and empirical methods as features of science. (4 marks)
6. Discuss the extent to which psychology can be considered a science. Refer to research and/or
approaches in your answer. (10 marks)
RELIABILITY [A‐LEVEL ONLY]
Specification: Reliability across all methods of investigation. Ways of assessing reliability:
test‐retest and inter‐observer; improving reliability.
WHAT YOU NEED TO KNOW
Outline the following types of reliability:
o Test‐retest
o Inter‐observer
Outline ways to improve reliability in different types of psychological investigation.
Reliability Across All Methods of Investigation
Reliability is a measure of consistency. For example, if you are using a tape measure, you expect to get the
same results every time you measure a certain object. If the results are not consistent, then the measure is
not reliable. In psychology, the expectations are the same; if researchers are using a questionnaire to
measure levels of depression, they want to ensure that the measure is consistent between participants
and over time.
Test‐Retest Reliability
One very straightforward way of testing whether a tool is reliable is using the test‐retest method. Quite
simply, the same person or group of people are asked to undertake the research measure, e.g. a
questionnaire, on different occasions.
When using the test-retest method, it is important to remember that the same group of participants is being studied twice, so researchers need to be aware of any potential demand characteristics. For example,
if the same measure is given twice in one day, there is a strong chance that participants will be able to
recall the responses they gave in the first test, and so psychologists could be testing their memory rather
than the reliability of their measure. On the other hand, it is also important to make sure that there is not
too much time between each test. For example, if psychologists are testing a measure of depression, and
question the participants a year apart, it is possible that they may have recovered in that time, and so they
give completely different responses for that reason, rather than that the questionnaire is not reliable.
After the measure has been completed on two separate occasions, the two scores are then correlated. If
the correlation is shown to be significant, then the measure is deemed to have good reliability. A perfect
correlation is 1, and so the closer the score is to this, the stronger the reliability of the measure, but a
correlation of over +0.8 is also perfectly acceptable and seen as a good indication of reliability.
Inter‐Observer Reliability
Inter‐observer reliability refers to the extent to which two or more observers are observing and recording
behaviour in a consistent way. This is a particularly useful way of ensuring reliability in situations where
there is a risk of subjectivity. For example, if a psychologist was making a diagnosis for a mental health
condition, it would be a good idea for someone else to also make a diagnosis to check that they are both in
agreement.
In psychology studies where behavioural categories are being applied, inter‐observer reliability is also
important to make sure that the categories are being used in the correct manner. Psychologists would
observe the same situation or event separately, and then their observations (or scores) would be
correlated to see whether they are suitably similar.
Possible Exam Questions
1. Define what is meant by the term reliability. (2 marks)
2. A music teacher was interested in studying whether there was a relationship between English language
skill and musical aptitude. He decided to investigate this with Year 11 students in the school where he
worked. He randomly chose 20 students, from the 200 in the year group, and gave each of them two
tests. He used part of a GCSE exam paper to test their English language skill. The higher the mark
3. A psychologist used the observational method to look at behaviours indicative of attachment between
primary caregivers and their infants. Pairs of observers watched a single child interact with the mother
for a twenty-minute period. They noted the number of times the child sought contact and used the
parent as a safe base to go and explore. After seeing the first round of ratings from the observers, the
psychologist becomes concerned about the quality of inter‐rater reliability.
What could the psychologist do to improve inter‐rater reliability before continuing with the
observational research on attachment? (4 marks)
Exam Hint: There is a breadth / depth trade‐off to be struck here: students can elaborate on one
improvement, for example, explain how the observer training might be improved, or alternatively
outline several improvements in less detail such as establishing clearer criteria for categorising
attachment behaviour and filming the interactions so that the observers can practise the categorisation.
4. A psychologist was interested in studying student stress levels in their third year of their degree course.
She asked an academic colleague for feedback on her method who reported concern that the
psychologist had not checked the reliability and validity of the questionnaire used to measure the level
of stress.
Explain how the psychologist could check the reliability and the validity of the stress questionnaire. (5
marks)
Exam Hint: This question requires students to ‘think like a psychologist’ and apply their knowledge
carefully and with real consideration for the context. Avoid stating definitions of reliability and validity
as these are not creditworthy and instead refer explicitly to measures which improve reliability and
validity of questionnaires.
5. Identify and explain two or more ways of improving reliability. (6 marks)
VALIDITY [A‐LEVEL ONLY]
Specification: Types of validity across all methods of investigation: face validity,
concurrent validity, ecological validity and temporal validity. Assessment of validity.
Improving validity.
WHAT YOU NEED TO KNOW
Outline how to assess and improve validity across different types of investigation, referring to:
o Face validity
o Concurrent validity
o Ecological validity
o Temporal validity
o Assessment of validity
o Improvement of validity
Types of Validity
Validity refers to whether something is true or legitimate. Internal validity is a measure of whether results
obtained are solely affected by changes in the variable being manipulated (i.e. by the independent
variable) in a cause and effect relationship. External validity is a measure of whether data can be
generalised to other situations outside of the research environment.
Ecological Validity
Ecological validity is a type of external validity, and refers to the extent to which psychologists can apply
their findings to other settings – predominantly to everyday life. A lack of ecological validity is typically a
point made when discussing weaknesses of laboratory based studies. Due to the artificial and contrived
setting of a laboratory, it stands to reason that it is difficult to generalise the findings to a more natural
situation since behaviour may be very different as a result. Realistically, it is a multitude of variables that
make a laboratory experiment low in ecological validity, including the use of artificial stimulus materials.
Exam Hint: If you are suggesting that research results are low in ecological validity as part of your
evaluation, make sure that you justify this point with specific examples relating to that individual study.
Avoid writing sentences which could be ‘copy and pasted’ into another essay and still make complete
sense, as this means you have not tied the commentary closely enough to the question at hand.
Temporal Validity
Temporal validity is another form of external validity, which refers to the extent to which research
findings can be applied across time. For example, Asch’s research into conformity is often said to be lacking
temporal validity because the study was a ‘child of its time’, that is, the findings were a product of the fact
that the study was conducted in a conformist era, and thus the findings might not be as applicable in
today’s society.
Assessment of Validity
The validity of a psychological test or experiment can be assessed in two main ways. Firstly, the face
validity can be considered, that is, does the test appear to measure what it says it measures? For example,
if there is a questionnaire that is designed to measure depression, do the items all look like they are going
to represent what it is like to have depression? If not, it is not likely to have face validity. A test of face
validity is most likely to be conducted by a specialist in the given area, which in the example above could be a clinical psychologist, doctor or other mental health specialist familiar with the assessment of depression.
Improving Validity: Observations
When it comes to observations, psychologists can improve validity, in particular ecological validity, by
making sure that the researchers have minimal impact on the behaviour that they are observing. One way
of doing this is to conduct a covert observation, where the researcher is not seen. By doing this,
researchers increase the likelihood that the behaviour observed is natural, as participants will not be acting
in a way that they deem correct or desirable for the sake of the study.
Another way of improving validity in observations is the use of behavioural categories. In this instance,
researchers will tick off behaviours when they are seen which helps to improve validity by reducing the
chance of researcher subjectivity. Ensuring that the categories are clearly defined, and do not overlap,
would also further improve validity in observations.
Research that employs qualitative methodology as opposed to quantitative methodology is often
regarded as having higher ecological validity due to the depth of data that is collected, often through the
use of case studies or interviews. However, validity can be lowered because analysis is more subjective
and open to the investigator’s interpretation. To strengthen the validity here, there are several things that
can be done. First of all, simply including direct quotes from participants can help to improve validity, as it
provides evidence that what was being inferred from the data is accurate. Also, validity can be improved
by collecting data from a variety of sources; for example, having data that has come from interviews,
observations and written reports from participants which is a process called triangulation.
Possible Exam Questions
1. What is meant by validity? (1 mark)
Exam Hint: This question is worth only one mark so candidates need to avoid producing lengthy answers
– be succinct.
2. Briefly explain one way a psychologist could check the validity of the data they have collected in a
questionnaire assessing obsessive compulsive disorder (OCD). (2 marks)
Exam Hint: The key word in this question is ‘way’ not ‘type’. This means that the first mark is awarded
for knowledge of a way (not just naming a type of validity) and the second mark is for explaining how
this would be implemented in this case. Answers are most likely to address face validity or concurrent
validity, but other ways such as construct validity, content validity, criterion validity and predictive
validity would also be creditworthy.
3. Explain how ecological validity and temporal validity differ in psychological research. (4 marks)
Exam Hint: Note that the wording of this question requires the differences between the two types of
validity to be explained, not simply for each term to be defined.
4. Describe two ways of assessing validity. (4 marks)
5. A psychologist was interested in studying student stress levels in their third year of their degree course.
She asked an academic colleague for feedback on her method who reported concern that the
psychologist had not checked the reliability and validity of the questionnaire used to measure the level
of stress.
Explain how the psychologist could check the reliability and the validity of the stress questionnaire. (5
marks)
Exam Hint: This question requires students to ‘think like a psychologist’ and apply their knowledge with
real consideration for the context. Avoid stating definitions as these are not creditworthy and instead
refer explicitly to measures which improve reliability and validity of questionnaires.
REPORTING PSYCHOLOGICAL INVESTIGATIONS [A‐LEVEL ONLY]
Specification: Reporting psychological investigations. Sections of a scientific report:
abstract, introduction, method, results, discussion and referencing.
WHAT YOU NEED TO KNOW
Outline the purpose and structure of the following sections of a psychological report:
o Abstract
o Introduction
o Method
o Results
o Discussion
o References
Reporting Psychological Investigations
When psychologists conduct research, they often want to share their findings with the psychology
community but if everyone wrote psychological reports in their own individual style it would be very
difficult for the reader to navigate. Therefore, everyone uses a conventional format and in the field of
psychology, the American Psychological Association (APA) format is typically used. However, there are
other variations that can be used, such as Harvard. Whilst there are some minor differences, all formats
present research in a similar way. In addition to making it more user‐friendly, following the conventional
format also ensures that every author is providing the reader with a standard level of detail about their
research.
Exam Hint: In the exam, you might be asked to outline the purpose or structure of the following sections
of a psychological report; however, you might also be asked to write a specific part of a report (e.g.
abstract, method section or even a reference). Therefore, it is important that you not only understand the purpose and structure of these sections but also practise writing them.
Sections of a Scientific Report
Abstract
The first section in a psychological report is the
abstract which is a short summary of the key
points of the research in roughly 150–200
words. Since the abstract is typically the first
information that the reader will encounter it
plays an important role in the report. It should
provide enough information to give a general
overview of the study and allow the reader to
make an informed decision about whether to
read the rest of the article or not.
Even though the abstract is simply a summary of
the research, there are still key pieces of
information that should be included: aim and
hypotheses, participants, methods, results, data
analysis and conclusions.
THE SIGN TEST [AS AND A‐LEVEL]
Specification: Introduction to statistical testing; the sign test.
WHAT YOU NEED TO KNOW
Know when and how to use the sign test.
Significance Testing – The Sign Test: When and How to Use It
The sign test is used when looking for a difference between paired data, i.e. repeated measures design (or
matched pairs – counted as one person tested on two occasions) which generates nominal data.
A Worked Example in Six Easy Steps
A slimming club believed that their new weight programme worked. They recorded the weights of ten
members of the club when they first joined, and again after three months.
Use the sign test to work out whether the weight loss programme was effective for these members.
NAME     STARTING WEIGHT   WEIGHT AFTER 3 MONTHS   DIFFERENCE   SIGN
JAN            80                   74                 -6          -
JULIE         125                  102                -23          -
JESS          113                  114                 +1          +
JOSIE         108                   87                -21          -
JODIE          96                   82                -14          -
JENNY          78                   78                  0          0
JOE           102                   94                 -8          -
JEFF          124                  125                 +1          +
JUNE           97                   75                -22          -
JADE          122                   94                -28          -
1. Is the hypothesis directional or non‐directional?
Directional since the hypothesis predicts that members will lose weight after three months on the
programme.
2. Work out the sign.
Record each pair of data with a + or – (depending on whether the difference is positive or negative). If
there is no difference (e.g. in the case of Jenny) then a nil sign ‘0’ is recorded.
3. Calculate the value of S (S is the test statistic for the sign test and is calculated by adding up the total number of pluses and the total number of minuses and selecting the smaller value).
In this case there are 7 minuses and 2 pluses, therefore S = 2.
4. Calculate the value of N (N is the total number of scores, minus any nil scores ‘0’).
In this case there are 10 scores, but one is a 0, so N = 9.
5. Find the critical value (see table below).
For a directional test with N = 9, the critical value = 1.
6. Compare S with the critical value.
For the sign test, the result is significant if S is equal to or less than the critical value. Here S = 2, which is greater than 1, so the difference is not significant: these data do not support the claim that the programme was effective.
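The sign and counting steps can be sketched in code using the weights from the table above (a minimal illustration, not part of the original worked example):

```python
# Starting weight and weight after 3 months for each member (from the table)
weights = {
    "Jan": (80, 74), "Julie": (125, 102), "Jess": (113, 114),
    "Josie": (108, 87), "Jodie": (96, 82), "Jenny": (78, 78),
    "Joe": (102, 94), "Jeff": (124, 125), "June": (97, 75), "Jade": (122, 94),
}

signs = []
for before, after in weights.values():
    diff = after - before
    if diff != 0:                 # nil (0) differences are excluded
        signs.append("+" if diff > 0 else "-")

S = min(signs.count("+"), signs.count("-"))  # frequency of the less common sign
N = len(signs)                               # number of scores, minus nils
print(S, N)  # 2 9
```

Comparing S = 2 against the critical value for N = 9 then completes the test by hand, exactly as in the steps above.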
LEVELS OF MEASUREMENT [A‐LEVEL ONLY]
Specification: Levels of measurement: nominal, ordinal and interval. Factors affecting the
choice of statistical test, including level of measurement and experimental design.
WHAT YOU NEED TO KNOW
Outline the different factors that affect the choice of a statistical test, including:
o Experimental design
o Levels of measurement
Nominal
Ordinal
Interval
Factors Affecting the Choice of Statistical Test
It is important to remember that when choosing a statistical test, an appropriate test must be selected and
justified, otherwise the statistical analysis may be brought into question. In psychology, there are a
number of important considerations that researchers must take into account when deciding on an
appropriate statistical test.
Difference or Association
The first important decision to make when choosing a statistical test of significance is whether the research
hypothesis is looking to investigate a difference or a relationship. It is important to identify this factor first
of all, as most statistical tests are designed to be used for one or the other specifically, and cannot be
simply applied to data regardless.
Data that investigates a difference will most typically have two conditions, one control condition and one
experimental condition. For example, imagine a researcher is investigating the impact of revision classes
on exam scores. Participants in the experimental condition may have been given three additional revision
classes to attend, whilst those in the control condition were not given any additional support. The
psychologist would be hoping to see that the average exam result in the experimental condition was
significantly higher than that of the control condition – looking for whether or not a difference between
these two groups exists.
If a researcher was wanting to establish an association/relationship, however, their investigation would
look quite different. Using the same example of the impact of revision on exam performance, each student
would be asked to state how many hours of revision they had completed in preparation for the exam and
this would be correlated against their final exam grade. The psychologist would therefore be investigating
the relationship between the two co‐variables: number of hours of revision completed and the
performance in the exam.
Experimental Design
The second decision which is important to consider when selecting an appropriate statistical test is the
research design that was used.
Psychologists will only need to consider the experimental design if they are looking for a difference, not an
association. If they are looking for an association, then they can move onto the level of measurement, to
help them decide which is the most appropriate statistical test.
The experimental design will have been identified as one of the following three: independent groups, repeated measures or matched pairs.
NOMINAL
Strengths: Nominal data can often be generated quickly and can therefore be tested in a timely manner for reliability. The mode is the measure of central tendency which can be applied to nominal data.
Limitations: It is not possible for the data to express its true complexity, so it can appear overly simplistic. There is no measure of dispersion which can be applied to nominal data.

ORDINAL
Strengths: Ordinal data provides more detail than nominal data as the scores are ordered in a linear fashion, e.g. from highest to lowest.
Limitations: The intervals between scores are not of equal value. This means that an average (the mean) cannot be used as a measure of central tendency. The median is most often used to overcome this limitation.

INTERVAL
Strengths: Interval level data is considered more informative than the nominal and ordinal levels of measurement. The gaps between the scores are of equal value/distance and are therefore more reliable.
Limitations: In some instances, the intervals are arbitrary. For example, 100 degrees centigrade is not twice as warm as 50 degrees centigrade. We can only say that the difference between 10 and 20 degrees is the same as the difference between 30 and 40 degrees.
Possible Exam Questions
1. Identify the key term which is used to describe categorical data from the list below. (1 mark)
a) Nominal
b) Ordinal
c) Interval
d) Ratio
2. Define nominal data. (1 mark)
3. Suggest an example of ordinal data. (2 marks)
4. Explain one limitation of nominal level data. (2 marks)
5. Explain why interval level data is often considered the most reliable. (2 marks)
6. Name three levels of measurement. (3 marks)
Exam Hint: Since the command word for this question is ‘name’, no elaboration is required and simply
naming nominal, ordinal and interval will secure all three marks available.
7. Explain what is meant by levels of measurement in psychological research. (3 marks)
PROBABILITY AND SIGNIFICANCE [A‐LEVEL ONLY]
Specification: Probability and significance: use of statistical tables and critical values in
interpretation of significance; Type I and Type II errors.
WHAT YOU NEED TO KNOW
Probability and significance:
o Use of statistical tables and critical values
o Type I and Type II errors
Probability and Significance
Before analysing data, a clear hypothesis must be outlined, which can be either directional or non‐directional. It is important to recognise which is which: without knowing, the wrong (one‐tailed or two‐tailed) statistical test might be applied to the data, which will misrepresent the findings.
A directional hypothesis states which direction the findings are expected to take. For example, in a test of
difference, we might expect that one group performs better than the other; in an association, we might
specify the type of relationship that we expect to see, for example a positive relationship (correlation). A
directional hypothesis is selected by a researcher when previous research in that field of psychology
suggests findings will go in that particular direction.
A non‐directional hypothesis is selected by a researcher when there is little, or conflicting, evidence in that
field of psychology and a clear outcome for the research is not certain. For example, a non‐directional
hypothesis investigating differences may state that a difference between conditions is expected, but not
give further specific details regarding that difference. Equally with an association, a non‐directional
hypothesis would predict that a relationship would be found, without stating the direction.
Use of Statistical Tables and Critical Values
After conducting a statistical test (e.g. the sign test), a number will be generated which is called the
calculated value. It is this number that will help determine whether results are significant, which will in
turn help decide whether to reject the null hypothesis and accept the experimental/alternative hypothesis.
To do this, the calculated value needs to be compared with the critical value in the statistical tables.
The critical value varies with the statistical test used, as each test has its own specific table of critical values. Several factors need to be considered to know which critical value is needed.

Firstly, it needs to be decided whether the investigation used a one‐tailed (directional hypothesis) or a two‐tailed (non‐directional hypothesis) test. Secondly, the number of participants in the sample is taken into consideration. The final factor to ascertain is the level of significance, or the p‐value (probability).
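Putting these three factors together, the look-up and comparison can be sketched as a short illustrative example. The critical values below are taken from the Spearman's rho table in the appendices (one-tailed test, p = 0.05); the function and dictionary names are invented for the example:

```python
# Illustrative sketch only: how the decision factors (tails, sample
# size N, significance level) select a critical value, and how the
# calculated value is then compared against it.

# Critical values of Spearman's rho, one-tailed test, p = 0.05, by N
# (taken from the appendix table).
SPEARMAN_ONE_TAILED_005 = {
    5: 0.900, 6: 0.829, 7: 0.714, 8: 0.643, 9: 0.600, 10: 0.564,
}

def is_significant(calculated_rho, n):
    """For Spearman's rho, the calculated value must be equal to or
    higher than the critical value for the result to be significant."""
    critical = SPEARMAN_ONE_TAILED_005[n]
    return abs(calculated_rho) >= critical

print(is_significant(0.60, 10))  # 0.60 >= 0.564 -> True
print(is_significant(0.50, 10))  # 0.50 <  0.564 -> False
```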
STATISTICAL TESTS [A‐LEVEL ONLY]
Specification: When to use the following tests: Spearman’s rho, Pearson’s r, Wilcoxon,
Mann‐Whitney, related t‐test, unrelated t‐test and Chi‐squared test.
WHAT YOU NEED TO KNOW
Outline when to use the following statistical tests, with reference to: difference or association,
experimental design and level of measurement:
o Spearman’s rho
o Pearson’s r
o Wilcoxon
o Mann‐Whitney
o Related t‐test
o Unrelated t‐test
o Chi‐squared test
Statistical Tests
Once a researcher knows whether they are looking for a difference or a relationship/association, which research design was used and what type of data they are working with, it is relatively straightforward to identify which statistical test to use.
Below is a table outlining which statistical test to use, based on these decisions.
                   TEST OF DIFFERENCE                       TEST OF ASSOCIATION
                   RELATED DESIGN      UNRELATED DESIGN
NOMINAL DATA       Sign test           Chi‐squared          Chi‐squared
ORDINAL DATA       Wilcoxon            Mann‐Whitney         Spearman's rho
INTERVAL DATA      Related t‐test      Unrelated t‐test     Pearson's r
                   (parametric)        (parametric)         (parametric)
Mnemonic, row by row: Simon Cowell (Sign test, Chi‐squared) Wants More Singers (Wilcoxon, Mann‐Whitney, Spearman's rho) Receiving Unanimous Praise (Related t‐test, Unrelated t‐test, Pearson's r).
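The decision table can also be expressed as a simple look-up, sketched below (illustrative only; the function and dictionary names are invented for the example, while the test names are those in the table):

```python
# Illustrative sketch only: the test-selection table as a dictionary
# keyed by (difference or association, design, level of measurement).

TEST_TABLE = {
    ("difference", "related", "nominal"): "Sign test",
    ("difference", "unrelated", "nominal"): "Chi-squared",
    ("association", None, "nominal"): "Chi-squared",
    ("difference", "related", "ordinal"): "Wilcoxon",
    ("difference", "unrelated", "ordinal"): "Mann-Whitney",
    ("association", None, "ordinal"): "Spearman's rho",
    ("difference", "related", "interval"): "Related t-test",
    ("difference", "unrelated", "interval"): "Unrelated t-test",
    ("association", None, "interval"): "Pearson's r",
}

def choose_test(looking_for, design, level):
    # The experimental design only matters when testing for a difference.
    if looking_for == "association":
        design = None
    return TEST_TABLE[(looking_for, design, level)]

print(choose_test("difference", "related", "ordinal"))  # Wilcoxon
print(choose_test("association", None, "interval"))     # Pearson's r
```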
Parametric and Non‐Parametric Tests
Whilst the table above provides an overview of the test that should be used, it is also important for
researchers to consider whether the data they have is suitable for a parametric or a non‐parametric test.
In psychological research, it is preferable to be able to use a parametric test: these are much more
powerful than non‐parametric tests, but require the data to meet certain assumptions before use.
Firstly, data should be interval level, because parametric tests use the actual score, rather than ranked
data.
Secondly, the data should be drawn from an underlying normal distribution, so we would expect the
data itself to be normally distributed.
Thirdly, there should be homogeneity of variance – the variances in the two groups should not be
significantly different from one another. One way of testing for homogeneity of variance is to compare
the standard deviation scores for each condition. Because the participants are drawn from the same
population, it is expected that each condition would be similarly dispersed, particularly if the conditions
are related, thus giving homogeneity of variance.
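As a rough illustration of comparing standard deviations to judge homogeneity of variance, consider the sketch below. The scores are invented for the example, and the two-to-one rule of thumb is a common informal check rather than part of the companion:

```python
# Rough illustration of the homogeneity-of-variance check described
# above: compare the standard deviations of the two conditions.
from statistics import stdev

condition_a = [12, 14, 15, 16, 18, 19, 21]  # invented scores
condition_b = [11, 13, 15, 17, 18, 20, 22]  # invented scores

sd_a = stdev(condition_a)
sd_b = stdev(condition_b)

# A common rough rule of thumb: treat the variances as similar when
# the larger SD is no more than twice the smaller SD.
similar = max(sd_a, sd_b) <= 2 * min(sd_a, sd_b)
print(round(sd_a, 2), round(sd_b, 2), similar)
```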
APPENDICES
Spearman’s rho
Level of significance for one‐tailed test    0.05     0.025
Level of significance for two‐tailed test    0.1      0.05
N = 4 1.000
5 0.900 1.000
6 0.829 0.886
7 0.714 0.786
8 0.643 0.738
9 0.600 0.700
10 0.564 0.648
11 0.536 0.618
12 0.503 0.587
13 0.484 0.560
14 0.464 0.538
15 0.443 0.521
16 0.429 0.503
17 0.414 0.485
18 0.401 0.472
19 0.391 0.460
20 0.380 0.447
25 0.337 0.398
30 0.306 0.362
For data to be statistically significant, the calculated value must be equal to or higher than the critical
value.
Pearson’s r
Level of significance for one‐tailed test    0.05     0.025
Level of significance for two‐tailed test    0.1      0.05
df = 2 0.900 0.950
(degrees of freedom: df = N − 2, where N is the number of pairs of scores)
3 0.805 0.878
4 0.729 0.811
5 0.669 0.754
6 0.621 0.707
7 0.582 0.666
8 0.549 0.632
9 0.521 0.602
10 0.497 0.576
11 0.476 0.553
12 0.457 0.532
13 0.441 0.514
14 0.426 0.497
15 0.412 0.482
16 0.400 0.468
17 0.389 0.456
18 0.378 0.444
19 0.369 0.433
20 0.360 0.423
25 0.323 0.381
30 0.296 0.349
35 0.275 0.325
40 0.257 0.304
45 0.243 0.288
50 0.231 0.273
60 0.211 0.250
70 0.195 0.232
80 0.183 0.217
90 0.173 0.205
100 0.164 0.195
For data to be statistically significant, the calculated value must be equal to or greater than the critical
value to show significance.
Related and Unrelated T‐Test
Level of significance for one‐tailed test    0.05     0.025
Level of significance for two‐tailed test    0.1      0.05
df = 1 6.314 12.706
2 2.920 4.303
3 2.353 3.182
4 2.132 2.776
5 2.015 2.571
6 1.943 2.447
7 1.895 2.365
8 1.860 2.306
9 1.833 2.262
10 1.812 2.228
11 1.796 2.201
12 1.782 2.179
13 1.771 2.160
14 1.761 2.145
15 1.753 2.131
16 1.746 2.120
17 1.740 2.110
18 1.734 2.101
19 1.729 2.093
20 1.725 2.086
21 1.721 2.080
22 1.717 2.074
23 1.714 2.069
24 1.711 2.064
For data to be statistically significant, the calculated value must be equal to or greater than the
critical value.
Chi‐squared test
Level of significance for one‐tailed test 0.05 0.025
Level of significance for two‐tailed test 0.1 0.05
df = 1 2.71 3.84
2 4.60 5.99
3 6.25 7.82
4 7.78 9.49
For data to be statistically significant, the calculated value must be equal to or greater than the critical
value to show significance.