Massey University

175.102 Psychology as a Natural Science


Measurement, Descriptive Statistics, and


Scales of measurement, Statistics and Graphs

Objectives: Develop an understanding of:

Scales of measurement

Measures of central tendency and variability

Graphical presentation of statistics

Scales of Measurement

As science is based upon the systematic observation of objects or events, it must

involve some form of measurement. Of course, we are all familiar with measures of length,

weight, time and so on. We are also familiar with the ranking of pupils in class and the

categorising of people and events. What we may fail to appreciate is that these cases involve

different types of measurement. They differ in the scale of measurement involved, and in the

use to which the resulting data can be put.

We can say that 40 kg is twice as heavy as 20 kg, but we cannot assume that a student

placed 40th in class received only half as many marks as a student placed 20th. From this

example we can see that numbers based on different scales of measurement give us different

amounts of information and are suitable for different kinds of analysis.

The Four Basic Types of Numerical Scale for measured variables

1. Nominal Scales

Nominal = naming. These scales represent the lowest level of measurement (the most

basic). Nominal scales merely classify objects by assigning labels or numbers to them on the

basis of qualitative differences. Numbers on a nominal scale do not imply quantity. You

cannot add or subtract or do any other arithmetic operations with these scales. Consider, for

example, the numbers assigned to psychology courses. The course 175.102 is not "greater

than" a course with a lower number, nor are psychology courses (prefix 175) harder than

mathematics courses (prefix 160). Note that the number assigned to each category is arbitrary.

The categories within gender, relationship status, and names of pets are further examples of

nominal scales. An

example from our class survey was that you were asked what area of New Zealand you live

in. We often use nominal categories when describing data, for example, counting how many

people who took part in the survey were from Manawatu or Auckland.

2. Ordinal Scales

Here, the numbers not only distinguish between objects, but they also put the objects

into some sort of order or sequence. For example, the results of a race are usually reported on

an ordinal scale; 1st, 2nd, 3rd, etc. As with nominal scales, the numbers cannot be added or

subtracted. However, with an ordinal scale the size of the number is important. In other

words, what the number represents is important in terms of ordering (e.g., 1st indicates the

winner of an event). It’s important to remember that the rank doesn’t tell us anything about

the distance between each of the ranks. We know that the first person to finish a running race

arrived at the finish line before the second person to finish the race, but we don’t know how

long each runner took to finish the race and we don’t know how much faster the 1st place

runner was compared to the 2nd place runner.
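
The loss of distance information in a ranking can be illustrated with a short sketch. The runner names and finish times below are hypothetical, chosen only to show that equal steps in rank can hide very unequal gaps in time.

```python
# Hypothetical finish times in seconds (illustrative values, not real data).
finish_times_s = {"Runner A": 655, "Runner B": 610, "Runner C": 611}

# Ordinal information: who finished 1st, 2nd, 3rd.
ordered = sorted(finish_times_s, key=finish_times_s.get)
ranks = {name: position for position, name in enumerate(ordered, start=1)}

# Ratio information that the ranks discard: the actual gaps in seconds.
gap_1st_2nd = finish_times_s[ordered[1]] - finish_times_s[ordered[0]]
gap_2nd_3rd = finish_times_s[ordered[2]] - finish_times_s[ordered[1]]

print(ranks)         # {'Runner B': 1, 'Runner C': 2, 'Runner A': 3}
print(gap_1st_2nd)   # 1
print(gap_2nd_3rd)   # 44
```

The ranks are one step apart in both cases, yet the underlying gaps are 1 second and 44 seconds.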

3. Interval Scales

These scales not only have all of the properties of ordinal scales, but they also have

the characteristic that there are equal intervals between the units of measurement. For

example, the Celsius scale of temperature is an interval scale: The difference between 20°C

and 30°C is the same as between 10°C and 20°C. (The ordering in a race is not an interval

scale as the distance between the 1st and 2nd contestants is not necessarily the same as that

between the 2nd and 3rd as they cross the finishing line). However, there is no real zero in the

Celsius scale. The zero point is arbitrary - it does not indicate an absence of temperature.

With interval scales, then, we can add and subtract and take means, but not ratios - we cannot

say that a temperature of 30°C is twice as high as one of 15°C because we lack a true zero

point. I.Q. tests are assumed to involve interval scales, as are many other psychological tests.

Numbers on an interval scale can be added or subtracted, but cannot be multiplied or

divided (i.e. expressed in terms of ratios) because the zero value on the scale is arbitrary.
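
The point about ratios can be checked numerically. The sketch below is an illustration rather than part of the course material: it uses the standard Celsius-to-Kelvin conversion (adding 273.15) to show that differences survive a change of zero point while ratios do not. The function name is hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin, a temperature scale with a true zero."""
    return celsius + 273.15

# Differences are meaningful on an interval scale: both gaps are 10 degrees.
gap_low = 20 - 10
gap_high = 30 - 20

# The apparent ratio on the Celsius scale depends on its arbitrary zero.
celsius_ratio = 30 / 15

# The same two temperatures compared on the Kelvin (ratio) scale.
kelvin_ratio = celsius_to_kelvin(30) / celsius_to_kelvin(15)

print(gap_low == gap_high)      # True
print(celsius_ratio)            # 2.0
print(round(kelvin_ratio, 3))   # 1.052
```

Measured from a true zero, 30°C is only about 5% "more temperature" than 15°C, which is why the factor of two on the Celsius scale is not meaningful.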

4. Ratio Scales

This scale represents the highest level of measurement. It possesses all of the nominal,

ordinal, and interval properties, but has the additional requirement that the starting point for

the scale represents a meaningful zero point. On a ratio scale zero represents the true absence

of the property being measured so it is appropriate to talk in terms of ratios. Weight, height

and time are examples of ratio measurements. Thus, for instance, not only is the distance

from 2 to 3 metres the same as that between 3 and 4 metres, but 4 metres is twice as far as 2

metres.

Exercise 1

Identify the level of measurement for each of the survey items in this table.

Table 1. Example survey: identify the scale of measurement for each row and write it in
the column on the right.

Student’s Health Survey                                        Scale of measurement

1. Sex: Male / Female / Prefer not to say / Prefer to self-identify ____________ Nominal

2. Year of birth Interval*

3. Country of birth Nominal

4. My height in meters is ___________ m Ratio

5. My weight in kilograms is ___________ kg Ratio

6. How many cigarettes do you smoke per day? Ordinal

• None
• 0-5 per day
• 5-10 per day
• 10-20 per day
• over 20 per day

7. How much exercise do you do on an average day? Ordinal

• Minimal (no specific exercise or daily routine)
• Moderate (some moderate exercise most days e.g.
• Vigorous (vigorous activity most days e.g. jogging,

8. I have friends and relatives with whom I can discuss the Nominal
positive and negative events of the days: True / False

9. I have 5 portions of fruit and vegetables each day: Yes / No Nominal

10. I would rate my overall health as Ordinal

• Excellent
• Good
• Average
• Below Average
• Poor

11. My resting pulse rate (in beats per minute) is _________ bpm Ratio

12. My body temperature (in degrees centigrade) is _______ °C Interval**

* Year of birth is interval (the calendar’s year zero is an arbitrary starting point, not an absence of time)

**Celsius is Interval (zero degrees C is not zero temperature)



Knowing what scale of measurement our data is in tells us which statistical analyses

we can perform on our data. Many students regard statistics with fear and loathing as they

feel their mathematics ability is inadequate. It may be of some comfort to know that only a

fifth form level of mathematics is required for this course. That is, the basic arithmetical

operations (addition, subtraction, multiplication, division, and square roots), and simple


No calculations will be required in the final examination for this paper. Nevertheless,

you should attempt to come to grips with general principles. Statistics is an essential part of

psychology. Psychological scientists observe and measure, a process that generates

quantitative (numerical) data. Statistics is the applied branch of mathematics that is concerned

with the collection, organisation, summarisation, description and interpretation of that

numerical data.

The observations and measurements made by psychologists are many and varied.

Psychologists attempt to answer such questions as:

a) Does a higher wage increase happiness?

b) Can low intensity cognitive behaviour therapy (LICBT) provided using the internet

lower the risk of relapse for mood disorders?

c) Does breakfast help children to achieve better grades in school?

d) Does playing violent video games increase aggression in adolescents?


To definitively answer these questions would require collecting the entire set of all

measurements relating to them (and this might not tell you about what might happen in the

future): The levels of wages and happiness for all employees (everywhere!), training and then

measuring the effectiveness of internet based LICBT for every person (in the world) with a

mood disorder, and the amount of breakfast consumed and grades for all children, and the

effects of all videogames that include violent content on all adolescents.

The set of all measurements of interest is called a population (note that in statistics a

population is a set of measurements, not a collection of people). Clearly it is not practical, or

even possible, to collect all the measurements of interest. What a psychologist must do is

collect a subset (a smaller number) of measurements from the population of interest. This

subset is called a sample.

Psychologists spend a great deal of time working with samples. An opinion poll is a

sample of people’s attitudes; an IQ score is a sample of an individual’s intelligence; people

who volunteer for experiments are samples intended to represent all people. It should be

noted that sampling is a complex issue and requires careful thought and planning, since it has

important implications for the generalisation of any results. Generalisation is when we try to

make claims about a population based on our sample. For example, we might collect data

about the effectiveness of an intervention for 200 people who have a mood disorder and then

claim that the intervention would have the same effect for all people with

the same mood disorder. For a sample to be useful it must represent the population it was

drawn from. That is, we want to be able to generalise from our sample back to the population

that we’re interested in. In order for samples to be representative of a specific population they

must be large and randomly selected from that population.
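
Random selection and sample size can be explored with a small simulation. This is a sketch under stated assumptions: the "population" is 10,000 made-up blood pressure readings generated with Python's standard `random` module (centre 120, spread 15, both arbitrary choices), and the seed is fixed only so the run is repeatable.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is repeatable

# Hypothetical population of 10,000 measurements (simulated, not real data).
population = [random.gauss(120, 15) for _ in range(10_000)]

# Two random samples of different sizes drawn from that population.
small_sample = random.sample(population, 5)
large_sample = random.sample(population, 500)

print("population mean:  ", round(statistics.mean(population), 1))
print("small sample mean:", round(statistics.mean(small_sample), 1))
print("large sample mean:", round(statistics.mean(large_sample), 1))
```

Rerunning with different seeds shows that the mean of the larger random sample tends to sit closer to the population mean, while the small sample's mean moves around much more.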

Many criticisms of research in psychology are based on the sample not being

representative of the population. Whether this is a valid criticism of the research depends on

the details of that particular study. If we ran a survey investigating food preferences with a

sample collected in New Zealand then we might not want to generalise our findings to people

living in China, but it might be acceptable to generalise our findings to people living in New

Zealand. To summarise, if we want to generalise our findings from a sample to a population,

then we should be sure that our sample is representative of the population.


Figure 1. The obtaining of a sample of measurements from a population of interest.

Sampling examples

• To obtain a sample of blood pressure levels we may take blood pressure

readings of 40 people chosen at random.

• To evaluate whether an intervention for depression is effective, we might

recruit 100 people who suffer from depression using a depression

questionnaire to identify the severity of the symptoms.

• To establish whether breakfast helps children to learn at school in New

Zealand, we could recruit children from 200 families who attend 4 schools that

were chosen to represent the cultural background of New Zealand’s


• To find out whether video game violence influences aggressive behaviour in

adolescents, we could recruit young people aged 10-19 to participate in a lab

based experiment.

Psychologists must be able to describe their sample accurately and concisely. The

numbers used to do this are called descriptive statistics.


A sample is invariably made up of a range of measurements. To describe a sample, it

is therefore important to describe not only the central or most frequent values within the

measurements, but also the spread or variability of the measurements.

That is, descriptive statistics can be conveniently divided into two groups:

1) measures of central tendency

2) measures of variability

Measures of central tendency

Measures of central tendency describe the centre of the distribution of the sample

measurements. In statistical language, we define a distribution as all of the individual values

of the sample.

a) The arithmetic mean ("mean" or "average") is the sum of the sample measurements

divided by the number of measurements in the sample. The mean takes into account the

value of each score and so uses more of the information we have obtained than either the

mode or the median. It should be used to estimate the centre of the distribution of scores

for interval or ratio data. However, the mean is too influenced by "outliers" (scores that

are very much larger or smaller than most of the scores in the data set) to be a satisfactory

measure of central tendency when the distribution is markedly asymmetrical.

For example, if you measured the height of 10 randomly sampled people that you

met on the street, you would find a deceptively high mean height if the random

sample included Alex Pledger from the 2016 NZ Breakers because he is 2.15m

tall (the mean class height for this year’s 175.102 is approximately 1.67m).
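
The effect of a single outlier on the mean can be sketched as follows. The nine ordinary heights are hypothetical values picked so that their mean is 1.67 m; only the 2.15 m figure comes from the example above.

```python
import statistics

# Nine hypothetical heights in metres (illustrative values, not survey data).
heights_m = [1.60, 1.62, 1.65, 1.66, 1.67, 1.68, 1.70, 1.72, 1.73]
mean_without_outlier = statistics.mean(heights_m)

# Add one very tall person (2.15 m, as in the example above).
mean_with_outlier = statistics.mean(heights_m + [2.15])

print(round(mean_without_outlier, 3))   # 1.67
print(round(mean_with_outlier, 3))      # 1.718
```

One unusually tall person out of ten raises the mean by nearly 5 cm, even though nine of the ten heights are unchanged.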


b) The median is the score or value of the middle item and is obtained by arranging all

measurements in the sample in ascending order. For an odd number of measurements, the

median is the middle measurement. For an even number of measurements, the median is

the average of the two middle measurements (the sum of the two middle values divided

by two).

Although we know that half of the cases are above or below the median, we do not know

how far above or below the median they are. Unlike the mean, which takes account of

each score, the median is based on only one or two values. In its favour, the median is not

affected by extreme values, and so is appropriate when the distribution of scores is not

symmetrical. It is useful when the data are measured on an ordinal scale.

An example of when the median is a useful measure might be when you plan to sell

your home and wonder what the likely price you might sell it for. If you calculated the

mean price for your area, this might be skewed by the recent sale of a $40m mansion.

The median, in contrast, would not be much affected by the sale of a mansion. In an

average neighbourhood with 10 sales at $600k to $700k, the median price would not

be much affected whether the mansion sold for $40 million or $400 million: the

median price would still be the middle value from the distribution (in contrast, the

mean would be strongly affected).
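
The house price example can be worked through directly. The ten ordinary sale prices below are hypothetical values within the $600k to $700k range described above; the two mansion prices are the ones from the example.

```python
import statistics

# Ten hypothetical sales between $600k and $700k (illustrative values).
ordinary_sales = [600_000, 610_000, 620_000, 630_000, 640_000,
                  650_000, 660_000, 670_000, 680_000, 700_000]

sales_40m = ordinary_sales + [40_000_000]    # mansion sells for $40 million
sales_400m = ordinary_sales + [400_000_000]  # mansion sells for $400 million

median_40m = statistics.median(sales_40m)
median_400m = statistics.median(sales_400m)
mean_40m = statistics.mean(sales_40m)
mean_400m = statistics.mean(sales_400m)

print(median_40m, median_400m)            # 650000 650000
print(round(mean_40m), round(mean_400m))  # the means differ enormously
```

With eleven sales the median is the sixth value in sorted order, so the mansion's price does not matter as long as it stays the largest value; the mean, which uses every value, jumps from about $4.2 million to about $37 million.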

c) The mode is the scale value that occurs most often in a set of data. It is possible for

there to be more than one mode and if each value occurs only once there will be no

mode. Generally, the mode does not provide a great deal of information because it is

based on only one value and perhaps few scores. The mode is useful when what you are

interested in is predicting actual values, unlike the mean, which might not represent any

actual score in the set of data. The mode is especially useful for data measured on a

nominal scale. It would make no sense to find a mean or median Psychology paper, but

the mode, the most commonly chosen paper, is meaningful.

An example where the mode might be used is when we discuss the demographics of

our sample. We might report that the majority of our sample were European New

Zealanders because this was the cultural identity that was reported most often (this

can also help us to work out if our sample is representative of the population). Think

about how ridiculous it would be to try and use the arithmetic mean or the median to

describe our sample’s cultural identity.
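
Finding the mode is a matter of counting. The sketch below uses a hypothetical helper, `modes`, and made-up survey responses; it also covers the cases mentioned above where there is more than one mode or no mode at all.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or an empty list if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs only once, so there is no mode
    return [value for value, count in counts.items() if count == highest]

# Hypothetical survey responses (not real class data).
identities = ["European New Zealander", "Māori", "European New Zealander",
              "Pacific", "Asian", "European New Zealander", "Māori"]

print(modes(identities))        # ['European New Zealander']
print(modes([1, 2, 2, 3, 3]))   # [2, 3]
print(modes([1, 2, 3]))         # []
```

Note that the function works on labels as easily as on numbers, which is why the mode suits nominal data.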

Measures of variability

Everything that psychologists measure can vary. Height, age, gender, intelligence,

feelings of happiness, hours watching television are all variables on which people differ. It is

important, therefore, to describe the amount of variability in sample measurements.

a) The range is the difference between the largest and smallest measurement in the sample.

This statistic is quick to calculate but, because it is based on only the two extreme

measurements, it is often misleading. It can still be useful, for example to lecturers when

evaluating the assessments in a paper (lecturers get worried if the minimum value for an

assessment is very low).

b) The standard deviation (SD) is a measure of how spread out the individual scores are

around the mean. It is based on deviation scores that tell you how large a difference there

is between a particular score and the mean, and whether that value is above or below the

mean. The mean acts as a balance point for a set of scores, and so the sum of the

deviations above and below it will always cancel out and equal zero. Therefore, if we

want to find a simple number to represent the way the scores are spread around the mean,


we cannot use the average deviation (because this will be zero divided by the number of

scores). The best solution for statistical procedures is to use the average squared

deviation, since the square of a number is never negative. The average of the squared

deviations is known as the variance. It is a simple matter to convert the variance into a

measure of distance, expressed in the same measurement units as the original data, by

taking its square root. The square root of the variance is the standard deviation. The larger

the SD the greater the variability in the sample. The advantage of the SD over the range is

that, like the mean, it is based on all the sample measurements, and not just the two

extreme measurements. Like the mean, however, it is affected by extreme scores.
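
The steps described above (deviation scores, squared deviations, variance, square root) can be followed in a few lines. The scores are hypothetical, and the sketch divides by the number of scores, matching the "average squared deviation" wording used here; be aware that many statistics packages divide by one less than the number of scores when estimating from a sample.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

value_range = max(scores) - min(scores)          # uses only the two extremes
mean = sum(scores) / len(scores)

# Deviation scores: how far each score sits above (+) or below (-) the mean.
deviations = [score - mean for score in scores]
sum_of_deviations = sum(deviations)              # always cancels to zero

# Variance: the average of the squared deviations.
variance = sum(d ** 2 for d in deviations) / len(scores)

# Standard deviation: back in the original measurement units.
sd = math.sqrt(variance)

print(value_range)         # 7
print(mean)                # 5.0
print(sum_of_deviations)   # 0.0
print(variance)            # 4.0
print(sd)                  # 2.0
```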

For roughly normally distributed data, the majority of measurements (about 68%) will fall

within one standard deviation above and below our mean value. In clinical psychology, the

severity of a psychological disorder is often expressed in standard deviation units.

Similarly, when your lecturer gives you feedback

about an assessment, they might tell you the mean class score and the standard deviation so

that you can work out exactly how well you did compared to other students.
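
Expressing a score in standard deviation units is a one-line calculation. In this sketch the function name `z_score` and the class figures (mean 65, SD 10, a mark of 80) are hypothetical values used only for illustration.

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / sd

# Hypothetical assessment feedback: class mean 65, class SD 10.
print(z_score(80, 65, 10))   # 1.5  (one and a half SDs above the mean)
print(z_score(65, 65, 10))   # 0.0  (exactly at the mean)
print(z_score(55, 65, 10))   # -1.0 (one SD below the mean)
```

A positive value means the score is above the mean and a negative value means it is below.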

In essence, we can summarise a body of data —all the exciting outcomes of our

experiment —in terms of two important classes of statistic, measures of central tendency and

measures of spread. Thus, armed with two numbers, we know the value around which the

scores cluster, and we also know how typical that value is, because we have a statistic that

tells us how spread out the group of scores is. Already therefore we know a great deal about

what went on in our experiment.

Measures of central tendency and variability are the most commonly used descriptive

statistics for looking at samples of data but there are others you will need to know. For

example, a descriptive statistic used to describe the relationship between two variables is the

correlation coefficient.

Graphical Presentation of Descriptive Statistics

A graph is a way of displaying data in pictorial form. Two lines are drawn at right-

angles to each other, one vertical and one horizontal. Each line is referred to as an axis.

1. Bar Graphs and Histograms

A bar graph is generally used when the scale values represent discrete intervals,

while a histogram is used for continuous scale values.

Continuous scale values are those for which, at least in theory, an infinite

number of values may be assigned, therefore any value is actually only an

approximation. For example, if you measure your height at 180 cm, that measurement

is an approximation, because you could have continued to measure your height in

millimetres, hundredths of a mm, thousandths of a mm, and so on, ad infinitum. Both

types of graph can be used for ordinal, interval or ratio data, but a bar graph, unlike a

histogram, is also suitable for data measured on a nominal scale, as in Figure 2.

The principal difference in the construction of bar graphs and histograms is that in a bar

graph the columns (or bars) are separated by spaces along the X-axis.

No space is left between the columns in a histogram because the scale values

represent a continuous distribution, as in Figure 4.
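
The difference between the two graph types shows up in how the data are counted before plotting. The sketch below uses made-up regions and heights: categories are counted one by one for a bar graph, while continuous values are grouped into adjacent intervals (bins) for a histogram. The 10 cm bin width is an arbitrary choice.

```python
from collections import Counter

# Nominal data: a bar graph needs one count per separate category.
regions = ["Manawatu", "Auckland", "Auckland", "Wellington", "Manawatu", "Auckland"]
bar_counts = Counter(regions)

# Continuous data: a histogram needs one count per adjacent interval (bin).
heights_cm = [152.5, 158.0, 161.2, 164.9, 165.0, 168.3, 171.7, 174.4, 179.9, 183.1]
bin_width = 10
hist_counts = Counter(int(h // bin_width) * bin_width for h in heights_cm)

print(dict(bar_counts))
for start in sorted(hist_counts):
    # Each bin runs from its start up to, but not including, start + bin_width.
    print(f"{start}-{start + bin_width} cm: {'#' * hist_counts[start]}")
```

Because each bin ends exactly where the next begins, the columns of a histogram touch; the categories of a bar graph have no such shared boundary.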


Bar Graph Examples

Both examples generated using Microsoft Excel


[Bar graph: NZ Population (2013), with one bar each for Male and Female]

Figure 2. The sex of the New Zealand population from the 2013 census. Data retrieved


[Bar graph: Number of People by Age Group]

Figure 3. New Zealand population split into age groups based on the 2013 census. Data

retrieved from


Histogram Examples

Charts generated using the statistical package R and ggplot2.

Figure 4. Histogram showing the Height of 30 students in the 2016 offering of 175.102.

Frequency Distribution Examples

It is often useful to display the central tendency and variability of a sample using a

frequency distribution. Figure 5 depicts the test score distributions of two classes of school

children. (The distributions are displayed as smoothed polygons, although the data are

actually discrete.)


Figure 5. Distribution of student test results for two classes with different teachers.

The mean is represented by the upright line at 64, and the variability in the scores is

reflected in the spread of the curve. While the average test score is the same for both classes

the variability (which includes the standard deviation) is clearly much greater for teacher B.

We might hypothesise that teacher characteristics were the cause of the different amounts of

variability in each class. Perhaps teacher B is able to get the best out of bright students but

has a teaching style which is incompatible with less able students? If so we might assign

such a teacher to running gifted classes for bright children or train the teacher how to engage

with students of all abilities. Note how we have begun to use these sample statistics to make

an inference about how the teacher might perform at other times, or with other children, that

is, beyond the sample studied. These issues will be elaborated on in the following section on

making inferences from the sample to the population.

In many investigations researchers compare sets of data to see if there is any

difference between them. For example, they may wish to investigate a gender difference by

looking at the results for males and females separately. It is convenient to conceptualise such

comparisons in terms of frequency distributions (see Figure 6).


Figure 6. Comparison of the performance distribution of males and females on a

hypothetical task.

There are several points to note in analysing such distributions. Measures of central

tendency and measures of variability can both provide useful information. In the present

example, the performance of females is, on average, better than for males (the means).

However, there is a large overlap between the two distributions. That is, just because the

females are better on average doesn’t mean that all females are better than all males. Indeed,

based on these two distributions, many of the males were able to perform better than many of

the females. This provides a warning about placing too much emphasis on means alone,

which can lead to a stereotyping of group characteristics.

Along with central tendency measures, it is also important to consider the variability

of the distributions. In the present example the two distributions seem to have a similar

amount of variability. If the distributions were different it would be worth asking why. For

example, imagine that the female distribution was more spread (greater variability)—how

might we account for this? Perhaps the range of female ability was greater than the range of

male ability prior to the start of the course and the difference in variability had nothing to do

with our test. A real world example of this is that when we compare intelligence between

groups then we must ensure that the groups are compared to their own demographics


distribution for that test (this is why we might choose to express an individual’s score as a

standard deviation compared to their group’s mean for that test).

You should also be aware that many of the variables studied in psychology form a so-

called normal distribution. A normal distribution is a bell-shaped continuous distribution

that is symmetrical about the mean. The distributions in Figures 5 and 6 above are normal

curves. This normal (also called Gaussian) distribution has some wonderful properties that

make it the cornerstone of statistical inference. Many measurements in psychology and

elsewhere follow the normal distribution, for example, height, weight, intelligence (measured

using an intelligence test), average reaction time, memory capacity for lists of items, and so on.


Making Inferences from the Sample to the Population

Ultimately our sample statistics are used as a basis to make inferences about the

population the sample was taken from. The confidence we can have in our statistics being a

reasonable estimate of what exists in the population depends to a large extent on how large

and representative the sample was.

For example, we hope that our sample of blood pressure readings from 40 people does

represent the population of blood pressure levels, but there is room for doubt. Are the 40

people representative of all people? Does the measurement represent blood pressure levels

at various times of the day?

Finally, when we are comparing two groups (e.g., male vs. female, young vs. old), we

should consider whether a difference in the means of our samples actually reflects a real

difference in the population from which the samples were taken. Perhaps, after all, the sample

difference is simply a chance effect. When we want to generalise results from our samples

back to the population, we should be reluctant to conclude that there is a difference in the

population groups even though there is some difference between the samples where:

1) the difference in the sample means is small

2) there is a lot of variability in the sample measurements (with a lot of overlap between

the two sample distributions)

3) there is only a small number of sample measurements

In these cases, we use formal inferential statistics to help us draw conclusions about

populations from sample data with a measurable degree of confidence. Statistical inference

allows us to draw a rational conclusion about the actual state of things on the basis of

incomplete information, but in accordance with statistical probabilities.


Why might we be cautious in drawing inferences about the population from our sample data?


Exercise 2

Measure your height in centimetres or get someone to help you to do this.

My height is _________ cm

There is a table and a figure over on the next page that you can use to answer the

following questions.

1) Identify where your own height is located on this histogram of the height of

individuals who participated in the pre-course survey for 175.102 in 2019. Mark the

location with a small arrow.

2) Mark the location of the mean and median on the histogram using the table below.

Are you above, below, or the same as these values?

ANSWER: you would have placed the location at 168cm. Your answer to the second

part of the question would vary depending on whether you are taller or shorter than

the average of the class

3) You might notice that the median and mean are the same in this sample. Why do you

think that this is the case? Hint: look at the shape of the distribution, is it

symmetrically or asymmetrically spread? Are there fewer outliers at the upper end of

the range of heights compared to the lower end of heights?

ANSWER: The distribution of height is roughly symmetrical. As many people are

below the average height as are above the average height. Mean values are biased by

outliers, so if the NZ Breakers basketball team were included in our sample then the

mean would be higher than the median (the median isn’t affected by outliers). There

are as many outliers at the left side (short people) of the distribution as there are at the

right (tall people) of the distribution.

4) Mark the one standard deviation around the mean value for the class. To mark the

lower estimate, subtract the class standard deviation from the class mean; to mark the

upper estimate, add the class standard deviation to the class mean value. Are you

more than one standard deviation away from the mean? How might your distance

from the mean change if the histogram was drawn only for males or females?

ANSWER: You would have marked the upper standard deviation at 177.92 cm and

the lower standard deviation at 158.08 cm. If you are male and the distribution that you

were compared to was all female then you would be more likely to be above the

mean. In contrast, if you are female and the distribution that you were compared to

was all male then you would be more likely to be below the mean. If you were

compared to your own biological sex group then you would be likely to be closer to

the mean value than if compared to a mixed group (the distribution would be more

representative of you as an individual).

5) The average height of New Zealanders is as follows: male 177 cm, female 164 cm,

mean height 170.5 cm. Is the class average height similar to this value?

ANSWER: the class average height is reasonably close to the population average

height. It is slightly lower than the population average height because there are more

female students than male students in the class, but the NZ population height is

calculated from close to 50% males and 50% females.


Figure 7. A histogram showing height of students who participated in the pre-course

survey for 175.102 in 2019.

Table 2. Descriptive statistics for heights of students who participated in the 175.102
class survey.

Descriptive Statistic Value

Class mean 168 cm

Class median 168 cm

Class minimum 130 cm

Class maximum 210 cm

Class standard deviation 9.92 cm

Number of participants 165 (4 missing values)
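
The arithmetic behind the answers to questions 4 and 5 can be checked with the values in Table 2. This is a small sketch; the helper name `within_one_sd` and the example heights of 170 cm and 180 cm are hypothetical.

```python
# Values copied from Table 2 (class survey heights, in cm).
class_mean = 168
class_sd = 9.92

# One standard deviation either side of the class mean (question 4).
lower_bound = round(class_mean - class_sd, 2)
upper_bound = round(class_mean + class_sd, 2)

def within_one_sd(height_cm):
    """True if a height lies within one SD of the class mean."""
    return lower_bound <= height_cm <= upper_bound

# Unweighted average of the male and female figures quoted in question 5.
nz_mean = (177 + 164) / 2

print(lower_bound, upper_bound)   # 158.08 177.92
print(nz_mean)                    # 170.5
print(within_one_sd(170))         # True
print(within_one_sd(180))         # False
```

Replace the example heights with your own measurement from the start of the exercise to see where you fall.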



