

Lesson 1


Data (singular: datum) are whatever we collect in a study. Data can be numerical or non-numerical.
A datum may be referred to as a data point in a table or graph. It may also be referred to as a score.
A data set refers to all the data collected in a study.

Types of Data

Data are first categorized as numerical or non-numerical.
In research, numerical data are called QUANTitative, and non-numerical data are called QUALitative.

Quantitative data can be interval or ratio.
Both have equal intervals between values; the difference is that ratio data also have an absolute 0, so, for example, 4 is twice as much as 2.

Qualitative data can be nominal or ordinal.
For our purposes, this distinction is not important, as the analyses for qualitative data are the same.


A variable is what we collect data on. It determines what type of data we have.
We call it a variable because it can take different values that can be measured.
e.g. IQ is a quantitative variable: we collect numerical data from it.
e.g. Hair colour is a qualitative variable: we collect non-numerical data from it.


The population is the complete group from which we collect data, and about which we make conclusions in a study.
A population can be anything, from all the people of a country, to a class of psychology students, to all the oak trees in the world.


However, populations are usually too large to collect data from, so we select a group from the population at random, called a sample.
If the sample is randomly selected from the population, we can assume that it represents the population, and we can make conclusions about the population from the data we collect from the sample.
Population -> random sample -> collect data -> conclusion about the population
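
The idea of selecting a sample at random can be sketched in a few lines of Python. This is only an illustration: the population of 1,000 numbered individuals, the sample size of 50, and the helper name `draw_random_sample` are all assumptions made for the example, not part of the lesson.

```python
import random

def draw_random_sample(population, sample_size, seed=None):
    """Pick sample_size individuals at random, without repeats.

    The seed is optional; fixing it just makes the example repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

# Hypothetical population: 1000 individuals, identified by number.
population = list(range(1, 1001))

# Every individual has the same chance of being selected.
sample = draw_random_sample(population, 50, seed=1)
print(len(sample), "individuals selected from", len(population))
```

Because no individual can be picked twice, the sample is always a smaller group drawn from the population, which matches the definition above.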


The individual is a single member of the population.
Samples consist of a group of individuals selected from the population.
Populations consist of ALL individuals we want to study.

Descriptive Statistics

Before we can do inferential statistics to make conclusions, we need to know things about the population first: we need to describe it.
Descriptive statistics are calculated from the data set, or raw data (the data before ANY calculations or manipulations are done to them).

Central Tendency

The first type of descriptive statistic we calculate is a measure of central tendency.
Central tendency essentially means the center point of a data set, or the value that the data tend to be close to.
Measures of central tendency include the mean, the median, and the mode, and each serves a different purpose.

The Mean

The mean is the standard measure of central tendency, and the most useful for doing inferential statistics.
The mean is simply the average: all the scores added together, then divided by the total number of scores.
Sometimes the mean does not adequately measure the central tendency, such as when one or more scores are extreme compared to the rest.
e.g. 3, 5, 6, 30: 30 is extreme (a single score significantly changes the mean).
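
To see how much the extreme score matters, here is a small Python sketch of the mean using the scores from the example above. The function name `mean` is just a label chosen for this illustration.

```python
def mean(scores):
    """Add all the scores together, then divide by the number of scores."""
    return sum(scores) / len(scores)

scores = [3, 5, 6, 30]

mean_with_extreme = mean(scores)          # (3 + 5 + 6 + 30) / 4 = 11.0
mean_without_extreme = mean(scores[:-1])  # (3 + 5 + 6) / 3, about 4.67

print(mean_with_extreme, round(mean_without_extreme, 2))
```

With the 30 included, the mean is 11, which is larger than three of the four scores. Without it, the mean is about 4.67.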

The Median

In cases of extreme scores (called outliers), we need a different measure of central tendency that is not affected by them.
That's the median: the center of a data set when the scores are listed from smallest to greatest, or vice versa.
Smallest to largest is the standard and expected way to find the median.
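
The following Python sketch finds the median by sorting the scores from smallest to largest and taking the center. One assumption goes beyond the text: when there is an even number of scores there is no single center score, so the sketch uses the common convention of averaging the two middle scores.

```python
def median(scores):
    """Return the center of the scores once sorted smallest to largest."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: one score sits exactly in the center.
        return ordered[middle]
    # Even number of scores (assumed convention): average the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2

median_with_outlier = median([3, 5, 6, 30])       # (5 + 6) / 2 = 5.5
median_bigger_outlier = median([3, 5, 6, 300])    # still 5.5
median_odd_count = median([6, 3, 5])              # sorted: 3, 5, 6 -> 5

print(median_with_outlier, median_bigger_outlier, median_odd_count)
```

Making the outlier ten times larger does not move the median at all, which is why it is preferred when extreme scores are present.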

The Mode

Mean and median only work for numerical data, like age or IQ scores.
Sometimes we collect data that are not numerical, like hair colour or gender.
We cannot find the average hair colour.
But we can find the most common hair colour.
The mode is simply whichever category appears the most.
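
As a quick illustration, Python's standard `statistics.mode` function returns the most common value in a list, and it works on non-numerical data. The hair colours below are made-up data for the example.

```python
import statistics

# Hypothetical hair colours recorded for seven individuals.
hair_colours = ["brown", "black", "brown", "blonde", "brown", "red", "black"]

# The mode is the category that appears most often.
most_common_colour = statistics.mode(hair_colours)
print(most_common_colour)  # brown appears 3 times, more than any other colour
```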


Frequency means something specific in statistics.
It is the number of times a score appears in a data set.
Recall, the mode is the score with the highest frequency.
Later we'll learn how to construct a frequency table.
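
Counting frequencies is easy to sketch in Python with `collections.Counter`, which tallies how many times each score appears. The scores here are invented for the example, and this is only a rough preview, not the frequency table procedure covered later.

```python
from collections import Counter

# Hypothetical data set of seven scores.
scores = [2, 3, 3, 5, 3, 2, 7]

# Map each score to the number of times it appears.
frequencies = Counter(scores)

# The mode is the score with the highest frequency.
mode_score, mode_frequency = frequencies.most_common(1)[0]

for score in sorted(frequencies):
    print(score, "appears", frequencies[score], "time(s)")
print("mode:", mode_score)
```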


Recall that measures of central tendency are based on variables: things that have different values (they vary between individuals).
This means there are differences between individuals, resulting in different scores for each.
We call these differences variance.
Data come from a variable, which varies, resulting in variance.
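
The lesson does not give a formula for variance at this point. As a hedged sketch, one common definition is the average of the squared differences between each score and the mean (often called the population variance). The code below assumes that definition, and the two data sets are made up to contrast no variation with a lot of variation.

```python
def variance(scores):
    """Average squared difference from the mean (an assumed definition)."""
    mean = sum(scores) / len(scores)
    squared_differences = [(score - mean) ** 2 for score in scores]
    return sum(squared_differences) / len(scores)

# Every individual has the same score: no differences, so no variance.
no_spread = variance([5, 5, 5, 5])

# Scores differ between individuals: mean is 11, squared differences
# are 64, 36, 25 and 361, which sum to 486, and 486 / 4 = 121.5.
wide_spread = variance([3, 5, 6, 30])

print(no_spread, wide_spread)
```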


A distribution is a graphical representation of a data set.
It gives us a picture of how the data are distributed in the sample or population.
It is created using, and gives us a picture of, the central tendency, frequency, and variance.
It is used in the final steps of an analysis.
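
A very rough way to picture a distribution is a text histogram: one row per score, with one star for each time the score appears. This Python sketch is only an illustration with invented scores; the helper name `text_histogram` is an assumption, and real analyses would normally use a plotting tool.

```python
from collections import Counter

def text_histogram(scores):
    """Build one line per distinct score, with a star for each occurrence."""
    frequencies = Counter(scores)
    lines = []
    for value in sorted(frequencies):
        lines.append(f"{value:>3} | {'*' * frequencies[value]}")
    return "\n".join(lines)

# Hypothetical data set: most scores cluster around 3.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]
histogram = text_histogram(scores)
print(histogram)
```

The tallest row shows where the scores cluster (central tendency), the length of each row is a frequency, and the range of rows gives a sense of how much the scores vary.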

Inferential Statistics

Using descriptive statistics (mean, variance, etc.), we can conduct an analysis.
An analysis (plural: analyses) is a mathematical procedure that takes numerical information (descriptive statistics) about a population, and determines a probability.
Based on the results of analyses, we make conclusions about the population, called inferences.


Probability is the most basic form of inferential statistics, and is often discussed within the context of descriptive statistics.
Probability is simply the likelihood of a specific outcome.
e.g. If you flip a coin, there are 2 possible outcomes: heads or tails. Therefore, the probability of getting heads is 50%, and 50% for tails.
Hopefully by now you have a rudimentary
understanding of the terms we use in statistics.
We will use the terms from this lesson quite a bit in
future lessons, so make sure you are clear on the
terms. If you need to, come back to this slide
show, or save it as a reference.
Referring back to these as you go will help you to
better memorize the terms by using them, rather
than just reading them over and over again.
