Chapter 1: Introduction: 1. Descriptive and Inferential Statistics

Psychological Statistics

Chapter 1: Introduction
Statistics helps you describe and summarize numbers, and can even help you draw
inferences about a larger population when you can measure only a small sample of
individuals.

1. Descriptive and Inferential Statistics

Descriptive statistics are used to describe the group of individuals that you have in front of
you. Inferential statistics allow you to make inferences about the larger population that
interests you.

For example, suppose you want to know whether women really do cry more than men while
watching chick flicks. You can’t measure the entire population of men and women (it would
take forever!), but you can test a group of 100 randomly sampled men and women and measure
the minutes they spend crying at the same sad movie. Then, by using statistics, you can draw
an inference about the population (i.e., make generalizations about the entire population of
men and women based on your sample).

2. Populations and Samples

Population: The entire group that you are interested in. This could be very general—e.g., all
people in the world—or specific—e.g., just blue-eyed guys in their 30s from Texas.

Sample: A selection of people (cases, observations, etc.) from a larger population, e.g.,
1,000 of the roughly 6.8 billion people in the world, or just 50 of those approximately
300,000 blue-eyed Texan gentlemen.

Population                                        Sample
The entire Introductory Psychology class          A subset of students in the Introductory Psychology course
All of the classes in the Psychology Department   A subset of psychology classes
All departments within the university             A subset of departments within the university

One number that summarizes a bunch of numbers in a sample is called a statistic. The same
kind of summary applied to a population is called a parameter.
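
To make the statistic/parameter distinction concrete, here is a minimal Python sketch. The population values, the sample size of 4, and the random seed are all invented for illustration; the only point is that the same summary (here, the mean) is a parameter when computed on everyone and a statistic when computed on a sample.

```python
import random
from statistics import mean

# Hypothetical population: exam scores for every student in a small class.
population = [62, 70, 75, 78, 81, 84, 88, 90, 93, 99]

# The mean of the whole population is a parameter.
parameter = mean(population)

# Draw a random sample of 4 students; the seed only makes the draw repeatable.
rng = random.Random(0)
sample = rng.sample(population, 4)

# The same summary applied to the sample is a statistic.
statistic = mean(sample)

print("parameter (population mean):", parameter)
print("statistic (sample mean):", statistic)
```

The statistic will usually land near the parameter without matching it exactly, which is why inferential statistics is needed at all.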

Adapted by: Prof. Christian Ranche

© Welkowitz, Cohen, & Brooke Lea

3. Measurement Scales

The four measurement scales can be represented by NOIR, that is, N for Nominal, O for
Ordinal, I for Interval, and R for Ratio.

Here is your first exercise:


1. Using the first letter of each type of scale, label each of the variables below, in the space in
front of it, with the type of scale that would most likely be used to measure that variable.

___Hair Color of Hipsters                  ___Cost of Overpriced Pair of Jeans
___Temperature (Kelvin)                    ___Shoe Size
___Temperature (Fahrenheit)                ___Weight of Sumo Wrestlers
___Temperature (Sunny to Rainy)            ___Karats of Bling on Rapper du Jour
___Gender                                  ___Attractiveness Ratings (Hot to Not)
___Likert Scale (1 to 7)                   ___Attractiveness Ratings (1 to 10)
___Minutes Spent on Facebook per Week      ___# of Text Messages Sent per Day
___# of Siblings                           ___Movie Genre
___Car Colors                              ___Height of America’s Next Top Model Contestants
___Likert Scale (Disagree Strongly to Agree Strongly)

4. Independent and Dependent Variables

2. For each of the proposed experiments below, identify the Independent Variables (IV), as
well as the Dependent Variable (DV):

a. Women are assigned to conditions in which they are presented with the photo of a male,
and each time a different cologne scent is sprayed. After each presentation, they are asked to
rate the attractiveness of the male.

b. Men are asked to perform karaoke of “Oops! . . . I Did It Again” by Britney Spears in
front of a live audience and receive either positive (screaming and whistling) or negative
(booing) reinforcement. Afterward, the singer is asked to rate his success in his performance,
and how much he would want to do it again.

c. New puppies are being trained to sit. They are randomly assigned to trainers (nice or
strict) and to differing reward treats (bacon or peanut butter). The number of minutes it took
for the puppy to learn to sit on command was recorded.

d. Researchers are interested to see how well they can predict success in the Miss USA
Pageant based on minutes spent in the tanning booth.


e. Minutes spent watching The Hills are recorded for college freshmen to investigate if a
correlation exists with their GPA at the end of the year.

5. Summation Notation and Rules

3. For the following summation exercises, the X values are 3, 2, 1, 4.

a. What is the value of ΣX?
b. What is the value of ΣX²?
c. What is the value of (ΣX)²?
d. What is the value of Σ(X + 1)²?

4. For the following summation exercises, the X values are 5, 1, 3, 2, and the Y values are 1,
4, 2, 6, in the same order.

a. What is the value of Σ(X + Y)?


b. What is the value of ΣXY?
c. What is the value of (ΣX)(ΣY)?
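
If you want to check your summation work, the following Python sketch shows how each kind of expression is evaluated. The X and Y values here are made up (they are deliberately not the ones in the exercises), and the variable names are just labels chosen for this illustration. The main thing to notice is that the order of operations matters: squaring before summing is not the same as summing before squaring.

```python
# Hypothetical values, different from the exercise data.
X = [2, 5, 1]
Y = [3, 1, 4]

sum_x = sum(X)                                  # ΣX: add up the scores
sum_x_sq = sum(x ** 2 for x in X)               # ΣX²: square each score, then add
sq_sum_x = sum(X) ** 2                          # (ΣX)²: add the scores, then square
sum_x_plus_y = sum(x + y for x, y in zip(X, Y)) # Σ(X + Y): same as ΣX + ΣY
sum_xy = sum(x * y for x, y in zip(X, Y))       # ΣXY: multiply pairs, then add
prod_sums = sum(X) * sum(Y)                     # (ΣX)(ΣY): add each list, then multiply

print(sum_x, sum_x_sq, sq_sum_x)
print(sum_x_plus_y, sum_xy, prod_sums)
```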


Chapter 2: Frequency Distributions and Graphs


If we wanted to make a chart to keep track of how many people got As, Bs, Cs, Ds, and Fs in
a statistics course, we could create a Frequency Distribution table. There are several types
of frequency distributions we will describe, because they relay frequency information in
different ways.

1. REGULAR FREQUENCY DISTRIBUTIONS

A list of every possible grade next to the number of people who received that grade.

Grade Frequency
A 4
A– 11
B+ 7
B 10
B– 4
C+ 3
C 4
C– 2
D+ 1
D 0
D– 0
F 2

2. CUMULATIVE FREQUENCY DISTRIBUTIONS

A list of every possible grade and the number of people who received that grade or lower.
As a note, you should always add the frequencies starting from the lowest score in the
distribution—in this case, from the bottom of the table, working up. (Note that the highest
cumulative frequency should equal N.)

Grade   Cum Freq
A       48
A–      44
B+      33
B       26
B–      16
C+      12
C       9
C–      5
D+      3
D       2
D–      2
F       2
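
As a rough sketch of the "start at the bottom and work up" rule, the Python below rebuilds the cumulative column from the regular frequencies in the first table. The list names are just labels for this illustration, and plain hyphens stand in for the minus signs in the grade names.

```python
from itertools import accumulate

# Frequencies from the regular frequency table, listed from the LOWEST grade up.
grades = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]
freqs = [2, 0, 0, 1, 2, 4, 3, 4, 10, 7, 11, 4]

# Running total: each entry counts everyone at that grade or lower.
cum_freqs = list(accumulate(freqs))

# Print with the highest grade on top, like the table in the text.
for grade, cum in reversed(list(zip(grades, cum_freqs))):
    print(f"{grade:<3} {cum}")
```

The last running total (the entry for A) equals N, the total number of students, which is a quick check that nothing was skipped.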

3. GROUPED FREQUENCY DISTRIBUTIONS

When there are many different values of X to report (e.g., more than 20), it can be helpful to
group them together into classes. For instance, in the following table, you’ll notice that all of
the Bs (i.e., B, B+, and B–) are grouped into one B category. The preceding table did not
have so many scores that it cried out to be grouped, but it gave us a convenient way to
illustrate grouping. Often, the number of scores/groups to use in a frequency distribution is
a judgment call based on a trade-off between the need for detail and a need to see the big
picture.

Grade   Frequency
A       15
B       21
C       9
D       1
F       2
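
One way to picture the grouping step is shown in this short Python sketch. It collapses the regular frequency table by the first letter of each grade; the dictionary and variable names are our own, and plain hyphens stand in for the minus signs.

```python
from collections import Counter

# Regular frequency table from earlier in the chapter.
freq = {
    "A": 4, "A-": 11,
    "B+": 7, "B": 10, "B-": 4,
    "C+": 3, "C": 4, "C-": 2,
    "D+": 1, "D": 0, "D-": 0,
    "F": 2,
}

# Group by the letter only, so B+, B, and B- all land in the "B" class.
grouped = Counter()
for grade, count in freq.items():
    grouped[grade[0]] += count

for letter, count in grouped.items():
    print(letter, count)
```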

4. PERCENTILES AND PERCENTILE RANKS

A percentile rank tells you where you stand compared to the other people in your group.
More specifically, it tells you the percentage of the people that you surpassed. As an
example, if you score at the 85th percentile on a test, you beat 85% of the people who took
that particular test.

Percentiles represent cutoff points that are of particular interest. For example, people are
often interested in knowing what score is associated with the 50th percentile, or each
quartile cutoff (e.g., 25th, 50th, 75th), or every decile (e.g., 10th, 20th, 30th, . . .). A
percentile is a raw score value that corresponds to a particular percentile rank.
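
The two ideas run in opposite directions: a percentile rank starts from a score and returns a percentage, while a percentile starts from a percentage and returns a score. The Python sketch below uses the simplest possible definitions (percentage of scores strictly below a value, and the "nearest rank" rule); textbook procedures often interpolate within class intervals instead, so treat this as an assumption-laden illustration. The scores and function names are invented.

```python
import math

def percentile_rank(scores, value):
    """Percentage of scores that fall strictly below `value`."""
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)

def percentile(scores, rank):
    """Score at the given percentile rank, using the simple nearest-rank rule."""
    ordered = sorted(scores)
    position = max(1, math.ceil(rank * len(ordered) / 100))
    return ordered[position - 1]

# Hypothetical test scores for ten people.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(percentile_rank(scores, 90))  # score -> percentage surpassed
print(percentile(scores, 50))       # percentage -> score
```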

5. GRAPHIC REPRESENTATIONS

Bar Charts / Histogram:


[Figure: bar chart of letter-grade frequencies. x-axis: grade (A, B, C, D, F); y-axis: frequency, 0 to 25.]

The example above is a bar chart. You can tell because adjacent bars do not touch, which is
appropriate because letter grades do not constitute a continuous scale. A bar chart is
appropriate when you are looking at a discrete variable, that is, when there are no finer
grade distinctions than, for example, between B and B+.

[Figure: histogram of numeric grades. x-axis: score, 100 down to 0 in steps of 10; y-axis: frequency, 0 to 25.]

A histogram is very similar to a bar chart except that adjacent bars touch because the
variable being graphed is continuous. For example, if we wanted to get into a more refined
grading system, we could move to continuous numbered grades (e.g., 90, 91, and so on).
Assuming that there is partial credit for scores and that your instructor uses a highly
specific grading scale so that any score is possible, you could imagine getting a grade of
91.25 or 76.97, at least theoretically.

Line Graphs / Frequency Polygons:

[Figure: frequency polygon of letter-grade frequencies. x-axis: grade (A, B, C, D, F); y-axis: frequency, 0 to 25.]

A frequency polygon is a convenient way of displaying a frequency distribution as a simple
line graph. In the regular frequency polygon above, we are displaying the same data as in
the bar chart we just showed you. You could also create a cumulative frequency polygon from
the data in a cumulative frequency distribution, like the one we created for the letter
grades. Line graphs are preferred when there are too many bars in your chart, or when your
variable is continuous and it is just easier to look at it with a smooth line.

Stem-and-Leaf Displays:

9 655532111111000
8 998877766666555552210
7 887555521
6 9
5
4
3 2
2 1
1
0

Stem Plots:

In the stem plot above, the stem represents the first digit in the score (e.g., whether the
person falls in the 90s, 80s, 70s, etc.). The leaf represents the second digit in the score,
e.g., the 5 in 95. Stem-and-leaf displays are helpful because they retain the specific
information about exactly what scores were attained, but if you flip one on its side, it
looks a bit like a bar chart, and shows you the shape of the distribution . . . and
that’s pretty cool!
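
Here is a minimal sketch of how a stem plot like this could be built in Python for two-digit scores. The function name and the sample scores are hypothetical; the layout (highest stem on top, leaves in descending order, empty stems kept) is meant to mimic the display in this section.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Return one 'stem leaves' line per stem, highest stem first."""
    leaves = defaultdict(list)
    for score in scores:
        stem, leaf = divmod(score, 10)  # 95 -> stem 9, leaf 5
        leaves[stem].append(leaf)

    lines = []
    for stem in range(max(leaves), min(leaves) - 1, -1):
        # Empty stems are kept so gaps in the distribution stay visible.
        digits = "".join(str(d) for d in sorted(leaves.get(stem, []), reverse=True))
        lines.append(f"{stem} {digits}".rstrip())
    return lines

# Hypothetical exam scores.
for line in stem_and_leaf([95, 91, 90, 88, 85, 85, 72, 69, 32]):
    print(line)
```

Because every leaf is an actual digit from the data, each original score can be read back off the display, which is exactly the advantage described above.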


Box Plots:

[Figure: box plots by letter grade. x-axis: grade (A, B, C, D, F); y-axis: –10 to 30.]

A box plot, often referred to as a box-and-whisker plot, is a graphical display that includes
the median, the 25th and 75th percentiles, and usually the maximum and minimum values in
your dataset. The bottom and top of the box represent the 25th and 75th percentiles, the line
inside is the median, and the whiskers (lines extending from the box) reach the minimum and
maximum values in the data.
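
The five values a box plot displays can be computed directly. The sketch below is one possible way to do it in Python, using made-up data and a hypothetical function name. Note that there are several accepted rules for locating the 25th and 75th percentiles; the "inclusive" method used here is an assumption and may give slightly different quartiles than your textbook or software.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, 25th percentile, median, 75th percentile, maximum)."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

# Hypothetical scores (order does not matter).
summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)
```

The middle three numbers define the box and its inner line; the first and last are where the whiskers end.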

6. SHAPES OF FREQUENCY DISTRIBUTIONS

Symmetry

Ask yourself this question: where do the bulk of the scores lie? If most center around the
middle (aka the mean), and there are roughly the same number of scores on both sides of
the center, there’s a good chance you have symmetry! In other words, if you could slice the
graph down the middle and the two sides look like mirror images, then you have a
symmetrical distribution.

Normal Distribution (always symmetrical)



Bimodal Distribution (can be symmetrical)

Skewness

The most common reason for a distribution not being symmetrical is that it is skewed (i.e.,
one tail is longer than the other because the scores are more extreme on one side).

To determine whether a curve is negatively or positively skewed, (mentally) draw an arrow
along the longer tail: if the arrow points toward the positive (higher-valued) end of the
scale, the distribution is positively skewed, and if the arrow points toward the negative
(lower-valued) end, the distribution is negatively skewed.

Positively Skewed Negatively Skewed

Modality

What is a mode?

The mode is just the biggest hump in your distribution. More technically, the mode
represents the value that occurs most frequently. In our previous visuals for the normal
distribution and the positively and negatively skewed distributions, there was only one
hump, because there was only one most common score. The camels will help us to explain
distributions with more than one hump.

The camel above is unimodal, which is most common in distributions of psychological data.

The camel to the left is bimodal. A distribution may become bimodal when there are two
distinctly different subgroups in the population.

The camel to the right is trimodal. You won’t see many of him in the real world, nor are you
likely to see a truly trimodal distribution of data.

Now try these examples:

Imagine that you are an anthropologist who has discovered a group of never-before-seen people who
are native to the continent of Antarctica. You suspect that because they live in such a cold climate,
their normal body temperatures differ from other human beings. To test that hypothesis, you manage
to measure the body temperatures (in degrees Fahrenheit) of 25 Antarcticans. The measurements are:

97.6, 98.7, 96.9, 99.0, 93.2, 97.1, 98.5, 97.8, 94.5, 90.8, 99.7, 96.6, 97.8, 94.3, 91.7, 98.2, 95.3, 97.9,
99.6, 89.5, 93.0, 96.4, 94.8, 95.7, 97.4.

1. Create the following: (a) histogram, (b) stem plot, and (optional) (c) box plot for the data above.

2. What is the percentile rank for a temperature of (a) 95.0°? (b) 98.6°?

3. What temperature is at the (a) 30th percentile? (b) 65th percentile?

Chapter 3: Measures of Central Tendency and Variability

Mean: literally, the average (the sum of the scores divided by the number of scores)
Mode: the number that shows up the most
Median: literally, the middle number when the scores are put in order (if there is an even
number of data points, average the two middle numbers)
Range: take the largest number and subtract the smallest number – voilà!

Let’s walk through an example…



Suzy Q is about to start at NYU. She’s trying to decide between two dorms: Happy Hall and
Terrific Tower. She tries to contact everyone in each building (which would have given her
population parameters), but only ten people from each dorm actually replied (which left her with
sample statistics). [As a note, because the participants who replied were self-selected, these
are not truly random samples, but for the sake of the exercise, we will treat them as
such.] She asked for a rating (scale: 1 – 10) of their experience living in their respective dorms,
and the ratings were as follows:
Happy Hall: 5.5, 4.5, 6.0, 7.0, 3.0, 1.0, 5.5, 9.0, 2.0, 3.5
Terrific Tower: 6, 7, 8.5, 7.5, 6.5, 9, 9, 4, 6.5, 8

Happy Hall’s Statistics

Mean: (5.5 + 4.5 + 6.0 + 7.0 + 3.0 + 1.0 + 5.5 + 9.0 + 2.0 + 3.5)/10 = 47/10 = 4.7
Median: 1, 2, 3, 3.5, 4.5, 5.5, 5.5, 6, 7, 9 → average of the two middle numbers = (4.5 + 5.5)/2 = 5
Mode: 5.5 shows up twice, so 5.5
Range: lowest number = 1, highest number = 9 → 9 – 1 = 8

Terrific Tower’s Statistics: you fill in the values…

Mean:
Median:
Mode:
Range:

Statistics (a)
VAR00001
N Valid           10
N Missing         0
Mean              4.7000
Median            5.0000
Mode              5.50
Std. Deviation    2.40601
Range             8.00
a. VAR00002 = 1.00

Statistics (b)
VAR00001
N Valid           10
N Missing         0
Mean              7.2000
Median            7.2500
Mode              6.50 (a)
Std. Deviation    1.54919
Range             5.00
a. Multiple modes exist. The smallest value is shown.
b. VAR00002 = 2.00
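
If you don't have statistical software handy, the summary values above can be reproduced with Python's built-in statistics module. This is only a sketch: the dictionary and function names are ours, and the standard deviation shown is the sample version (dividing by n – 1), which is what matches the printed Std. Deviation values.

```python
from statistics import mean, median, multimode, stdev

dorms = {
    "Happy Hall": [5.5, 4.5, 6.0, 7.0, 3.0, 1.0, 5.5, 9.0, 2.0, 3.5],
    "Terrific Tower": [6, 7, 8.5, 7.5, 6.5, 9, 9, 4, 6.5, 8],
}

def describe(ratings):
    """Central tendency and variability for one list of ratings."""
    return {
        "mean": mean(ratings),
        "median": median(ratings),
        "modes": sorted(multimode(ratings)),      # every value tied for most frequent
        "range": max(ratings) - min(ratings),
        "sd": stdev(ratings),                     # sample SD: n - 1 in the denominator
    }

for name, ratings in dorms.items():
    print(name, describe(ratings))
```

Unlike the output above, which reports only the smallest mode, `multimode` lists every tied mode.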

Try this example…

Imagine that you are an anthropologist who has discovered a group of never-before-seen people
who are native to the continent of Antarctica. You suspect that because they live in such a cold
climate, their normal body temperatures differ from other human beings. To test that hypothesis,
you manage to measure the body temperatures (in degrees Fahrenheit) of 25 Antarcticans. The
measurements:

97.6, 98.7, 96.9, 99.0, 93.2, 97.1, 98.5, 97.8, 94.5, 90.8, 99.7, 96.6, 97.8, 94.3, 91.7, 98.2, 95.3,
97.9, 99.6, 89.5, 93.0, 96.4, 94.8, 95.7, 97.4.

a. What are the mode, median, and mean for the data? Which way does the distribution seem to
be skewed?
b. What are the range, mean deviation, and standard deviation?

Chapter 4: Standardized Scores and the Normal Distribution

1. The Properties of z Scores

To compare two individuals who are in different distributions, it can be a big help to change their
raw scores to standardized scores, and then compare the standardized scores. The simplest and
most common standardized scores are the ones known as z scores.

1) Above or Below the Mean: Once raw scores have been converted to z scores, it is
amazingly easy to tell if a data point lies above or below the mean of its distribution: if
it’s a positive value, the data point is above the mean, and if it’s a negative value, the data
point is below the mean. Simple as that.

2) Distance from the Mean: The magnitude of a z score tells you immediately how many
standard deviations it is from the mean. If a z score is +2 or –2, you know right away that
the score is pretty far (i.e., 2 SDs) from the mean, and in a bell-shaped curve, it would be
pretty unusual.

3) Comparing Variables on Different Scales: Standardized scores make it possible to
compare two raw scores that are measured on very different scales. For example, you
could compare the number of hours someone spent stalking people on Facebook in a
month to the number of face-to-face dates she went on during the same month. The two z
scores would tell you where she fell in the Facebook stalking distribution and where she
fell in the dating distribution, respectively (e.g., she might be low on stalking relative to
her peers, but relatively high in dating). By converting to z scores for the two different
distributions, you’ve managed to compare apples (e.g., number of hours) with oranges
(e.g., number of dates).

After converting raw scores to z scores, the mean of your numbers will be zero, and the standard
deviation will be 1. The new mean and SD are consequences of using the following formula:

z = (X – µ)/σ

Let’s try an example: Imagine that you need to acquire everyone’s weight in a fraternity house
in order to match them as participants in a Greek Olympics Wrestling Tournament. Since
disclosing one’s weight can be a touchy subject for some people (the bodybuilder may be proud
to announce his muscle mass to the world, but the lanky cross-country runner may be a bit more
shy on that front), we can at least try to mask the obvious numbers by using z scores. To keep
things simple, we’ll imagine that only eight guys from your frat will be involved in the
tournament.

Data:
Weights of eight fraternity brothers in pounds: 165, 235, 170, 185, 210, 190, 180, 145.

Step 1:
First, find the mean of the weights given: (165+235+170+185+210+190+180+145)/8 = 1480/8 =
185.
Next, find the (biased) standard deviation of the 8 numbers, dividing the sum of squared
deviations from the mean by N: SD = √(5400/8) = √675 ≈ 26.0.

Step 2:
Then, plug each raw score into the formula: z = (X – µ)/σ, where µ = 185 and σ = 26.

Weight of Each
Fraternity Brother    Formula           z score
165                   (165 – 185)/26    –.77
235                   (235 – 185)/26    1.92
170                   (170 – 185)/26    –.58
185                   (185 – 185)/26    0
210                   (210 – 185)/26    .96
190                   (190 – 185)/26    .19
180                   (180 – 185)/26    –.19
145                   (145 – 185)/26    –1.54

Step 3:
If you knew the z scores for weight of the possible opponents of these frat brothers, you could
then match opponents based on z scores, without having to know their actual weights. For
example, the guy who weighs 235 pounds is nearly 2 standard deviations above the mean and
therefore needs to be matched with someone in the same ballpark to avoid an unseemly
massacre. Note that the table tells you immediately whether someone is below the average of the
eight frat brothers, because he will have a negative z score (–.77, –.58, –.19, –1.54), or above
average (+1.92, +.96, +.19), or right at the average (z = 0). Also, note that the mean of the z
scores is 0, and the (biased) SD is 1.0 (within rounding error—check for yourself!).
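
If you'd like to "check for yourself" without a calculator, here is a small Python sketch of the three steps. The variable names are our own, and `pstdev` is the biased (divide-by-N) standard deviation used in this example; the code keeps the unrounded SD (about 25.98) rather than 26, which gives the same z scores to two decimal places.

```python
from statistics import mean, pstdev

weights = [165, 235, 170, 185, 210, 190, 180, 145]

mu = mean(weights)        # mean of the eight weights
sigma = pstdev(weights)   # biased SD: divides by N, not N - 1

# z = (X - mu) / sigma for every raw score
z_scores = [(x - mu) / sigma for x in weights]

for x, z in zip(weights, z_scores):
    print(f"{x} lb -> z = {z:+.2f}")

# The z scores themselves should have mean 0 and (biased) SD 1.
print("mean of z:", round(mean(z_scores), 10))
print("SD of z:", round(pstdev(z_scores), 10))
```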

The one problem with the matching system just described is that you won’t know if the opposing
team is lighter or heavier on average (or more or less variable), because you are removing the
original mean (and standardizing the SD) when you convert to z scores (we’re assuming that the
opposing team is presenting its weights in terms of z scores, as well). But you will be good at
matching wrestlers who are at the same relative positions in their respective distributions.

2. T-Scores, SAT Scores, and IQ Scores

So now that you’re a pro with z scores, you may be wondering why everyone doesn’t use them
all the time. The answer is: Would you want to tell someone that you scored a –.35 on an exam?
It may be a bit awkward to post that on the refrigerator! To combat this, other scoring scales
have been developed to make people feel more positively about themselves by giving everyone a
positively valued score. (To be real, the main reason for these other scales is to avoid having to
deal with decimal points and minus signs.) Some common examples include: T-scores, SAT
scores and IQ scores.

To create these new scale scores, it makes sense to begin by finding the z score for the raw score
you want to convert. Then, you can simply plug the z score into one of the formulas that follow
to obtain a more convenient and aesthetically pleasing score.

T-Score = 10z + 50 → mean = 50; SD = 10.

Example: If z = –0.35, then T = 10(–0.35) + 50 = 46.5 (a big improvement over a negative
score!)

So, unless someone scores 5 SDs below the mean (an extremely rare event), his/her score will be
a positive number. A common use of the T-score is for various psychological tests that are
measured originally on arbitrary scales that have no intrinsic meaning, such as a self-esteem
rating, which may have been measured originally as the sum of a bunch of 5-point Likert scales.
If the original score of someone being tested is transformed to a T score of 40, it becomes
obvious that the person is one SD below the mean (his/her z score would be –1, which is much
more awkward to deal with).

SAT Score = 100z + 500 → mean = 500; SD = 100

Ever wonder how you ended up with a score of 670 on your verbal SAT, when there were only
45 questions on the test? Well, here’s your answer! Again, the raw score is first converted into a
z score (in the case of 670, the corresponding z score is +1.7), and then the z score is plugged into
the SAT formula. You should note that SAT scores can be thought of as T scores that have been
multiplied by a factor of 10.

IQ Score (Stanford-Binet) = 16z + 100 → mean = 100; SD = 16

Now it should no longer be a mystery to you why the average IQ score is 100, a number that was
obviously chosen for its simplicity! IQ scores could just as easily have been on the T score scale,
but for reasons that are known only to Stanford and Binet, the common IQ formula took the
form shown in that equation. As noted previously, if you know someone with an IQ score 2 SDs
above 100 (i.e., 132), you know you’re dealing with someone who is unusually intelligent
(someone above the 95th percentile, as we will soon show). [Please note that the WAIS IQ scale
is based on the formula: IQ = 15z + 100.]
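
All of these scales follow the same pattern: new score = (new SD) × z + (new mean). As a sketch, the Python below wraps that pattern in one helper; the function names are invented for this illustration, and the scale constants are the ones given in the formulas above.

```python
def rescale(z, new_mean, new_sd):
    """Convert a z score to a scale with the given mean and SD."""
    return new_sd * z + new_mean

def to_t(z):
    return rescale(z, 50, 10)      # T-score

def to_sat(z):
    return rescale(z, 500, 100)    # SAT score

def to_iq_stanford_binet(z):
    return rescale(z, 100, 16)     # Stanford-Binet IQ

def to_iq_wais(z):
    return rescale(z, 100, 15)     # WAIS IQ

print(to_t(-0.35))              # the T-score example from the text
print(to_sat(1.7))              # the verbal SAT example
print(to_iq_stanford_binet(2))  # 2 SDs above the mean
print(to_iq_wais(2))
```

Going the other way is just the z score formula again: z = (score – mean)/SD, so an SAT score of 670 gives (670 – 500)/100 = 1.7.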

Now try a few examples using the data from the exercises in the previous chapter:
1. Convert these body temperatures into z-scores: 97.6, 98.7, 96.9, 99.0, 93.2, 97.1, 98.5,
97.8, 94.5, 90.8, 99.7, 96.6, 97.8, 94.3, 91.7, 98.2, 95.3, 97.9, 99.6, 89.5, 93.0, 96.4, 94.8,
95.7, 97.4.
a) How many of these scores are above the mean?
b) What is the spread between the highest and lowest z-score? What does this tell us?
2. Convert the ratings for each dorm (separately) into z-scores:
Happy Hall: 5.5, 4.5, 6.0, 7.0, 3.0, 1.0, 5.5, 9.0, 2.0, 3.5;
Terrific Tower: 6, 7, 8.5, 7.5, 6.5, 9, 9, 4, 6.5, 8

a) Which dorm has a larger gap between the highest and lowest z-score?
b) Does this correspond with the raw data as well?

3. You just started your first teaching gig, and to ensure the students couldn’t decipher what
the list depicted, you are given a list of their IQs as z-scores, based on the entire school
population. Convert each one to an actual IQ score, using the Stanford-Binet formula:
2.1, –.8, .3, .25, 1.5, –1.6, 1.8, 2, 0, .2, 2.8, –1.2, –.6, 1.3, 1.6.

a) What is the average IQ score? Is this above or below the mean?


b) Without first converting each z-score to an IQ score, how could you have figured out
the average IQ score by only using the z–score data?

3. The Normal Distribution

The normal distribution (aka the normal curve) is an elusive entity that exists only in theoretical
terms, since the tails of the curve continue endlessly. (It is considered theoretical because,
in real life, there are almost always actual endpoints on each side of the curve.) Nonetheless, it is
an important concept to understand, because real data approximate it (albeit never exactly) quite
often. A good example (illustrated here) of an approximately normal distribution would be the curve
for IQ scores (based on the WAIS scale); as discussed in the previous section, the peak of the
distribution (the center point) would be at the 100 mark, and all of the rest of the scores would
fall elsewhere on the bell-shaped curve. Note that in this example it is obviously only an
approximation of the normal distribution because the tails are finite: no one could score below
zero, and any particular IQ test has to have a maximum score, no matter what kind of genius takes it!

Areas of Distributions

The area under the curve is considered equal to 100%, with standard cutoff points at each
standard deviation marker as you stray from the mean.

As an example, let’s try to find out how Casey’s IQ fares in comparison to the rest of the
population. (His brother is always teasing him that he’s a meathead whose only talent is catching
a football, but Casey refuses to believe that. Sure, he’s not the best student and rarely cracks a
textbook, but he knows he could go neck and neck with the AP students if he put his mind to it.
Or so he hopes . . .) So his IQ score is 130, which puts him at two standard deviations above the
mean on the WAIS IQ scale (remember that M = 100, SD = 15). When we glance at the normal
distribution in the illustration, we can see that the area to the left of two standard deviations
above the mean equals 97.72% (50% + 34.13% + 13.59%), which means that Casey scores as high as,
or higher than, 97.72% of the population. Looks like it’s time for Casey to toss the football aside
and pick up that statistics book!

As a quick cheat sheet, relevant to all normal distribution curves, you should memorize the
following values to make your life a little easier when you’re trying to better understand these
percentage values.

Area to the left of the mean + 1 SD = 84.13%
Area to the left of the mean + 2 SDs = 97.72%
Area to the left of the mean + 3 SDs = 99.87%
Area to the left of the mean + 4 SDs = 99.997%

Looking at the value for 4 SDs should help illustrate why it’s so rare to find a value that is 4
SDs (or more) above the mean. As an example, at 6'6'', Michael Jordan’s height is only
roughly 3 SDs above the mean. On the other hand, Yao Ming is 7'6'' and clocks in as the second
tallest person in the world; admittedly, he is 6 SDs above the mean, but yeah, it takes being
second tallest IN THE WORLD to get to that point!

One other way to view these values is from the standpoint of figuring out how much of the
population you capture within each standard deviation, starting at the center point (which is the
mean) and working toward the tails.

The quick cheat sheet for those values is as follows:

1 SD in both directions from the mean = 68.26%
2 SDs in both directions from the mean = 95.44%
3 SDs in both directions from the mean = 99.74%
4 SDs in both directions from the mean = 99.994%
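
Both cheat sheets can be regenerated rather than memorized. The sketch below uses `NormalDist` from Python's standard library; the helper names are our own. Because the cheat-sheet figures come from a rounded table, the computed values may differ from them slightly in the last digit.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def area_left_of(k):
    """Proportion of the curve to the left of k SDs above the mean."""
    return std_normal.cdf(k)

def area_within(k):
    """Proportion of the curve within k SDs of the mean, in both directions."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3, 4):
    print(f"{k} SD: left = {area_left_of(k):.4%}, within = {area_within(k):.4%}")
```

Note that the "within" value is always 2 × (left – 50%), since the curve is symmetrical around the mean.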

Again, being 4 SDs out from the mean captures almost the entire population; except for the
occasional outlier pretty much everyone is within 4 SDs from the mean. Whereas the first set of
cheat sheet figures translates to percentile ranks, these values will help you to understand where
the middle XX% fall with respect to standard deviations away from the mean.

As a note, these percentages could all be expressed as values from 0 to 1.00 (i.e., proportions),
which will be more relevant when we discuss probabilities in later chapters. For example,
someone with an IQ score of 115 (on the WAIS scale) is at the 84.13th percentile, so we can also
say that they beat .8413 of the people in their distribution.

Now you try a few examples:

4. What (approximate) percentile corresponds to an IQ score of:

a) 109? _________

b) 135? _________

c) 90? _________

d) 75? _________

Parameters of the Normal Distribution

Although the normal distribution (ND) has the same basic shape for each one created, the central
point (its mean) and the width or spread of the curve (the standard deviation) are the parameters
that give each ND its uniqueness.

As an example, look at the difference between men’s and women’s heights, both shown in the
same graph here. As you can see, the women’s curve is narrower and taller, while the men’s
curve is shorter and wider. More concretely, women are on average shorter than men (when
comparing means), and there is more variability in men’s heights than in the heights of women.
They are both normal distributions, but with differing means and SDs.

Can you think of a variable in nature that would fall into a normal distribution?

Table of the Standard Normal Distribution



To stave off having to do integral calculus (remember that evilness from your high school days?)
to determine an area underneath the curve every time there is a different mean and standard
deviation, the standard normal distribution was created, with an accompanying table of values.
Keep in mind, the standard normal distribution is based on the mean equaling zero and the
standard deviation being 1, which should sound somewhat familiar to you. Remember those
useful z-scores? Well, they’re back! But this time, they will have abundantly more meaning to
you, since you’re now a burgeoning statistician.

So once you’ve transformed your values into z-scores, you can look up the area under the curve
in Table A of your text. Keep in mind, Table A provides only the percent of area between the
mean and the z-score, which means it is going to cap off at 50% (the top half of the curve), which
is OK, because the curve is symmetrical.

Let’s do an example . . .

If I told you that the z-score for the average running speed for the QB of the UT Austin football
team (when compared to all other QBs in college football) is +2.58, what percentage of QBs
would be faster? First, look at the row for 2.5, and then skim across it to find the column for .08,
and you’ll come to the value 49.51. With this information, you can conclude that only about
0.49% (50.00 – 49.51) of QBs run faster than the UT QB. I smell a victory in UT’s future this
year!
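
If the table isn't within reach, the same lookup can be approximated in Python. This sketch assumes the table lists the percent of area between the mean and z, as described above; the function names are hypothetical, and `NormalDist().cdf` is the standard-library cumulative normal function.

```python
from statistics import NormalDist

def area_mean_to_z(z):
    """Percent of area between the mean and z (the kind of value the table lists)."""
    return abs(NormalDist().cdf(z) - 0.5) * 100

def area_beyond_z(z):
    """Percent of area in the tail beyond z, on the same side of the mean as z."""
    return 50 - area_mean_to_z(z)

z = 2.58
print(f"mean to z: {area_mean_to_z(z):.2f}%")
print(f"beyond z:  {area_beyond_z(z):.2f}%")
```

For a positive z, the percentile rank is 50 plus the mean-to-z area; for a negative z, it is 50 minus that area.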

Now, you try to find these values on your own, and explain what each one means.

5. The z-score for the fraternity pledge who ate 14 guppies as part of his hazing process is
+1.89. About what percentage of pledges did he “beat” in his attempt to please his fellow
brothers?

6. A coffee shop at USC sells an absurd amount of coffee in the mornings and afternoons.
However, around 11 P.M., their sales plummet dramatically. In comparison to every
other hour of the day, the 11 P.M. time slot has a z-score for sales of –2.97. How dismal are
the sales for this hour, in terms of its percentile rank?

7. The photography club at NYU has a budget of $14,500 per year for equipment
purchases, which places it at the 94.50th percentile among U.S. private universities. What is
the corresponding z-score?

4. Finding Areas for Normal Distributions

One thing you need to be aware of is that sometimes you need to determine the area under the
curve that is between two z-scores, as opposed to between a z-score and the mean (which the
table readily supplies). For example, what if you wanted to determine how much of the
population has an IQ (WAIS) between 115 and 130?

First, find the value for 130 (47.72—look at 2.0 SDs from the mean, since 130 is exactly 2 SDs
above 100), and then find the value for 115 (34.13—look at 1.0 SDs from the mean, since 115 is
exactly 1 SD above the mean). Now, to find the area BETWEEN these two values, just subtract
one from the other: 47.72 – 34.13 = 13.59. You now know that 13.59% of the population falls
between the WAIS IQ scores of 115 and 130; you now also know that between one and two
standard deviations on the normal distribution, you end up with an area of 13.59%.
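
The subtraction trick can be sketched in Python as well. Here the WAIS scale (mean 100, SD 15) is built directly as a normal distribution, so no conversion to z-scores is needed; the function and variable names are our own.

```python
from statistics import NormalDist

wais = NormalDist(mu=100, sigma=15)

def percent_between(dist, low, high):
    """Percent of the distribution that falls between two scores."""
    return (dist.cdf(high) - dist.cdf(low)) * 100

print(f"{percent_between(wais, 115, 130):.2f}% of IQ scores fall between 115 and 130")
```

When the two scores sit on opposite sides of the mean, the table method adds the two mean-to-z areas instead of subtracting them; the cumulative approach used here handles both cases with the same subtraction.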

Now you try a few examples . . .

8. What is the area between the following pairs of z-scores?

a) 1.05 and 1.15 _________


b) 2.30 and 2.85 _________
c) 0.00 and 2.15 _________
d) –.34 and –.12 _________
e) –2.30 and –2.85 _________
f) –1.41 and +1.41 _________
g) –.34 and 1.56 _________
h) –3.0 and 3.00 _________
i) 0.00 and 4.00 _________

9. Using SAT scores, what percentage of the population falls between the following pairs of
scores?

a) 650 and 750 _________


b) 450 and 500 _________
c) 210 and 790 _________
d) 500 and 800 _________
e) 200 and 500 _________
