Chapter 1 - 6
~ This material is arguably in the “Top Ten Most Important” concepts the students will encounter in the
study of statistics and may merit identifying it as such.
~ The word nominal means having to do with names or labels.
o It involves classifying individuals or events into categories.
o They all have different names but are not related in any way.
o For example: if you were measuring academic majors for a group of university students, those
majors would be classified according to categories such as: psychology, sociology, business
science, biology etc. so each student in that group would be classified in a category.
~ An ordinal scale consists of categories which are organised in a sequence.
o So, measurements are ranked in class.
o Sizes etc.
o For example: in a psychology module, one could rank students according to who came first,
second, third or fourth in the course in terms of academic achievement. The fact that
categories form an ordered sequence means that there is a relationship between categories.
~ An interval scale consists of ordered categories that are all intervals of exactly the same size.
o So, it is characterized by equal intervals between scale units.
o Basically, the difference between two values is meaningful.
o For example: Person A scored 65% in her psychology 253 exam and person B scored 80% in
her psychology 253 exam. We can also then say that person B scored 15% higher than person
A and we can go further and say that it is equal to Person C scoring 70% and Person D scoring
85% (the difference between both sets of scores is equal). When thinking of this in terms of a
scale, the zero point of the scale is arbitrary; in other words, it has no true (absolute) zero point.
~ The ratio scale is basically an interval scale with an added characteristic.
o It has an absolute zero which means that a score of zero indicates none of the variable being
measured.
o For example: number of correct answers on a student’s psychology 253 exam. It can be
any number of correct answers, including zero correct answers.
Three Data Structures, Research Methods, and Statistics
Data Structure I:
o Descriptive research (individual variables)
o One (or more) variables measured per individual
o “Statistics” describe the observed variable
o May use category and/or numerical variables
~ Data structure 1: One or More Separate Variables Measured for Each Individual:
~ Descriptive Research:
~ This involves measuring one or more separate variables for each person or participant.
~ The intention here is to simply describe as the title says.
~ Think of this example
Individual   Hours exercising   Hours sleeping   Hours studying
             in a day           in a day         in a day
A            2                  6                4
B            1                  7                3
C            3                  4                5
D            4                  8                2
~ So in this table we can see that it speaks to how many hours an individual exercises, sleeps and
studies in each day.
~ Here it is different for each person, and we are simply describing the variables by saying that
Person A exercises for 2 hours a day, sleeps for 6 hours and studies for 4 hours each day, and so
on with the rest of the individuals described here.
Relationships between variables
~ Is very important.
o Two (or more) variables observed and measured
o One of two possible data structures used to determine what type of relationship exists
~ Most research aims to examine whether there is a relationship between variables. For example:
~ Is there a relationship between number of hours spent studying a day and the results on a test?
~ Is there a relationship between number of hours spent studying for psychology 253 exam and the
results of the exam?
~ To establish whether there is a relationship – we first have to make observations of the variables.
Data Structure II:
o The correlational method
o One group of participants
o Measurement of two variables for each participant
o Goal is to describe type and magnitude of the relationship
o Patterns in the data reveal relationships
o Non-experimental method of study
~ Data Structure 2 involves the correlational method.
~ So, this is observing one group with two variables being measured.
~ So we just spoke about examining the relationship between variables and this can be done using two
variables.
~ Simply put, we can measure two variables for each individual.
~ You have examples in your textbook but I want to focus on another example here:
~ In the correlational method two different variables are observed to determine whether there is a
relationship between them.
~ Example
STUDENT NUMBER OF HOURS ACADEMIC
SPENT ON SOCIAL PERFORMANCE
MEDIA (Results on a test)
A 5 50%
B 6 40%
C 4 60%
D 2 70%
o This table shows information on 4 students in relation to the amount of time they spend on
social media and the results they score on a test.
o So, you can see that person C spends 4 hours on social media and scored 60% on her test
o Whereas person B spends 6 hours on social media and scored 40% on his test…
o draw scatterplot with whiteboard function.
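To put a number on the pattern in the table above, here is a minimal sketch that computes a correlation coefficient for the four students. The choice of Pearson's r and the helper name `pearson_r` are assumptions for illustration; the notes themselves only ask for the type and magnitude of the relationship to be described.

```python
from math import sqrt

# Data copied from the example table above.
hours_on_social_media = [5, 6, 4, 2]
test_scores = [50, 40, 60, 70]


def pearson_r(x, y):
    """Pearson correlation (assumed measure; not named in the notes)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)


r = pearson_r(hours_on_social_media, test_scores)
print(round(r, 2))  # -0.98
```

The negative sign matches what the table suggests: in this small group, more hours on social media go with lower test results. Being correlational, this describes a relationship and does not show cause and effect.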
Figure 1.5 Data structures for studies evaluating the relationship between variables
o One of two data structures for studies evaluating the relationship between variables.
o Note that there are two separate measurements for each individual (Facebook time and academic
performance).
o The same scores are shown in table (a) and graph (b).
o The second data structure for studies evaluating the relationship between variables.
o Note that one variable is used to define the groups and the second variable is measured to
obtain scores within each group.
o So here you are assessing two variables, type of video game (violent and non-violent) and the
second variable - aggressive behavior is measured to obtain the scores as seen in this table
above.
Experimental Method
~ The experimental method is a systematic and scientific approach in which the researcher manipulates one
or more variables.
~ So, one variable is manipulated while a second variable is observed and measured, and other variables are controlled.
o For example: administering a diet supplement to one group while the other group receives a
placebo; weight loss is then measured in both groups.
~ Goal of experimental method
o To demonstrate a cause-and-effect relationship
o This requires the researcher to manipulate one variable by changing its value from one level to another.
~ Manipulation
o The level of one variable is determined by the experimenter
o So relating back to the previous example about violent video games and non-violent video
games.
o The researcher then manipulates this variable by giving one group of boys a violent video game to play and
the other group a non-violent video game.
o The second variable, as mentioned, is the variable that is measured (here, aggressive behaviour) to see
whether the manipulation had an effect or not.
~ Control rules out influence of other variables
o Participant variables
o Environmental variables
o Regarding control, this is where the researcher makes sure that no external factors influence
the relationship.
o So in the example of the diet supplement where one group receives the diet pill and the other
group receives the ‘fake’ diet pill (placebo), the researcher then makes sure that no extraneous
factors influence the relationship, for instance by ensuring that both groups do the same exercise
routine and eat the same foods etc.
Experimental Method: Control
~ Within the experimental method, there can be control and experimental conditions.
~ Individuals in a control condition do not receive the experimental treatment (for example, the
group of boys who played the non-violent video games or the group of people who received the
placebo diet pill).
~ Methods of control
o Random assignment of subjects
o Matching of subjects
o Holding level of some potentially influential variables constant
~ Control condition
o Individuals do not receive the experimental treatment
o They either receive no treatment or they receive a neutral, placebo treatment
o Purpose: to provide a baseline for comparison with the experimental condition
o The purpose of such a condition is to provide a baseline for comparison.
~ Experimental condition
o Individuals do receive the experimental treatment
o Individuals in the experimental condition do receive the experimental treatment (i.e., the
violent video games or the diet pill)
Independent/Dependent Variables
~ Independent variable is the variable manipulated by the researcher
o Independent because no other variable in the study influences its value
~ Dependent variable is the one observed to assess the effect of treatment
o Dependent because its value is thought to depend on the value of the independent variable
~ So in our example the independent variable would be the amount of violence (so the violent vs non-
violent video games) and the dependent variable is the level of aggressive behaviour.
Non-Experimental Methods
~ Non-equivalent groups
o Researcher compares groups
o Researcher cannot control who goes into which group
~ Pre-test / Post-test
o Individuals measured at two points in time
o Researcher cannot control influence of the passage of time
~ Independent variable is quasi-independent
~ There are also non-experimental methods.
~ The first being non-equivalent groups:
o Studies that meet the strict requirements described in the previous examples (manipulation and control) are known as
experimental studies.
o There are also designs that are non-experimental but still examine the relationships
between variables.
~ A pre-test and post-test study uses time, such as before and after, to create groups of scores. Let’s
see the examples…
Two examples of non-experimental studies: Figure 1.7
~ Two examples of non-experimental studies that involve comparing two groups of scores.
~ In (a), a participant variable (gender) is used to create groups, and then the dependent variable
(verbal score) is measured in each group.
~ In (b), time is the variable used to define the two groups, and the dependent variable (depression)
is measured at each of the two times.
Chapter 2: Frequency Distributions
Frequency Distributions
~ A frequency distribution is
o An organized tabulation
o Showing the number of individuals located in each category on the scale of measurement
~ Terminology associated with frequency distributions is one of the least “standardized” across
disciplines and texts students might encounter.
~ Instructors may wish to emphasize the importance of being precise with the terms provided by the
text authors, but also be aware that terms may differ in other texts or courses.
~ Can be either a table or a graph
~ Always shows:
o The categories that make up the scale
o The frequency, or number of individuals, in each category
~ I am not going to go into any definitions, as you as the student need to make sure you go through
every concept and become familiar with the terminology used in statistics.
~ This is crucial…
o So we know that a frequency distribution is basically data organized in a table that can explain
something…
o It can also be distributed onto a graph like we saw in the previous podcast on data structures.
o But here you will learn the statistical input of data in tables or graphs. Like I said before
practice makes perfect, so go through your textbook and practice using the Learning Check
questions.
o That way you will know more or less how you may be examined or what a certain question asks of
you.
o In any frequency distribution it indicates the categories which make up the scale as well as the
frequency or number of individuals in each category.
Example
Scores (N = 20):
8 9 8 7 10 9 6 4 9 8
7 8 10 9 8 6 9 7 8 8
N = number of scores
X = score
f = number of times the score occurred
X    f
10   2
9    5
8    7
7    3
6    2
5    0
4    1
Σf = N

X    f
5    1
4    2
3    3
2    3
1    1
ΣX = 5+4+4+3+3+3+2+2+2+1 = 29
ΣX² = 97
o See this table here… we have 20 scores, but we can see that some scores occur more than
once, so we can group them as follows…
o We can see that N = number of scores, X = score and f = frequency (number of times the score occurred). Again,
here you must basically memorize these symbols because they are used throughout stats.
§ So this set of scores is organised in the table under the column X (each score value is
listed only once, even if it occurs more than once), and remember that the values are conventionally tabulated in order
from highest to lowest.
§ Once you organise your scores from highest to lowest, you look at each value and
see how many times a score of 10 occurred, a score of 9 occurred and so on…
§ Here you actually count how many times a score of 10 occurred etc.
§ You can now see the scores clearly in the table. 2 people scored the highest score,
which is 10, and you can also see that no one scored 5 but it is still included in the table.
§ Remember the 4 different scales that we spoke about previously: with an ordinal,
interval or ratio scale, the categories are listed in order from highest to lowest.
Frequency Distribution Tables
~ Structure of frequency distribution table
o Categories in a column (often ordered from highest to lowest but could be reversed)
o Frequency count next to category
~ Σf = N
~ To compute ΣX from a table
o Convert table back to original scores or
o Compute ΣfX
~ As mentioned, the categories are ordered from highest to lowest and they are displayed under the
column X and the number of times that score appeared is found in the column f.
~ But now we want to know the total number of scores in the distribution, and this is denoted by the
symbol above.
o To calculate the sum of scores in the distribution you need to look at both columns.
o Let’s look at the next example…
Calculating the scores in a Frequency Distribution
Consider this table…
X    f    fX
5    1    5
4    2    8
3    3    9
2    3    6
1    1    1
ΣX = ΣfX = 29
~ You see a set of scores and how many times the score appeared which is the frequency.
~ So to calculate the sum of the scores we do the following:
o Remember I said in the previous slide you need to consider both columns when calculating
the sum…
~ You will write out each score including the amount of times each score appeared:
~ 5+4+4+3+3+3+2+2+2+1= 29
~ To get the sum of the squared scores (ΣX²), you square each score, including repeats, and add the squared values = 97
This you do on a calculator…
~ Another way to get the sum of X and input it in a table is as follows…
~ Let’s input into the table together…
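The table calculation above can be sketched in a few lines of Python. This is only an illustration: the dictionary name `freq_table` is made up for the example, and the scores and frequencies are copied from the table.

```python
# Frequency table from the example: score X -> frequency f.
freq_table = {5: 1, 4: 2, 3: 3, 2: 3, 1: 1}

# N is the sum of the frequencies (sum of f = N).
n_scores = sum(freq_table.values())

# Sum of X: multiply each score by its frequency, then add (sum of fX).
sum_x = sum(x * f for x, f in freq_table.items())

# Sum of X squared: square each score FIRST, then weight by its frequency.
sum_x_squared = sum(f * x ** 2 for x, f in freq_table.items())

print(n_scores, sum_x, sum_x_squared)  # 10 29 97
```

Note the order of operations for ΣX²: each score is squared before it is multiplied by its frequency. Squaring the fX column instead would give the wrong answer.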
Proportions and Percentages
Proportions
• Measures the fraction of the total group that is associated with each score
$\text{proportion} = p = \frac{f}{N}$
• Called relative frequencies because they describe the frequency ( f ) in relation to the total
number (N)
Percentages
• Expresses relative frequency out of 100
$\text{percentage} = p(100) = \frac{f}{N}(100)$
• Can be included as a separate column in a frequency distribution table
~ The ability to quickly and comfortably convert between fractions (proportions), decimal fractions
(relative frequency), and percentages is fundamental to success in this course.
~ Some students struggle with reconciling the fact that although these are three distinct metrics, they
all point to the same “deep” meaning.
~ There are other measures that describe the distribution of scores which we can also interpret.
~ These are Proportions and Percentages.
§ Proportion measures the fraction of the total group associated with each score so for
example we see in the previous table that two people scored 4 so we can say that 2
out of 10 people had 4, this is then demonstrated in this form above.
~ Researchers also use percentages to describe a distribution, and this can be done by first finding the
proportion and then multiplying that by 100.
§ Use whiteboard to show proportion and percentage...
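As a rough sketch of the proportion and percentage columns, the following Python uses the earlier frequency table (scores 5 down to 1, N = 10). The variable names are illustrative assumptions only.

```python
# Frequency table from the earlier example: score X -> frequency f.
freq_table = {5: 1, 4: 2, 3: 3, 2: 3, 1: 1}
n_scores = sum(freq_table.values())  # N = 10

# proportion p = f / N, percentage = p * 100
proportions = {x: f / n_scores for x, f in freq_table.items()}
percentages = {x: p * 100 for x, p in proportions.items()}

print("X  f  p     %")
for x, f in freq_table.items():
    print(f"{x}  {f}  {proportions[x]:.2f}  {percentages[x]:.0f}%")
```

For the score X = 4 this gives p = 2/10 = 0.20, or 20%, matching the "2 out of 10 people" reading in the notes. The proportions always add up to 1 and the percentages to 100.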
Example 2.4: Frequency, Proportion and Percent
~ Population:
$\mu = \frac{\Sigma X}{N}$
~ Sample:
$M = \frac{\Sigma X}{n}$
~ Instructors may wish to have students compare and contrast these two formulas.
~ The mean is basically the average…
~ To calculate the mean we add all the scores and divide by the number of scores in the data.
~ The formula to calculate the mean is shown here.
~ There are two different ways to calculate the mean firstly in terms of the population and secondly for
a sample.
~ Here you can spot the slight differences between the two formulas.
~ For the formula to calculate the mean for the population it is as follows:
~ So this symbol represents the population mean and it is calculated by adding all scores and dividing
by the number of scores. So the sum of X over N.
~ For the sample mean the symbol is M=the sum of x over the number of scores…
~ So if you see a mean being represented by an M, then you are dealing with a sample mean.
~ If it is being represented by the μ symbol, then you are dealing with a population mean.
~ Let’s look at an example:
Example:
~ Calculating the population mean:
5, 6, 7, 3, 4
~ N= 5 scores
~ Add up all scores first: 5+6+7+3+4= 25
~ The mean is: μ = 25 / 5 = 5
o So remember we first count how many scores there are, we see that there are 5 scores so N=5
o Then we add up all the scores which gives us 25
o Now to calculate the population mean we say pop mean = 25 divided by 5 =5 therefore our
population mean is 5.
~ Calculating the sample mean:
3, 6, 5, 3, 4
~ n = 5 scores
~ Add up all scores first: 3+6+5+3+4= 21
~ The mean is: M = 21 / 5 = 4.2
~ Gym attendance per week
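The two worked examples above (population scores 5, 6, 7, 3, 4 and sample scores 3, 6, 5, 3, 4) can be checked with a short sketch. The helper name `mean` is an illustrative choice; the arithmetic is the same for μ and M, and only the symbol and the interpretation differ.

```python
def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)


population_scores = [5, 6, 7, 3, 4]
sample_scores = [3, 6, 5, 3, 4]

mu = mean(population_scores)  # population mean: 25 / 5
m = mean(sample_scores)       # sample mean: 21 / 5

print(mu, m)  # 5.0 4.2
```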
The Weighted Mean
~ Combine two sets of scores
~ Three steps:
o Determine the combined sum of all the scores
o Determine the combined number of scores
o Divide the sum of scores by the total number
of scores
Overall Mean = $M = \frac{\Sigma X_1 + \Sigma X_2}{n_1 + n_2}$
~ So what if we want to now know the mean of two sets of scores?
~ This is where we use the weighted mean formula.
~ So this formula is the Overall Mean equals the sum of scores for group 1 plus the sum of scores for
group 2, divided by the number of scores for group one plus the number of scores for group two.
~ Now let’s see this in an example:
Example:
~ Hours of studying for a test per day:
~ Group 1: 3, 4, 5, 2, 6
~ Group 2: 2, 5, 3, 1, 5
~ So here we have a set of scores which indicate the number of hours a day two groups of 5 people
each, study for a test.
~ So we want to find out what the average or mean hours both groups studied for a test.
~ Group 1 hours are 3,4,5,2,6
~ Group 2 hours are 2,5,3,1,5
~ So let’s follow the steps we just spoke about for calculating the weighted mean:
~ Firstly, let’s add the total scores for each group: Group 1: 20 so sum of x =20
~ Group 2: 16 so the sum of x=16
~ And we know that there are a total of 5 scores per group, so now let’s write this out…
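Writing the three steps out for these two groups, a minimal sketch in Python might look like this. The function name `weighted_mean` is an illustrative assumption.

```python
group_1 = [3, 4, 5, 2, 6]  # hours studied per day, group 1
group_2 = [2, 5, 3, 1, 5]  # hours studied per day, group 2


def weighted_mean(scores_1, scores_2):
    combined_sum = sum(scores_1) + sum(scores_2)  # step 1: 20 + 16 = 36
    combined_n = len(scores_1) + len(scores_2)    # step 2: 5 + 5 = 10
    return combined_sum / combined_n              # step 3: 36 / 10


overall_mean = weighted_mean(group_1, group_2)
print(overall_mean)  # 3.6
```

So the overall mean is (20 + 16) / (5 + 5) = 3.6 hours per day.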
Computing the Mean from a Frequency Distribution Table
Quiz Score (X) f fX
10 1 10
9 2 18
8 4 32
7 0 0
6 1 6
Total n = Σf = 8 ΣfX = 66
M = ΣX / n = 66/8 = 8.25
Store          f
Checkers       32
Woolworths     18
Pick ‘n Pay    28
Shoprite       22
N = 100 scores
Mode = Checkers (f = 32)
~ So a basic example is this set of scores, you can see that 6 appears the most (4 times) hence 6 is the
mode.
~ Let’s take another example by looking at the table above, you can see the different stores.
~ So for example a sample of 100 students on campus were asked which store they shop at for their
groceries, the most frequent answer is Checkers as 32 out of 100 students said they shopped at
Checkers.
~ It is also important to note that sometimes there can be two modes in a frequency distribution.
~ Bimodal: is a distribution with 2 modes, and a distribution with more than 2 modes is called
multimodal.
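As a small sketch of finding the mode from a frequency table, the code below uses the store counts from the example. The helper name `modes` and the second, bimodal table are hypothetical and only there to illustrate the idea of two modes.

```python
# Store frequencies from the example (N = 100 students).
store_counts = {
    "Checkers": 32,
    "Woolworths": 18,
    "Pick 'n Pay": 28,
    "Shoprite": 22,
}


def modes(freq_table):
    """Return every category that shares the highest frequency."""
    highest = max(freq_table.values())
    return [category for category, f in freq_table.items() if f == highest]


print(modes(store_counts))  # ['Checkers']

# Hypothetical scores (not from the notes) where two values tie.
bimodal_counts = {1: 3, 2: 1, 3: 3}
print(modes(bimodal_counts))  # [1, 3]
```

Note that the mode is the category (Checkers), not its frequency (32).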
Chapter 4: Variability
Defining Variance and Standard Deviation
~ Most common and most important measure
of variability is the standard deviation
o A measure of the standard, or average, distance from the mean
o Describes whether the scores are clustered closely around the mean or are widely scattered
~ Calculation differs for population and samples
~ Variance is a necessary companion concept to standard deviation but not the same concept
o The twin concepts of variance and standard deviation are among the most challenging
concepts in a basic statistics course to communicate and to learn.
o Instructors will almost certainly want to invest special care in the preparation of materials to
help communicate these very difficult concepts.
o So in today’s podcast we will focus on defining variance and standard deviation.
o The standard deviation is the standard or average distance from the mean; it gives us
a description of whether scores cluster closely around the mean or are widely
scattered.
o Note that, just as there were slightly different formulae for the population and sample mean,
the formulae for calculating the SD of a population and of a sample differ too.
o Please make sure to familiarize yourself with the formulae in your textbook, as it also clearly
differentiates between sample and population formulae.
o Variance is a companion concept to the SD, but it is not the same concept.
o So let’s go through these formulae.
~ Step One: Determine the deviation
~ Deviation is distance from the mean
Deviation score = X − μ
~ Step Two: Find a “sum of deviations” to use as a basis of finding an “average deviation”
o Two problems
§ Deviations sum to 0 (because M is balance point)
§ If sum always 0, “Mean Deviation” will always be 0.
o Need a new strategy!
~ Having students try to come up with an intuitive method for developing a measure of variability
based on deviation scores is a great way to get them thinking about what a dead end strategy
averaging deviations is (because, of course, the average of deviations from the mean must be 0).
~ Several teams working on it in a classroom exercise often results in a valuable insight about the issue
(averaging absolute value of deviations) and might produce the one we use—squaring the deviations
to eliminate the negative values.
~ When we calculate the SD we are basically asking how different the scores are from one another. So
deviation, as mentioned, means distance.
~ So for example: think of a psy 253 exam where the average grade is 70% but your score is 80%;
your score deviates from the mean by positive 10 points, and likewise if you score 60%
you are minus 10 points from the mean.
~ So we can calculate the deviation score for each person in the class, and at
the end we can work out the average of all the deviations in the class.
~ This is essentially what the SD is. It tells us the average distance of the scores from the mean.
Example
Finding the deviation score:
Deviation =x-µ µ=50 x=53
=53-50
Deviation score=3
Deviation =x-µ µ=50 x=45
=45-50
=-5
To get rid of the (-) you have to square each and every deviation score and then calculate the variance.
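To see why plain deviations cannot simply be averaged, here is a minimal sketch that reuses the population from the earlier mean example (5, 6, 7, 3, 4 with μ = 5). The variable names are illustrative only.

```python
scores = [5, 6, 7, 3, 4]
mu = sum(scores) / len(scores)  # population mean = 5.0

# Deviation score = X - mu for every score.
deviations = [x - mu for x in scores]
print(deviations)       # [0.0, 1.0, 2.0, -2.0, -1.0]
print(sum(deviations))  # 0.0, so the "mean deviation" is always 0

# Squaring removes the negative signs before summing.
squared_deviations = [d ** 2 for d in deviations]
sum_of_squares = sum(squared_deviations)
print(sum_of_squares)   # 10.0
```

The positive and negative deviations cancel exactly, which is why the deviations are squared before they are summed.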
~ Step Two Revised: Remove negative deviations
o First square each deviation score
o Then sum the squared deviations (SS)
~ Step Three: Average the squared deviations
o Mean squared deviation is known as “variance”
o Variability is now measured in squared units
Population variance equals mean (average)
squared deviation (distance) of the scores
from the population mean
o The concept of sum of squared deviations (SS) is absolutely vital to efficient understanding of
the statistical tests presented in the remainder of the text.
o The authors have reduced the computational complexity and the cognitive load required of
students—contingent upon grasping and retaining the concept of SS presented in this chapter.
o The authors also lay the foundation for efficiently learning the fundamentals of ANOVA—
contingent upon grasping and retaining the concept of variance presented in this chapter.
o Consequently, this chapter is essential to success in the remainder of the course.
o If you have any negative deviation scores, you simply first square each deviation score and then add up
the squared deviations.
o So the next step would be to calculate the variance, which is equal to the mean of the squared
deviations, so it is the average squared distance from the mean.
o Mean of the squared deviations = Variance
~ Step Four:
o Goal: to compute a measure of the “standard” (average) distance of the scores from the mean
o Variance measures the average squared distance from the mean—not quite our goal
~ Adjust for having squared all the differences by taking the square root of the variance
$\text{Standard Deviation} = \sqrt{\text{Variance}}$
o Variance (in squared distance units) is not intuitively easy to grasp despite being a measure of
average squared distance of scores from the mean.
o Consequently, it is important to emphasize the need to take the square root of the variance to
return it to the same distance unit used in the original measurement procedure.
o So now we can speak to the Standard Deviation:
o This is the square root of the variance. So in order to calculate the SD you first have to calculate
the Variance. The SD provides us with the average distance from the mean.
Figure 4.2 Calculating Variance and Standard Deviation
~ So this diagram can be used to summarize this entire process.
~ Firstly you find deviation score for each score then to get rid of the + and – signs you square each
deviation score.
~ Then we want to find the average of the squared deviations: add up those squared
values and divide by the number of scores; this gives you the variance.
~ And finally, to find the standard deviation, you take the square root of the variance.
~ So you will square root that variance score and that will give you the SD.
~ When you are calculating variance for samples the only difference is that the denominator has this n-1
adjustment.
~ For samples, we take the sample size minus 1. The variance for a sample is equal to the sum of
squares divided by the sample size minus 1, and in order to get the SD we just take the square root of
the variance.
~ So when receiving any sample of data and you need to calculate variance and SD for a sample you
make that n-1 adjustment.
~ So why do we do this?....
Figure 4.4 Population of Adult Heights
~ The population of adult heights forms a normal distribution.
~ A normal distribution is symmetrical: when we draw a line through the middle, the two halves mirror each other.
~ If you select a sample from this population, you are most likely to obtain individuals who are near
average in height.
~ As a result, the scores in the sample will be less variable (spread out) than the scores in the
population.
~ So, the reason is that samples underestimate the true variability in the population just like we spoke
of in the first slide.
~ Here you can see the entire population and if we just sample 10 or 15 of these individuals we might
be led to believe there’s less variability inherent in the distribution than really exists.
~ So, by adding that adjustment it then reduces the denominator values and inflates the SD slightly to
correct that initial underestimate.
~ So, let’s look at some examples to calculate this in the video next
Example: Calculating Sum of Squared Deviations (SS)
$SS = \Sigma X^2 - \frac{(\Sigma X)^2}{n}$
Scores: 10, 7, 6, 10, 6, 15 (n = 6)
$\Sigma X = 10 + 7 + 6 + 10 + 6 + 15 = 54$
$\Sigma X^2 = 10^2 + 7^2 + 6^2 + 10^2 + 6^2 + 15^2 = 546$
$SS = 546 - \frac{(54)^2}{6}$
$= 546 - 486$
$= 60$
Example: Calculating Sample Variance
$s^2 = \frac{SS}{n - 1}$, with SS = 60 and n = 6
$s^2 = \frac{60}{6 - 1}$
$= \frac{60}{5}$
$s^2 = 12$
Example: Calculating Standard Deviation
$s = \sqrt{s^2} = \sqrt{12}$, with $s^2 = 12$
$= 3.46$
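The three worked examples above (SS, sample variance, standard deviation) can be reproduced with a short sketch. Variable names are illustrative. The second SS calculation uses the definitional route (sum of squared deviations from the sample mean) as a cross-check on the computational formula.

```python
from math import sqrt

scores = [10, 7, 6, 10, 6, 15]
n = len(scores)                            # 6

sum_x = sum(scores)                        # 54
sum_x_squared = sum(x ** 2 for x in scores)  # 546

# Computational formula: SS = sum(X^2) - (sum X)^2 / n
ss = sum_x_squared - sum_x ** 2 / n        # 546 - 486 = 60.0

# Definitional formula as a cross-check: SS = sum of (X - M)^2
m = sum_x / n                              # sample mean = 9.0
ss_check = sum((x - m) ** 2 for x in scores)

sample_variance = ss / (n - 1)             # 60 / 5 = 12.0
sample_sd = sqrt(sample_variance)

print(ss, ss_check, sample_variance, round(sample_sd, 2))
# 60.0 60.0 12.0 3.46
```

Dividing by n instead of n - 1 would give 10 rather than 12, which is the population formula and would underestimate the variability when the scores are a sample.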
Sample Variability and Degrees of Freedom
~ Population variance
o Mean is known
o Deviations are computed from a known mean
~ Sample variance as estimate of population
o Population mean is unknown
o Using sample mean restricts variability
~ Degrees of freedom
o Number of scores in sample that are independent and free to vary
o Degrees of freedom (df) = n – 1
~ So we know that with the variance for a population the mean is known, unlike with variance for a
sample where the mean of the population is unknown.
~ Hence we measure distance from the sample mean.
~ And we know that we must first compute the sample mean before we can begin to compute
deviation.
~ But calculating the value of M places a restriction on the variability of scores in a sample.
~ This can be demonstrated in the following table.
X A sample of n=3 scores with a mean of 5
2
9
------ (What is the third score?)
~ For example you have sample of n=3 scores and compute a mean of M=5.
~ The first two scores in the sample have no restrictions and we can see that the third score is
restricted, so what do we do?
~ We see that the third score must be 4. we say 4 because the entire sample has a Mean of 5 so for 3
scores to have a mean of 5 the total must be 15.
~ So to get to 15, we can see that the first two scores added together give us 11, therefore 4 is left to make
up 15; hence we say that 4 is the restricted value.
~ So the first two scores were free to have any value but the third score was dependent on the first
two.
~ So the first n-1 scores in a sample are free to vary but the final score is restricted;
as a result, the sample is said to have n-1 degrees of freedom.
~ The degrees of freedom determine the number of scores in a sample that are free to vary.
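The restriction described above can be sketched in code: once the sample mean is fixed, the last score is forced by the others. The function name `restricted_score` is a hypothetical helper for illustration.

```python
def restricted_score(free_scores, n, sample_mean):
    """Given n - 1 freely chosen scores and the sample mean,
    return the one value the final score must take."""
    required_total = n * sample_mean  # 3 * 5 = 15 in the example
    return required_total - sum(free_scores)


third_score = restricted_score([2, 9], n=3, sample_mean=5)
print(third_score)  # 4

degrees_of_freedom = 3 - 1  # df = n - 1
print(degrees_of_freedom)   # 2
```

Changing either of the first two scores changes the forced third value, but there is never any freedom left for it once M is known.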
Chapter 5: z-Scores: Location of Scores and Standardized Distributions
Introduction & Purpose of z-scores
~ Identify and describe location of every
score in the distribution
~ Standardize an entire distribution
~ Take different distributions and make them equivalent and comparable
~ In this chapter we will discuss z-scores and the location in a distribution.
~ So, we will learn how we can take any raw score of a test or an assessment for example and convert it
into a standardized score.
~ Okay, so when we talk about z-scores we talk about turning a raw score value into a z-score or a
standardized score.
~ Why?
~ Two purposes: when we convert a raw score into a standard score, we are able to tell the exact
location of an original score within the entire distribution of people that took the test.
~ Now more importantly z-scores or standardized scores allow us to directly compare the results of
scores that come from two completely different distributions that have their own mean and own
standard deviation.
~ So, by taking those two raw scores and putting it on a scale we can see which is more or less
competitive than the other.
~ One practical example you can use to introduce z-scores and standardization involves baseball.
~ How could you compare the performance of a player in 1968 to one in 2000?
~ We know that scoring was much lower in 1968, so if we simply looked at their raw batting averages,
or home runs hit, the player from 1968 would likely appear to be much worse.
~ But if we standardize their scores—by comparing them to the mean batting average (or home runs
hit) in 1968 and 2000, we now have a common metric to compare the scores: by how well they did
relative to other players in that season.
Figure 5.1 Two Exam Score Distributions
~ In the Unit Normal Table, we can see the columns labelled A, B, C, and D at the top.
~ So remember what we learnt in the previous podcasts: to transform an X value into a z-score we use the
equation z = (X − μ) / σ for populations, and for samples we use z = (X − M) / s.
~ So once you transform your raw score into the z-score, you look up the z-score in the table under
column A, but this is given here already in the table.
~ Please also note that the z-score is entered to two decimal places, so adhere to the standard
rounding rules.
~ The next column B represents Proportion in the body, and over here we can see an illustration of
what it looks like.
~ So if we are interested in the area (the probability or proportion) in a distribution below a positive z-score
(positive because it is above the mean), we would sketch the distribution like
this, transform the X value into a z-score and then report the area in column B, which
represents the proportion in the body and which will always be greater than 0.5 or 50%.
~ Here we can mirror the negative z-score
$z = \frac{X - \mu}{\sigma}$
~ So again, a normal distribution is symmetrical, so we can mirror this for a negative z-score, where
the value is below the mean: the proportion in the body is then the area above the z-score.
~ Now in the next Column C, we would be interested in the area above a positive z-score.
~ So a value above a positive z-score, which we tend to see as the smaller area in the distribution (the tail).
~ So because the distribution is symmetrical we can say the same for the other side in terms of below a
negative z-score, again the proportion or probability of a score that is less than the mean.
$z = \frac{X - M}{s}$
~ Finally we have column D, which is referred to as the proportion between the mean and z.
~ So we may be interested in the proportion or probability of scores between the mean and a z-score.
~ So here in the distribution it pertains to a positive z-score and we can consider the same, the area
between the mean and the z for a negative z-score.
~ Okay, so in the next podcast I will show you how to use the unit normal table to find proportions and
probabilities for a specific z-score.
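The columns described above can be approximated in code. This is a sketch that assumes Python's standard-library `statistics.NormalDist` as a stand-in for the printed unit normal table; the function name `unit_normal_row` and the dictionary keys are illustrative, not from the textbook.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def unit_normal_row(z):
    """Return columns B, C and D for a z-score (column A).

    The table only lists positive z values; by symmetry a negative
    z-score has the same proportions, so the absolute value is used.
    """
    z = abs(z)
    body = standard_normal.cdf(z)  # column B: larger part, always > 0.5
    tail = 1 - body                # column C: smaller part
    between = body - 0.5           # column D: between the mean and z
    return {"body": body, "tail": tail, "mean_to_z": between}


row = unit_normal_row(1.00)
print({name: round(p, 4) for name, p in row.items()})
# {'body': 0.8413, 'tail': 0.1587, 'mean_to_z': 0.3413}
```

Which column answers a question still depends on the sketch: for a negative z-score the body lies above the score and the tail below it.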
Probability/Proportion & z-scores
~ Unit normal table lists relationships between z-score locations and proportions in a normal
distribution.
~ If you know the z-score, you can look up
the corresponding proportion.
~ If you know the proportion, you can use the table to find a specific z-score location.
~ Probability is equivalent to proportion.
~ In this podcast I will show you how to find the Proportions / probabilities for a specific z-score and Z-
Score locations that correspond to specific proportions.
~ There are a few steps to follow:
~ Firstly, to find proportions or probabilities for a specific z-score value you always:
~ Sketch the distribution and shade the area of interest.
~ Transform X values into z-scores
~ Enter the Unit Normal Table using column A and reference the appropriate proportion (Body, Tail, or
Area between Mean and z) according to the sketch drawn in the first step.
~ Remember it is vital to first sketch the distribution. If you don’t, you may make careless mistakes.
~ Now let’s look at an example… In the next slide
~ Remember our equations for z-score…
Example
~ A distribution of scores is normal with a mean of μ = 45 and a standard deviation of σ = 4.
~ What is the probability of randomly selecting a score that is greater than 43?
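Following the steps listed earlier (sketch, transform to z, look up the proportion), this question can be worked through as below. It is a sketch that assumes `statistics.NormalDist` in place of the printed table, so the result may differ from the table in the last decimal place.

```python
from statistics import NormalDist

mu, sigma = 45, 4
x = 43

# Step 2: transform the X value into a z-score.
z = (x - mu) / sigma  # (43 - 45) / 4 = -0.5

# Step 3: the shaded area is everything ABOVE a negative z-score,
# which is the body (the larger part) of the distribution.
p_greater = 1 - NormalDist().cdf(z)

print(z, round(p_greater, 4))  # -0.5 0.6915
```

So the probability of randomly selecting a score greater than 43 is about 0.69, or 69%. Because 43 is below the mean, an answer above 50% is what the sketch should lead you to expect.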
Refer to the Table
$X = \mu + z\sigma$
$= 24.3 + 1.28(10)$
$= 24.3 + 12.8$
$= 37.1$ min → the 10%
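The reverse lookup above (from a proportion to an X value) can be sketched as follows. Two assumptions are made here and are not stated in the notes: that "the 10%" refers to the upper tail of the distribution, which is what a positive z of 1.28 implies, and that `statistics.NormalDist.inv_cdf` is an acceptable stand-in for reading the table backwards.

```python
from statistics import NormalDist

mu, sigma = 24.3, 10  # values from the worked example, in minutes

# Assumption: the 10% is the upper tail, so 90% of scores lie below the cutoff.
z_cutoff = NormalDist().inv_cdf(0.90)

# X = mu + z * sigma
x_cutoff = mu + z_cutoff * sigma

print(round(z_cutoff, 2), round(x_cutoff, 1))  # 1.28 37.1
```

The table value z = 1.28 is a rounded version of the more exact 1.2816, which is why both routes agree on 37.1 minutes to one decimal place.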
The Unit Normal Table to Locate the z-Score
Calculating the X value corresponding to proportions or probability
~ The probabilities given in the unit normal table will be accurate only for normally distributed scores so
the shape of the distribution should be verified before using it.
~ For normally distributed scores
o Transform the X scores (values) into z-scores
o Look up the proportions corresponding to the z-score values