Chapter 11: Tests for differences in means

 Nonparametric tests
o Do not rely on the parameters of a known distribution and do not assume the data follow a normal distribution. If the distribution is skewed, the mean does not carry the same meaning for us as it does when the data are distributed symmetrically
 Parametric tests
o Tests that rely on statistics from a known distribution, such as the normal distribution
 The Jarque-Bera test is used to see if your data are drawn from a normal distribution. If the distribution is normal, the JB statistic will be close to zero. The JB statistic gets larger as the sample size increases, which means we can tolerate less and less skew before rejecting normality (a sketch of this check appears below)
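A minimal sketch (not from the textbook) of how the Jarque-Bera check might be run in Python with scipy; the variable name rates and the simulated data are assumptions for illustration:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
rates = rng.normal(loc=50, scale=10, size=200)  # hypothetical sample

# the JB statistic is near zero for normally distributed data
jb_stat, p_value = stats.jarque_bera(rates)
print(jb_stat, p_value)  # a large p-value means we fail to reject normality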

Single-sample means test:

 Compares the mean of a sample to a pre-specified value and tests for a deviation from that
value. It is used to compare the mean of a variable in a sample of data to a (hypothesized) mean
in the population from which our sample data are drawn. This is important because we rarely
ever have access to data for an entire population
 The general rule for hypothesis testing is that you have the observed difference in the numerator and what is effectively the expected difference (the standard error) in the denominator. The denominator gets smaller and smaller as the sample size increases, because you have more and more of the population in your sample. You then see if the observed difference is much larger than the expected difference by comparing the ratio to some known distribution (a sketch follows this list)
 181 (last part)
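A hedged sketch of a single-sample means test in Python; the hypothesized population mean of 100 and the simulated sample are assumptions for illustration:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=102, scale=15, size=40)  # hypothetical sample

# H0: the population mean equals the pre-specified value 100
t_stat, p_value = stats.ttest_1samp(sample, popmean=100)
print(t_stat, p_value)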

Independent (two-) sample test for difference in means:

 Why do we care about sample means? Because we use them to draw conclusions about the populations they come from: each sample is assumed to be representative of its population
 The test is nearly the same as the other tests we have done, but different because we want to know if there is a difference between the two sample means, using their respective standard deviations
 We want to know if the difference that we observed is out of the ordinary, given that we are using sample data. To form the z-statistic, we calculate the standard error of the difference in the means and divide the observed difference by it
 By placing restrictions (for example, assuming the two populations have equal variances) we can be more precise: the standard error of the difference in means will be smaller in magnitude with this restriction. In the end, we would be more likely to reject the null hypothesis of equal means, because our expectation of the difference will be smaller (a sketch follows this list)
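A rough sketch of the independent two-sample test in Python; the two simulated "city" samples are assumptions, and the equal_var argument switches the equal-variance (pooled) restriction described above on or off:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
city_a = rng.normal(loc=30, scale=8, size=50)  # hypothetical samples
city_b = rng.normal(loc=34, scale=8, size=45)

# pooled test: restriction of equal population variances
t_pooled, p_pooled = stats.ttest_ind(city_a, city_b, equal_var=True)
# Welch's test: no equal-variance restriction
t_welch, p_welch = stats.ttest_ind(city_a, city_b, equal_var=False)
print(p_pooled, p_welch)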

Wilcoxon rank sum W (Mann-Whitney U) test:

 Often a two-sample difference of means test is not appropriate, for one of two reasons:
o Your data are interval or ratio, but they are not from a normal distribution
o Or, you have data that are ordinal or ranked, so the mean and standard deviation do not have the meanings they should
 If these are present, use a nonparametric test for two independent samples
 The Wilcoxon Rank Sum Test:
o Combine all of the data together
o Rank the combined data from lowest to highest
o Separate the samples
o Calculate the sum of the ranks for each sample
 If the samples are drawn from the same population, the sum of the ranks should be similar if the
sample sizes are the same
 If two samples have similar summed ranks when they are smooshed together into one data set, ranked, and then separated, the two “distributions” of those data should be similar. If the data have summed ranks that are similar, the data would look something like this when smooshed together: XXYYXYXYYYXXYXXYYX
o When it looks like this, you wouldn’t know which data value came from which sample just by looking at it
 Step 5: Calculate the test statistic using the z/t test:
o For this we need the theoretical mean and standard deviation of the rank sum, which are calculated from the sample sizes and which increase as the sample sizes increase
o This does not occur with the parametric mean: it increases or decreases but gets closer and closer to the population mean
 Step 6: Compare Zw with the critical values from the Zw table
o If the Zw you calculate is between the lower and upper critical values, Zwl* and Zwu*, you fail to reject the null hypothesis of equal ranks (a sketch follows this list)
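A minimal sketch of the rank-based tests in Python; the skewed simulated samples are assumptions. scipy's ranksums uses the large-sample z approximation, and mannwhitneyu reports the equivalent U statistic on the same data:

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample_x = rng.exponential(scale=2.0, size=30)  # hypothetical skewed data
sample_y = rng.exponential(scale=2.5, size=35)

# Wilcoxon rank-sum test (z approximation)
z_w, p_w = stats.ranksums(sample_x, sample_y)

# Mann-Whitney U test on the same data
u_stat, p_u = stats.mannwhitneyu(sample_x, sample_y, alternative="two-sided")
print(z_w, p_w, u_stat, p_u)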

The Mann-Whitney U statistic:

 Measures the number of times an observation from the smaller sample ranks lower than an
observation from the larger sample

Matched pairs (dependent sample) difference tests:

 If you have two samples taken at the same time from different places, with people asked their views on capital punishment, these samples are independent. The views of one spatial unit (a city) are not dependent on the views of another city; they can be influenced by the same things, like media and politicians, but there is no direct link between the two samples
o What will be dependent is a sample from the same city, with the same people asked about their views on capital punishment before and after an information session covering the issues
 Matched pairs: there is a matching observation (same person, place, and thing) in the other
sample
 There are two tests for these:
o Parametric (a test for interval/ratio data)
o Nonparametric, for ranked data or interval/ratio data that has been converted to ordinal data

Matched pairs t-test:

 This test considers the difference between the values of each matched pair (same person, place, thing), and we want to see if there is a pattern
 To find the difference for the matched pair, you take the difference between the two variables
 The rest of the steps are the same: calculate the test statistic and compare it with the t-table or calculate a p-value (a sketch follows this list)
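A hedged sketch of the matched-pairs t-test in Python; the before/after scores for the same hypothetical people are assumptions for illustration:

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
before = rng.normal(loc=60, scale=10, size=25)        # hypothetical scores
after = before + rng.normal(loc=3, scale=5, size=25)  # same people, re-measured

# equivalent to a one-sample t-test on the pairwise differences
t_stat, p_value = stats.ttest_rel(after, before)
print(t_stat, p_value)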

Wilcoxon matched-pairs signed-rank test:

 When data is “strongly ordered” it means each observation has its own rank
 We also measure the differences in absolute terms, rank those absolute differences, and then attach the sign of each change to its rank. It is common not to have strongly ordered data
 Next, we sum up the positive ranks, sum up the negative ranks, and then determine the total
number of matched pairs
 When the sample size increases, we can use more “traditional” statistical tables to perform hypothesis tests (a sketch follows this list)
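A minimal sketch of the Wilcoxon matched-pairs signed-rank test in Python, reusing the same hypothetical before/after setup (an assumption for illustration):

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
before = rng.normal(loc=60, scale=10, size=25)
after = before + rng.normal(loc=3, scale=5, size=25)

# ranks the absolute differences, then compares positive and negative rank sums
w_stat, p_value = stats.wilcoxon(after, before)
print(w_stat, p_value)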

Two-sample difference of proportions test:

 Here we work with calculated percentages, which are ratio data; with proportions we always calculate a z-statistic because these proportions are approximately normally distributed
 Used if we want to know whether different neighborhoods within a city support some initiative
 There are three steps to this:
o Calculate the proportions of yes, no, pass, fail, support, etc.
o Calculate the test statistic. Only if the difference in proportions is large enough is that
difference considered statistically significant
o Compare the calculated value with a z-table or calculate p-value
 To get the p-value (a sketch follows this list):
o Identify the correct test statistic
o Calculate the test statistic using the relevant properties of your sample
o Specify the characteristics of the test statistic’s sampling distribution
o Place your test statistic in the sampling distribution to find the p-value
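A sketch of the two-sample difference of proportions test computed by hand in Python; the neighborhood counts are hypothetical assumptions:

import numpy as np
from scipy import stats

# hypothetical "yes" counts and sample sizes for two neighborhoods
yes_1, n_1 = 120, 300
yes_2, n_2 = 95, 280

p1, p2 = yes_1 / n_1, yes_2 / n_2
p_pool = (yes_1 + yes_2) / (n_1 + n_2)                     # pooled proportion under H0
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_1 + 1 / n_2))  # standard error of the difference

z = (p1 - p2) / se                    # observed difference over expected difference
p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value
print(z, p_value)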

Application using textbook data:


 Now we will compare the means from two independent samples: the theft rate and the residential burglary rate. Before we do a parametric test, we check whether they are normally distributed: we know that the theft rate is not normally distributed, and the Jarque-Bera test performed on the residential burglary rate gives the same result, p < 0.01
