
Stat 250 Gunderson Lecture Notes

9: Learning about the Difference in Population Means

Part 1: Distribution for a Difference in Sample Means

The Independent Samples Scenario
Recall that two samples are said to be independent samples when the measurements in one
sample are not related to the measurements in the other sample. Independent samples are
generated in a variety of ways. Some common ways:

- Random samples are taken separately from two populations and the same response
  variable is recorded for each individual.
- One random sample is taken and a variable is recorded for each individual, but then
  units are categorized as belonging to one population or another, e.g. male/female.
- Participants are randomly assigned to one of two treatment conditions, such as diet or
  exercise, and the same response variable, such as weight loss, is recorded for each
  individual unit.

If the response variable is quantitative, a researcher might compare two independent groups by
looking at the difference between the two means.

Sampling Distribution for the Difference in Two Sample Means

Family Dinners and Teen Stress
A study was conducted to look at the relationship between the number of times a teen has dinner
with their family and the level of stress in the teen's life. Teens were asked to rate the level of stress
in their lives on a point scale of 0 to 100.

The researcher would like to estimate the difference in the population mean stress level for teens
who have frequent family dinners (group 1) versus the population mean stress level for teens
who have infrequent family dinners (group 2).

A Typical Summary of Responses for a Two Independent Samples Problem

Population      Sample Size   Sample Mean   Sample Standard Deviation
1 Frequent      10            53.5          15.7
2 Infrequent    10            65.5          14.6

Let $\mu_1$ be the population mean stress level for all teens who have frequent family dinners.
Let $\mu_2$ be the population mean stress level for all teens who have infrequent family dinners.

We want to learn about $\mu_1$ and $\mu_2$ and how they compare to each other. We could estimate
the difference in population means $\mu_1 - \mu_2$ with the difference in the sample means $\bar{x}_1 - \bar{x}_2$.
Will it be a good estimate?


Can anyone say how close this observed difference in sample mean stress levels $\bar{x}_1 - \bar{x}_2$ of
12 points is to the true difference in population means $\mu_1 - \mu_2$? No.

If we were to repeat this survey (with samples of the same sizes), would we get the same value
for the difference in sample means? Probably not.

Is a difference in the sample means of 12 points large enough to convince us that there is a real
difference in the means for the populations of teens? Maybe, maybe not; it depends on a number
of things (sample sizes, variability, whether the samples were randomly selected, etc.).

So what are the possible values for the difference in sample means $\bar{x}_1 - \bar{x}_2$ if we took many sets
of independent random samples of the same sizes from these two populations? What would
the distribution of the possible $\bar{x}_1 - \bar{x}_2$ values look like?

What can we say about the distribution of the difference in two sample means?
That is, we want to learn about the sampling distribution for $\bar{x}_1 - \bar{x}_2$.

Using results from how to handle differences of independent random variables and the results
for the sampling distribution for a single sample mean, the sampling distribution of the
difference in two sample means $\bar{x}_1 - \bar{x}_2$ can be determined.

First recall that when working with the difference in two independent random variables:
- the mean of the difference is just the difference in the two means
- the variance of the difference is the sum of the variances

Next, remember that the standard deviation of a sample mean is $\sigma / \sqrt{n}$.

So what would the variance of a single sample mean be? It is the square of the standard deviation, $\sigma^2 / n$.

So let's apply these ideas to our newest statistic of interest, the difference in two sample
means $\bar{x}_1 - \bar{x}_2$.
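
Putting the two recalled rules together with the standard deviation of a single sample mean gives a short derivation. This is only a sketch, assuming the two samples are independent; $\sigma_1$ and $\sigma_2$ denote the two population standard deviations.

```latex
\begin{align*}
E(\bar{x}_1 - \bar{x}_2) &= E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 \\
\operatorname{Var}(\bar{x}_1 - \bar{x}_2) &= \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2)
  = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \\
\operatorname{s.d.}(\bar{x}_1 - \bar{x}_2) &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
\end{align*}
```

Note that the variances add even though the means are subtracted.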

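Sampling Distribution of the Difference in Two (Independent) Sample Means

If the two populations are normally distributed (or sample sizes are both large enough),
then $\bar{x}_1 - \bar{x}_2$ is (approximately)

$$N\left( \mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \right)$$

where the second entry is the standard deviation.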
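
A small simulation can make this sampling distribution concrete. The sketch below is not part of the original notes: the population means, standard deviations, and seed are hypothetical values chosen only for illustration, and both populations are assumed to be normal.

```r
# Simulate the sampling distribution of xbar1 - xbar2.
# All population values below are hypothetical, chosen only for illustration.
set.seed(250)
mu1 <- 55; sigma1 <- 15; n1 <- 10
mu2 <- 65; sigma2 <- 15; n2 <- 10

# One repetition: draw both samples and return the difference in sample means.
one_difference <- function() {
  mean(rnorm(n1, mu1, sigma1)) - mean(rnorm(n2, mu2, sigma2))
}
diffs <- replicate(10000, one_difference())

# Values predicted by the sampling distribution result.
theory_mean <- mu1 - mu2
theory_sd   <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)

cat("simulated mean:", round(mean(diffs), 2), " theory:", theory_mean, "\n")
cat("simulated s.d.:", round(sd(diffs), 2), " theory:", round(theory_sd, 2), "\n")
# hist(diffs) would display the roughly bell-shaped distribution.
```

The simulated mean and standard deviation should land close to the theoretical values, with small differences from run to run if the seed is changed.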


Since the population standard deviations $\sigma_1$ and $\sigma_2$ are generally not known, we will use the
data to compute the standard error of the difference in sample means.

Standard Error of the Difference in Sample Means

$$\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $s_1$ and $s_2$ are the two sample standard deviations.
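
As a quick check of the formula, the standard error can be computed from summary statistics alone. The helper name `se_diff` is hypothetical (it is not from the notes); the inputs are the teen stress summary values given earlier.

```r
# Standard error of xbar1 - xbar2 from summary statistics (unpooled).
# se_diff is a hypothetical helper name used only for this sketch.
se_diff <- function(s1, n1, s2, n2) {
  sqrt(s1^2 / n1 + s2^2 / n2)
}

# Teen stress summary: s1 = 15.7, n1 = 10 (frequent); s2 = 14.6, n2 = 10 (infrequent)
se_stress <- se_diff(15.7, 10, 14.6, 10)
print(round(se_stress, 2))
```

For these values the result is about 6.78 points.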

The standard error of $\bar{x}_1 - \bar{x}_2$ estimates, roughly, the average distance of the possible
$\bar{x}_1 - \bar{x}_2$ values from $\mu_1 - \mu_2$. The possible $\bar{x}_1 - \bar{x}_2$ values result from considering all possible
independent random samples of the same sizes from the same two populations.

Moreover, we can use this standard error to produce a range of values that we are
very confident will contain the difference in the population means $\mu_1 - \mu_2$, namely,
$(\bar{x}_1 - \bar{x}_2) \pm (\text{a few}) \times \text{s.e.}(\bar{x}_1 - \bar{x}_2)$. This is the basis for the confidence interval for the difference in
population means discussed in Part 2.

Looking ahead:
Do you think the "few" in the above expression will be a z* value or a t* value?
What do you think will be the degrees of freedom?
It will be t*, as we are learning about population means, the population standard deviations are unknown, and
sample sizes may not necessarily be large. The df for a single sample was $n - 1$; however, for the difference
between two means, there will be two different versions for a CI that will depend on whether or not
another assumption about the populations holds. Here is the idea for the df for one version: think
about the df for each sample, $n_1 - 1$ and $n_2 - 1$; the total df then would be $(n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$.

We will use the standard error of the difference in the sample means to compute a standardized
test statistic for testing hypotheses about the difference in the population means $\mu_1 - \mu_2$,
namely,

$$\frac{\text{Sample statistic} - \text{Null value}}{\text{(Null) standard error}}$$

This is the basis for testing covered in Part 3.

Looking ahead:
Do you think the standardized test statistic will be a z statistic or a t statistic?
What do you think will be the most common null value used?
It will be a t statistic (actually two versions again), as we are testing a theory about population means,
the population standard deviations are unknown, so the standard error for the difference will be in the
denominator, and sample sizes may not necessarily be large.

$H_0$: $\mu_1 - \mu_2 = 0$


Part 2: Confidence Interval for a Difference in Population Means

Confidence Interval for the Difference in Two Population Means

General (Unpooled) Approach

- We have two populations or groups from which independent samples are available (or
  one population for which two groups can be formed using a categorical variable).
- The response variable is quantitative and we are interested in comparing the means for
  the two populations.

A Typical Summary of the Responses for a Two Independent Samples Problem:

Population   Sample Size   Sample Mean    Sample Standard Deviation
1            $n_1$         $\bar{x}_1$    $s_1$
2            $n_2$         $\bar{x}_2$    $s_2$

Let $\mu_1$ be the mean response for the first population and $\mu_2$ be the mean response for the
second population.

Parameter of interest: the difference in the population means $\mu_1 - \mu_2$.
Sample estimate: the difference in the sample means $\bar{x}_1 - \bar{x}_2$.
Standard error: $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$, where $s_1$ and $s_2$ are the sample standard deviations.

So we have our estimate of the difference in the two population means, namely $\bar{x}_1 - \bar{x}_2$, and we
have its standard error. To make our confidence interval, we need to know the multiplier. The
multiplier t* is a t value such that the area between -t* and t* equals the desired confidence
level. The degrees of freedom for the t distribution will depend on whether we use an ugly
formula (used by software packages) or we use a conservative by-hand approach.

General Two Independent Samples t Confidence Interval for $\mu_1 - \mu_2$

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \, \text{s.e.}(\bar{x}_1 - \bar{x}_2)$$

where $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ and $t^*$ is the appropriate value for a t distribution, and the
df can be found using an approximation or conservatively as df = smaller of ($n_1 - 1$ or $n_2 - 1$).

This interval requires we have independent random samples from normal populations.
If the sample sizes are large (both > 30), the assumption of normality is not so crucial and the
result is approximate.
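
For reference, the approximation that software typically uses for the unpooled degrees of freedom is commonly known as the Welch (or Welch-Satterthwaite) approximation. The notes do not print the formula at this point, so the standard form is sketched here:

```latex
\[
\text{df} \approx
\frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^{2}}
     {\dfrac{1}{n_1 - 1}\left( \dfrac{s_1^2}{n_1} \right)^{2}
      + \dfrac{1}{n_2 - 1}\left( \dfrac{s_2^2}{n_2} \right)^{2}}
\]
```

The result is usually not a whole number, and it always falls between the smaller of $n_1 - 1$ and $n_2 - 1$ and the total $n_1 + n_2 - 2$, which is why the by-hand choice is the conservative one.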


The Pooled Approach

If we can further assume the population variances are (unknown but) equal, then there is a
procedure for which the t* multiplier is easier to find using an exact (not approximate) t
distribution. It involves pooling the sample variances for an overall estimate and updating the
standard error accordingly.

It sometimes may be reasonable to assume that the measurements in the two populations have
the same variances, so that $\sigma_1^2 = \sigma_2^2 = \sigma^2$, where $\sigma^2$ denotes the common population variance.

Since both sample variances would be estimating the common population variance, it would
make sense to combine or pool the two sample variances together to form an overall estimate.

Pooled standard deviation:

$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$$

Notes:
(1) Each sample variance is weighted by the corresponding degrees of freedom.
    So a larger sample size will result in a larger weight for that sample variance.
(2) The denominator gives the total degrees of freedom:

    df = $(n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$

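Replacing the individual standard deviations $s_1$ and $s_2$ with the pooled version $s_p$ in the formula
for the standard error gives the pooled standard error of $\bar{x}_1 - \bar{x}_2$:

$$\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$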

Pooled Two Independent Samples t Confidence Interval for $\mu_1 - \mu_2$

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \, \text{pooled s.e.}(\bar{x}_1 - \bar{x}_2)$$

where $\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2) = s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

and $s_p = \sqrt{\dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$

and $t^*$ is the appropriate value for a $t(n_1 + n_2 - 2)$ distribution.

This interval requires we have independent random samples from normal populations with
equal population variances. If the sample sizes are large (both > 30), the assumption of normality
is not so crucial and the result is approximate.
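
The pooled interval can be assembled step by step from summary statistics. The following is a minimal sketch, not the course's own code: the function name `pooled_ci` and its argument names are hypothetical, and the inputs are the teen stress summary values from Part 1.

```r
# Pooled two-sample t confidence interval from summary statistics.
# pooled_ci is a hypothetical helper written only for this illustration.
pooled_ci <- function(xbar1, s1, n1, xbar2, s2, n2, conf_level = 0.95) {
  df <- n1 + n2 - 2
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)  # pooled standard deviation
  se <- sp * sqrt(1 / n1 + 1 / n2)                       # pooled standard error
  t_star <- qt(1 - (1 - conf_level) / 2, df)             # t* multiplier
  estimate <- xbar1 - xbar2
  list(sp = sp, se = se, df = df, t_star = t_star,
       lower = estimate - t_star * se, upper = estimate + t_star * se)
}

# Teen stress summary statistics: group 1 = frequent, group 2 = infrequent
stress_ci <- pooled_ci(53.5, 15.7, 10, 65.5, 14.6, 10)
print(sapply(stress_ci, round, 2))
```

Because `qt` returns the multiplier to more decimal places than a printed t table, the endpoints may differ slightly in the second decimal from a by-hand calculation that uses a rounded t*.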


Notes:

(1) Some computer software packages will provide results for both the unpooled and the pooled
    two independent samples t procedures automatically. Others, such as R, will require you to
    explore the data in appropriate ways to help decide which method you wish to use up front
    as you request the analysis.

(2) First compare the sample standard deviations. If the sample standard deviations are similar,
    the assumption of common population variance is reasonable and the pooled procedure can
    be used. If the sample sizes happen to be the same, the pooled and unpooled standard errors
    are equal anyway. The advantage for the pooled version is that finding the df is simpler.

(3) A graphical tool to help assess if equal population variances is reasonable is side-by-side
    boxplots. If the lengths of the boxes (the IQRs) and overall ranges between the two groups
    are very different, the pooled method may not be reasonable.

(4) Some computer software also provides, or allows you to produce first, the results of a Levene's
    test for assessing if the population variances can be assumed equal.

    The null hypothesis for this test is that the population variances are equal. So a small p-value
    for Levene's test would lead to rejecting that null hypothesis and concluding that the pooled
    procedure should not be used.

    Often a significance level of 10% is used for this condition checking. Your lab workbook
    provides more details about Levene's test. We will see Levene's test results in some of our
    examples ahead.
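
To see what a Levene's test with center = "mean" is doing, it can be written as a one-way ANOVA on the absolute deviations of each observation from its own group mean. The sketch below uses only base R; the data values and group labels are hypothetical, invented just to make the example runnable, and the result should agree with what `leveneTest(..., center="mean")` from the car package reports for the same data.

```r
# Levene's test (center = mean) computed as a one-way ANOVA on the
# absolute deviations from each group's mean.
# The data below are hypothetical, invented only to make the sketch runnable.
scores <- c(52, 61, 47, 70, 58, 49,   66, 80, 59, 91, 73, 62)
group  <- factor(rep(c("A", "B"), each = 6))

abs_dev <- abs(scores - ave(scores, group))   # |y - mean of its group|
levene  <- anova(lm(abs_dev ~ group))         # ANOVA table for the deviations

f_value <- levene[["F value"]][1]
p_value <- levene[["Pr(>F)"]][1]
cat("F =", round(f_value, 3), " p-value =", round(p_value, 3), "\n")
# A p-value above 0.10 would give no evidence against equal population variances.
```

The test compares how spread out the two groups are, not where their centers are, which is why it works on deviations from the group means.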

Bottom line: Pool if reasonable; but if the sample standard deviations are not similar, we have
the unpooled procedure that can be used.

Try It! Comparing Stress Level Scores

A study was conducted to look at the relationship between the number of times a teen has dinner
with their family and the level of stress in the teen's life. Teens were asked to rate the level of stress
in their lives on a point scale of 0 to 100.

The researcher would like to estimate the difference in the population mean stress level for teens
who have frequent family dinners (group 1) versus the population mean stress level for teens
who have infrequent family dinners (group 2).

Here is a partial listing of the data in R. Note there are two Dinner Group variables: one is
numerical (as was in the original data set) and the other is categorical (needed for R).


Some descriptive summaries, side-by-side boxplots, and Levene's test results are provided first.

Population      Sample Size   Sample Mean   Sample Standard Deviation
1 Frequent      10            53.5          15.7
2 Infrequent    10            65.5          14.6

> with(Dataset, tapply(stresslevel, group, var, na.rm=TRUE))
  Frequent Infrequent
  244.7222   213.6111
> leveneTest(stresslevel ~ group, data=Dataset, center="mean")
Levene's Test for Homogeneity of Variance (center = "mean")
      Df F value Pr(>F)
group  1  0.3958 0.5372
      18

a. One of the assumptions for the pooled two independent samples confidence interval to be
   valid is that the two populations (from which we took our samples) have the same standard
   deviation. Look at the two sample standard deviations, the boxplots, and the Levene's test
   result. Does the assumption seem to hold (at the 10% level)? Explain.
   Yes, the two sample standard deviations are similar, supporting the equal population variance
   assumption. The p-value for Levene's test of 0.5372 is larger than 0.10, so we cannot reject the
   hypothesis that the population variances are equal.


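b. Give a 95% confidence interval for the difference in the population mean stress scores, that
   is, for $\mu_1 - \mu_2$. Show all work.

$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(10 - 1) 15.7^2 + (10 - 1) 14.6^2}{10 + 10 - 2}} = \sqrt{229.825} = 15.16$$

$$\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2) = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 15.16 \sqrt{\frac{1}{10} + \frac{1}{10}} = 6.78$$

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \, \text{pooled s.e.}(\bar{x}_1 - \bar{x}_2) \Rightarrow (53.5 - 65.5) \pm (2.10)(6.78) \Rightarrow -12 \pm 14.24$$

So the interval goes from -26.24 to 2.24.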

c. Based on the interval, does there appear to be a difference in the mean stress scores for the
   two populations? Explain. Note that this interval does contain the value of 0, so 0 is a
   reasonable value for the difference in the population mean stress level scores. Thus there does
   not appear to be a difference in the mean stress level for the population of teens who frequently
   dine with family versus those who infrequently dine with family (based on the data).

We could use R Commander to generate the t-test output using Statistics > Means >
Independent Samples T-Test. Under the Options tab, since we want a (two-sided) confidence
interval, we select two-sided for the alternate direction. Set the confidence level and the
appropriate setting for "Allow equal variances?"

> t.test(stresslevel~group, alternative='two.sided', conf.level=.95,
+   var.equal=TRUE, data=Dataset)

        Two Sample t-test

data:  stresslevel by group
t = -1.7725, df = 18, p-value = 0.09323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -26.223309   2.223309
sample estimates:
  mean in group Frequent mean in group Infrequent
                    53.5                     65.5


Try It! Stroop's Word Color Test
In Stroop's Word Color Test, words that are color names are shown in colors different from the
word. For example, the word red might be displayed in blue. The task is to correctly identify the
display color of each word; in the example just given the correct response would be blue.

Gustafson and Kallmen (1990) recorded the time needed to complete the Color Test for n = 16
individuals after they had consumed alcohol and for n = 16 other individuals after they had
consumed a placebo drink flavored to taste as if it contained alcohol. Each group was balanced
with 8 men and 8 women.

In the alcohol group, the mean completion time was 113.75 seconds and the standard deviation was
22.64 seconds. In the placebo group, the mean completion time was 99.87 seconds and the standard
deviation was 12.04 seconds.

Group         Sample size   Sample mean   Sample standard deviation
1 = alcohol   16            113.75        22.64
2 = placebo   16            99.87         12.04

We can assume that the two samples are independent random samples and that the model for
completion time is normal for each population.
a. What graph(s) would you make to check the normality condition? Be specific. Make a QQ plot
   for each set of data (so two QQ plots). Could also look at histograms (one for each sample).

b. How did the two groups compare descriptively?
   The alcohol group took nearly 14 seconds longer to complete the test as compared to the placebo group.
   The alcohol group also had more variability in completion times, with a standard deviation of about 22 seconds,
   compared to only about 12 seconds for the placebo group times.

c. Which procedure? Pooled or unpooled? Why? As these two sample standard deviations seem to
   be different, perhaps the unpooled version would be better (safer).

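d. Calculate a 95% confidence interval for the difference in population means.
   First, find the standard error of the difference in the two sample means:

$$\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{22.64^2}{16} + \frac{12.04^2}{16}} = \sqrt{41.0957} = 6.41$$

   Aside: Interpretation: We would estimate the average distance of the possible $\bar{x}_1 - \bar{x}_2$ values to be about
   6.4 seconds away from $\mu_1 - \mu_2$.

   Two choices for the degrees of freedom for t*:
   - Welch's rather complicated approximation (not if doing the interval by hand).
   - Or conservatively as df = smaller of ($n_1 - 1$, $n_2 - 1$) = smaller of (15, 15) = 15.
   With df = 15, t* = 2.13.

   95% CI is: $(\bar{x}_1 - \bar{x}_2) \pm t^* \, \text{s.e.}(\bar{x}_1 - \bar{x}_2) \Rightarrow (113.75 - 99.87) \pm (2.13)(6.41) \Rightarrow 13.88 \pm 13.65 \Rightarrow (0.23, 27.53)$

   Note: If Welch's approximation were used, the result would be df = 22.9 or about 23, the t* value is 2.07, and the interval
   is (0.61, 27.15). Our interval is a little wider (hence conservative).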
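
The two choices of degrees of freedom can be compared directly from the summary statistics. This is a sketch using base R only; the variable names are hypothetical, and `qt` supplies the t* multipliers in place of a printed table, so results may differ slightly in the last decimal from by-hand values.

```r
# Unpooled 95% CI for the Stroop example, using both df choices.
xbar1 <- 113.75; s1 <- 22.64; n1 <- 16   # alcohol group
xbar2 <-  99.87; s2 <- 12.04; n2 <- 16   # placebo group

v1 <- s1^2 / n1                 # estimated variance of xbar1
v2 <- s2^2 / n2                 # estimated variance of xbar2
se <- sqrt(v1 + v2)             # unpooled standard error

df_conservative <- min(n1 - 1, n2 - 1)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Interval endpoints for a given df.
ci <- function(df) (xbar1 - xbar2) + c(-1, 1) * qt(0.975, df) * se

cat("s.e. =", round(se, 2), "\n")
cat("conservative df =", df_conservative, " CI:", round(ci(df_conservative), 2), "\n")
cat("Welch df =", round(df_welch, 1), " CI:", round(ci(df_welch), 2), "\n")
```

The smaller conservative df gives a larger multiplier and therefore the wider of the two intervals.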

e. Based on the confidence interval, can we conclude that the population means for the two
   groups are different? Why or why not? Yes, because the confidence interval does not include
   the value of 0. So 0 is NOT a reasonable value for $\mu_1 - \mu_2$.

What if?
Suppose the researchers Gustafson and Kallmen were convinced (based on past results) that the
underlying population variances were equal, so they prefer that a pooled confidence interval be
constructed.

The estimate of the common population standard deviation would be:

$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(16 - 1) 22.64^2 + (16 - 1) 12.04^2}{16 + 16 - 2}} = \sqrt{328.77} = 18.13$$

The pooled standard error for the difference in the two sample means would be:

$$\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2) = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 18.13 \sqrt{\frac{1}{16} + \frac{1}{16}} = 6.41$$

which is the same as the unpooled standard error since the sample sizes were equal.
The t* multiplier would be based on df = 16 + 16 - 2 = 30, so t* = 2.04 (from Table A.2).

The 95% Pooled Confidence Interval would be:

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \, \text{pooled s.e.}(\bar{x}_1 - \bar{x}_2) \Rightarrow (13.88) \pm (2.04)(6.41) \Rightarrow 13.88 \pm 13.08 \Rightarrow (0.80, 26.96)$$

This interval still does not include 0, so the same decision would be made; however, the interval
is a bit narrower. In this example, the unpooled interval may be a bit conservative (wider) but
the evidence is still strong to state the two population means appear to differ.

Stat 250 Formula Card


Part 3: Testing about a Difference in Population Means

Testing Hypotheses about the Difference in Two Population Means

- We have two populations or groups from which independent samples are available (or
  one population for which two groups can be formed using a categorical variable).
- The response variable is quantitative and we are interested in testing hypotheses about
  the means for the two populations.

A Typical Summary of the Responses for a Two Independent Samples Problem:

Population   Sample Size   Sample Mean    Sample Standard Deviation
1            $n_1$         $\bar{x}_1$    $s_1$
2            $n_2$         $\bar{x}_2$    $s_2$

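Let $\mu_1$ be the mean response for the first population and $\mu_2$ be the mean response for the
second population.

Parameter of interest: the difference in the population means $\mu_1 - \mu_2$.

Sample estimate: the difference in the sample means $\bar{x}_1 - \bar{x}_2$.

Standard error: $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$, where $s_1$ and $s_2$ are the two sample standard deviations.

Pooled standard error: $\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2) = s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, where $s_p = \sqrt{\dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$.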

Recall there are two methods for conducting inference for the difference between two
population means for independent samples: the General (Unpooled) Case and the Pooled Case.
Both require we have independent random samples from normal populations (but if the sample
sizes are large, the assumption of normality is not so crucial). Both will result in a t test statistic,
but the standard error used in the denominator differs, as do the degrees of freedom used for
computing the p-value using a t distribution.



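Here is the summary for these two significance tests:
Possible null and alternative hypotheses:

1. $H_0$: $\mu_1 = \mu_2$ (or $\mu_1 - \mu_2 = 0$) versus $H_a$: $\mu_1 \neq \mu_2$
2. $H_0$: $\mu_1 = \mu_2$ versus $H_a$: $\mu_1 > \mu_2$
3. $H_0$: $\mu_1 = \mu_2$ versus $H_a$: $\mu_1 < \mu_2$

$$\text{Test statistic} = \frac{\text{Sample statistic} - \text{Null value}}{\text{Standard error}}$$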

Recall the guidelines to assess which version to use:
(1) First compare the sample standard deviations. If the sample standard deviations are similar,
    the assumption of common population variance is reasonable and the pooled procedure can
    be used.
(2) A graphical tool to help assess if equal population variances is reasonable is side-by-side
    boxplots. If the lengths of the boxes (the IQRs) and overall ranges between the two groups
    are very different, the pooled method may not be reasonable.
(3) Examine the results of a Levene's test for assessing if the population variances can be
    assumed equal. The null hypothesis for this test is that the population variances are equal.
    So a small p-value for Levene's test would lead to rejecting that null hypothesis and
    concluding that the pooled procedure should not be used.

Bottom line: Pool if reasonable; but if the sample standard deviations are not similar, we have
the unpooled procedure that can be used.


Try It! Effect of Beta-blockers on pulse rate

Do beta-blockers reduce the pulse rate? In a study of heart surgery, 60 subjects were randomly
divided into two groups of 30. One group received a beta-blocker while the other group was
given a placebo. The pulse rate at a particular time during the surgery was measured. The results
are given below.

Note: "randomly divided" supports the independent samples assumption.

Group               Sample size   Sample mean   Sample standard deviation
1 = beta-blockers   30            65.2          7.8
2 = placebo         30            70.3          8.4

a. State the hypotheses to assess if beta-blockers reduce pulse rate on average.

   $H_0$: $\mu_1 = \mu_2$ versus $H_a$: $\mu_1 < \mu_2$

b. Which test will you perform, the pooled or unpooled test? Explain.
   Our two sample standard deviations are similar => so pooled is reasonable.

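c. Perform the t-test. Show all steps. Are the results significant at a 5% level?
   (1) Find the pooled sample standard deviation:

$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(30 - 1) 7.8^2 + (30 - 1) 8.4^2}{30 + 30 - 2}} = \sqrt{65.7} = 8.106$$

   (2) Compute the t test statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2)} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{65.2 - 70.3}{8.106 \sqrt{\dfrac{1}{30} + \dfrac{1}{30}}} = \frac{-5.1}{2.09} = -2.44$$

   (3) Find the p-value: p-value = $P(T \le -2.44$ if $H_0$ is true$)$, using a t distribution with df = 30 + 30 - 2 = 58.
       0.006 < p-value < 0.012

   (4) We would reject $H_0$ and conclude it appears that beta-blockers help reduce
       pulse rates on average.

   The results are statistically significant at the 5% level.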
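
The same four steps can be carried out from the summary statistics in R. This is a hedged sketch with hypothetical variable names; `pt` gives the exact left-tail area where the by-hand approach brackets the p-value from a table.

```r
# Pooled two-sample t test from summary statistics (beta-blocker example).
xbar1 <- 65.2; s1 <- 7.8; n1 <- 30   # beta-blockers
xbar2 <- 70.3; s2 <- 8.4; n2 <- 30   # placebo

df <- n1 + n2 - 2
sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)   # pooled standard deviation
se <- sp * sqrt(1 / n1 + 1 / n2)                        # pooled standard error
t_stat <- (xbar1 - xbar2 - 0) / se                      # null value is 0

# Ha: mu1 < mu2, so the p-value is the area to the left of t.
p_value <- pt(t_stat, df)

cat("sp =", round(sp, 3), " s.e. =", round(se, 2),
    " t =", round(t_stat, 2), " p-value =", round(p_value, 4), "\n")
```

The direction of the tail matters: for a "greater than" alternative the call would use `lower.tail = FALSE` instead.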


Try It! Does the Drug Speed Learning?
In an animal-learning experiment, a researcher wanted to assess if a particular drug speeds
learning. One group of 5 rats (Group 1 = control) is required to learn to run a maze without use
of the drug, whereas a second independent group of 8 rats (Group 2 = experimental) is
administered the drug. The running times (time to complete the maze) for the rats in each group
were entered into R.

Summary Statistics

Group          Mean    Std. Dev   Sample Size
Control        46.80   3.42       5
Experimental   38.38   4.78       8

General Std. Error: 2.28
Pooled Std. Error: 2.47

> leveneTest(stresslevel ~ group, data=Dataset, center="mean")
Levene's Test for Homogeneity of Variance (center = "mean")
      Df F value Pr(>F)
group  1    1.09   0.32
      11

Two Sample T Results

           t      df       p-value
Unpooled   3.70   10.653   0.002
Pooled     3.41   11       0.003

Conduct the test to address the theory of the researcher (state the null and alternate
hypotheses, report the test statistic, p-value, and state your decision and conclusion at the
5% level of significance).

$H_0$: $\mu_1 = \mu_2$ versus $H_a$: $\mu_1 > \mu_2$

Test statistic: t = 3.41          p-value: 0.003

Decision: (circle one)   Fail to reject $H_0$   Reject $H_0$
Since the p-value of 0.003 is less than 0.05, we reject $H_0$.

Thus the average time to complete the maze is significantly lower (faster)
for rats taking the drug.
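
The values in the summary tables can be reproduced from the group means, standard deviations, and sample sizes. The sketch below uses hypothetical variable names and base R only; because it starts from rounded summary statistics, its results may differ slightly in the last digit from output computed on the raw data.

```r
# Reproduce the unpooled and pooled t results for the maze example
# from the rounded summary statistics.
xbar1 <- 46.80; s1 <- 3.42; n1 <- 5   # control
xbar2 <- 38.38; s2 <- 4.78; n2 <- 8   # experimental

v1 <- s1^2 / n1
v2 <- s2^2 / n2

# General (unpooled) version with the Welch df approximation
se_unpooled <- sqrt(v1 + v2)
df_welch    <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
t_unpooled  <- (xbar1 - xbar2) / se_unpooled

# Pooled version
df_pooled <- n1 + n2 - 2
sp        <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df_pooled)
se_pooled <- sp * sqrt(1 / n1 + 1 / n2)
t_pooled  <- (xbar1 - xbar2) / se_pooled

# Ha: mu1 > mu2, so each p-value is the area to the right of t.
p_unpooled <- pt(t_unpooled, df_welch, lower.tail = FALSE)
p_pooled   <- pt(t_pooled, df_pooled, lower.tail = FALSE)

cat("unpooled: t =", round(t_unpooled, 2), " df =", round(df_welch, 2),
    " p =", round(p_unpooled, 3), "\n")
cat("pooled:   t =", round(t_pooled, 2), " df =", df_pooled,
    " p =", round(p_pooled, 3), "\n")
```

Both versions lead to the same decision here, which is typical when the evidence is strong.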


Try It! Eat that Dark Chocolate
An Ann Arbor News article entitled "Dark Chocolate may help blood flow" reported the results of
a study in which researchers fed a small 1.6-ounce bar of dark chocolate to each of 22 volunteers
daily for two weeks. Half of the subjects were randomly selected and assigned to receive bars
containing dark chocolate's typically high levels of flavonoids, and the other half received placebo
bars with just trace amounts of flavonoids. The ability of the brachial artery to dilate significantly
improved for those in the high-flavonoid group compared to those in the placebo group.

Let $\mu_1$ represent the population average improvement in blood flow for those on the high-
flavonoid diet and $\mu_2$ represent the population average improvement in blood flow for those on
the placebo diet. The researchers tested that the high-flavonoid group would have a higher
average improvement in blood flow.

a. State the null and alternate hypotheses.

   $H_0$: $\mu_1 = \mu_2$ versus $H_a$: $\mu_1 > \mu_2$

b. The researchers conducted a pooled two-sample t-test. The two assumptions about the data
   are that the two samples are independent random samples.
   i. Clearly state one of the remaining two assumptions regarding the populations that are
      required for this test to be valid.
      Each population has a normal distribution. OR
      The two population variances are equal.
   ii. Explain how you would use the data to assess if the above assumption in part (i) is
      reasonable. (Be specific.)

      Make a QQ plot for each set of data and see if the points fall approximately on a
      straight line with a positive slope. Checking whether histograms of each set of data look
      approximately bell-shaped is OK also. OR

      Compare the two sample standard deviations or the IQRs (via side-by-side boxplots)
      to see if they are similar. Could use Levene's test to more formally assess if equal
      population variances is reasonable.
c. A significance level of 0.05 was used. Based on the statements reported above, what can you
   say about the p-value? Clearly circle your answer:

   p-value > 0.05      p-value ≤ 0.05      can't tell

d. The researchers also found that concentrations of the cocoa flavonoid epicatechin soared in
   blood samples taken from the group that received the high-flavonoid chocolate, rising from
   a baseline of 25.6 nmol/L to 204.4 nmol/L. In the group that received the low-flavonoid
   chocolate, concentrations of epicatechin decreased slightly, from a baseline of 17.9 nmol/L
   to 17.5 nmol/L. The average improvement for the high-flavonoid group of 204.4 - 25.6 =
   178.8 nmol/L is a (circle all correct answers):

   parameter      statistic      sample mean      population mean      sampling distribution


Name That Scenario
Now that we have covered a number of inference techniques, let's think about some questions
to ask to help identify the appropriate procedure based on the research question(s) of interest.

1. Is the response variable measured quantitative or categorical?

   Categorical: Proportions, percentages
   - $p$: One population proportion
   - $p_1 - p_2$: Difference between two population proportions

   Quantitative: Means
   - $\mu$: One population mean
   - $\mu_d$: Paired difference population mean
   - $\mu_1 - \mu_2$: Difference between two population means

2. How many samples?
   If two sets of measurements, are they paired or independent?

3. What is the main purpose?
   - To estimate a numerical value of a parameter? => confidence interval
   - To make a "maybe not" or "maybe yes" type of conclusion
     about a specific hypothesized value? => hypothesis test

Additional Notes
A place to jot down questions you may have and ask during office hours, take a few extra notes, write
out an extra problem or summary completed in lecture, or create your own summary about these concepts.


Stats 250 Formula Card Summary
