N580 exams with correct answers

statistics vs. parameters - Correct Answer - analysis on a sample vs. analysis on the entire population

overview to data analysis - Correct Answer - quantitative: descriptive and inferential

descriptive: simply describe a characteristic of a population/sample or a phenomenon

inferential: hope to generalize from a sample to a population (based on probability)--we use parametric
and non-parametric statistics

parametric tests and non-parametric tests - Correct Answer - every statistical test has assumptions that
must be met

parametric tests have more stringent assumptions--two of the most critical assumptions: normal
distribution of the data and level of measurement (must be interval-like in nature)

non-parametric tests don't have such limitations

we like parametric tests b/c they are more powerful and more likely to find a difference if there is one

the test chosen will depend on your data

analysis - Correct Answer - clearly the problem is the driving force--the sophistication of analysis will
never compensate for an insignificant problem

remember that the research question or hypothesis always drives the analysis

look at the hypothesis or question--should be able to begin to think about the type of analysis that
would be appropriate

when you are beginning to think about statistical tests... - Correct Answer - you must look at what drives
the tests-->the research question and the hypothesis--what are the variables and how are they
measured? what is the level of measurement?

if the question focuses on describing some phenomenon - Correct Answer - N & % (if data are nominal or
categorical) or descriptive statistics (if measures are interval-like in nature--mean, standard deviation,
range)

if the research question or hypothesis is interested in a relationship between two variables - Correct
Answer - correlation

Pearson's r (parametric test) or Spearman Rho/Kendall's Tau (non-parametric test)

if the research question or hypothesis is interested in a difference between two independent groups -
Correct Answer - the independent t test (parametric) or the Mann Whitney U (if data weren't normally
distributed)

if the research question or hypothesis is interested in a difference between two or more independent
groups - Correct Answer - ANOVA (parametric) or the Kruskal Wallis (if data aren't normally distributed
or if measure is ordinal)

if the research question or hypothesis is interested in differences in two groups that are dependent
(repeated measures) - Correct Answer - the paired t test (parametric) or Wilcoxon (non-parametric)

if the research question or hypothesis is interested in the difference in two or more groups that are
dependent - Correct Answer - RANOVA (parametric) or Friedman (non-parametric)

how can you choose? - Correct Answer - what is the research question or hypothesis asking?

are groups independent or dependent?

what is the level of measurement of the variables?

are the assumptions of the statistical test met, particularly that of normal distribution?
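
The cards above pair each type of research question with a parametric and a non-parametric test. A minimal sketch of that pairing as a lookup table follows; the key names (such as "two_independent") and the function `choose_test` are invented for this illustration, and treating "interval-like and normally distributed" as the only gate for a parametric test is a simplification of the assumptions listed above.

```python
# Lookup table built from the cards above: (question type, parametric?) -> test.
# The key names are hypothetical labels made up for this sketch.
TEST_TABLE = {
    ("relationship", True): "Pearson's r",
    ("relationship", False): "Spearman Rho / Kendall's Tau",
    ("two_independent", True): "independent t test",
    ("two_independent", False): "Mann Whitney U",
    ("two_or_more_independent", True): "ANOVA",
    ("two_or_more_independent", False): "Kruskal Wallis",
    ("two_dependent", True): "paired t test",
    ("two_dependent", False): "Wilcoxon",
    ("two_or_more_dependent", True): "RANOVA",
    ("two_or_more_dependent", False): "Friedman",
}


def choose_test(question, interval_like, normally_distributed):
    """Return the test the notes pair with this question and kind of data."""
    # Simplifying assumption: both conditions must hold for a parametric test.
    parametric = interval_like and normally_distributed
    return TEST_TABLE[(question, parametric)]


print(choose_test("two_independent", True, True))   # independent t test
print(choose_test("two_dependent", False, True))    # Wilcoxon
```

The table only encodes what the notes state; the final choice still depends on checking the assumptions of the specific test.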

knowledge of statistics is necessary to all who read or conduct research - Correct Answer - research:
making observations or measurements on people, things, or events to answer a research question

statistics: a set of procedures for describing those measurements--how do we quantify and report these
measurements?

data - Correct Answer - the raw material of research

variable - Correct Answer - something that varies or takes on different values

we are always interested in variability and explaining variation

variables are also classified as... - Correct Answer - discrete: finite number of values (almost like you can
count)--obtained by counting

continuous: infinite number of values between any two points--obtained by measuring

statistical methods are sometimes described by the number of variables in the analysis - Correct Answer
- univariate: average cholesterol

bivariate: average cholesterol (exercise group and sedentary group)

multivariate: average cholesterol (exercise: yes/no, diet: good/bad, gender: M/F)

measurement - Correct Answer - key to capturing variables

assigning numbers to objects, events, etc. according to the rules

some things are more difficult to measure than others--consider temperature vs. self-efficacy

level of measurement - Correct Answer - influences what statistical tests can be chosen

4 basic levels:

nominal

ordinal

interval

ratio

want to know about it when collecting data, when analyzing data (level of measurement will influence
statistical test you can select)

always measure at the highest level realistically possible: more powerful test, greater flexibility, more
info

nominal - Correct Answer - category: categorical variable (M/F/transgender, race, etc.--can go in one
but not the other)

can't treat mathematically

if these are the dependent variable: they are reported by n and %, mode

also used in: grouping (study has 3 independent interventions [groups] compared on a dependent
variable), logistic regression (results in odds ratios--dependent variable is categorical--given a certain
treatment, what are the odds the patient is dead or alive)

ex: preferred mode of transportation, blood type, gender

non-parametric

how are numbers assigned to nominal level data for statistical analysis? - Correct Answer - blood type

1=O+

2=A+

3=B+

4=AB+

and so forth

think about gender, race, marital status--how would you assign numbers to levels of these variables?

we assign numbers to enter them in a computer

recommend to always collect data at the highest level

ordinal - Correct Answer - uses numbers to designate ordering of an attribute--relative standing

describes an amount of some attribute but there is not an equivalent distance between each number (so
really is ranking)

think about track finishes--not an equivalent amount between 1st place and 2nd place, 2nd place and
3rd place, and so on

examples of ordinal data - Correct Answer - socioeconomic status

academic rank

psychological inventories (mini-mental status exam, QOL scales, etc.)

treatment of ordinal data varies - Correct Answer - parametric or non-parametric

general rule is non-parametric tests should be used

if there are 11 or so measures and if the scores are well-distributed, then parametric tests can be used

interval - Correct Answer - amount of an attribute, equal distance between each number in terms of
amount (no absolute 0 value)

ex: Fahrenheit scale

can be used in parametric tests so dependent variable is interval-like in nature and there is a normal
distribution

ratio - Correct Answer - same as interval but has an absolute 0

morphine in mgs, length in inches, Kelvin temp

can make comparative statements (twenty pounds is twice as heavy as ten pounds)

don't have to distinguish between ratio and interval data because statistically they are treated the same

identifying the characteristics of data - Correct Answer - can the data be put in order? no=nominal

do the data have units, including numbers of things? no=ordinal

yes=metric

do the data come from measuring or counting things? if measuring then continuous data, if counting
then discrete data

quantitative research - Correct Answer - descriptive or inferential

inferential can be parametric or non-parametric

none of this applies for qualitative studies

what type of research questions are answered by descriptive statistics - Correct Answer - try to describe
an event/phenomenon that we don't know a lot about

what are the most frequently perceived benefits of and barriers to medication and dietary compliance
among two samples of patients with heart failure?

what is the infant mortality rate among US immigrants from Iraq and Afghanistan?

In addition, descriptive statistics are always used in describing the research sample and should always be
presented in regard to outcome variables

when you read results you always want to read... - Correct Answer - what were the values in the
outcome variables

statistically significant doesn't mean clinically significant/relevant

descriptive statistics - Correct Answer - describe and summarize

answer research questions, describe samples, and values of outcome variables

not generalizable

how are descriptive statistics reported? - Correct Answer - nominal data - N, %, and mode

measures of central tendency: mean, median, and mode

measures of variability: such as range and standard deviation

measures of distribution: skew and kurtosis

nominal data - Correct Answer - categorical


marital status

gender

race

N&%

can also display findings graphically (such as bar, pie chart)

bar graphs as opposed to histograms: spaces between them

making sense of raw data or how do we first look at data? - Correct Answer - frequency distributions

frequency - count of cases

percent - % of time a given value occurs

cumulative percent - percentage for given score combined with all percentages that preceded it

as you collect data, you want to run frequency distribution to get a sense of what data look like
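
A minimal sketch of the frequency, percent, and cumulative percent columns described above. The pain scores are made-up illustration data and `frequency_table` is a hypothetical helper name, not something from the course material.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100.0 * counts[value] / total
        cumulative += percent  # this percent plus all that preceded it
        rows.append((value, counts[value], percent, cumulative))
    return rows


# Hypothetical pain scores (0-10 scale) from ten patients
pain_scores = [2, 3, 3, 4, 4, 4, 5, 5, 7, 9]
table = frequency_table(pain_scores)

print("value freq percent cum%")
for value, freq, pct, cum_pct in table:
    print(f"{value:>5} {freq:>4} {pct:>7.1f} {cum_pct:>5.1f}")
```

Running a table like this early makes out-of-range or missing values easy to spot before any further analysis.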

frequency distributions - Correct Answer - can be grouped into class intervals/sets (useful for large
samples)

displayed graphically: bar graph (nominal data), pie chart, histograms (interval/ratio data), frequency
polygon

temperature=interval data--can use histogram

box and whisker plot could also be used (dark line in middle of block=50th percentile, bottom of
box=25th percentile, top=75th percentile--will show outliers)

use of frequency distribution - Correct Answer - get a feel for data

helps in cleaning the data

identify % of missing values

interval or ratio level of measurement - Correct Answer - measures of central tendency

measures of dispersion

measures of distribution

sometimes used for ordinal data (controversial)--if there are sufficient number of measures

measures of central tendency - Correct Answer - mode=most frequent value in a distribution--can also be
used with nominal data

median=middle value--score wherein 50% of scores are above and 50% are below--can be used when
data are really skewed

mean=avg value--add all values in a distribution and divide by total number of values--used with interval
or ratio data

mean - Correct Answer - works best for symmetrical distributions

extreme values can distort the mean

can be manipulated as it is algebraic

appropriate if distribution is normal

inappropriate if distribution is not normal

not practical if data are really skewed

median - Correct Answer - simply the mid-point

50th percentile

odd # of scores - take middle value

even # of scores - add the middle 2 and divide by 2

not sensitive to extreme scores - useful for skewed data

position of the median in the ordered scores = (N+1)/2

mode - Correct Answer - most frequent score

even if you have nominal data you can report the mode

bimodal - two modes


primary and secondary modes

don't usually see it reported in research studies--if it is, it's reported to make a point
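
To see how an extreme value pulls the mean but not the median, here is a small sketch using Python's standard `statistics` module. The lengths of stay are invented for illustration.

```python
import statistics

# Hypothetical lengths of stay in days; the last value is an extreme score
stays = [2, 3, 3, 4, 5, 6, 40]

mean_stay = statistics.mean(stays)      # sum of values / number of values
median_stay = statistics.median(stays)  # middle score of the ordered list
mode_stay = statistics.mode(stays)      # most frequent score

# Position of the median in the ordered list: (N + 1) / 2
median_position = (len(stays) + 1) / 2

print("mean:", mean_stay)
print("median:", median_stay, "at position", median_position)
print("mode:", mode_stay)
```

With the 40-day stay included, the mean sits well above most of the scores while the median stays in the middle of the data, which is why the median is preferred for skewed data.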

perfect normal distribution - Correct Answer - mean, median, mode are the same

see mean reported most often in research studies

measures of dispersion, variability, or scatter - Correct Answer - nothing in statistics that is more
important than variability

standard deviation, variance (won't see it reported in literature), interpercentile/interquartile measures
(IQR--belongs with median), range (difference between highest and lowest)

standard deviation - Correct Answer - most commonly reported measure of variability - avg. amount by
which scores vary around the mean

reported in same unit as the variable

like the mean it is sensitive to extreme values and like the mean it is algebraic

normal distribution - Correct Answer - 68% of scores will fall within 1 SD of mean

95% fall within 2 SD of mean

99.7% will fall within 3 SD of mean
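
The three percentages above can be checked against the standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library; `share_within` is a hypothetical helper name. Note that the 68/95/99.7 figures are rounded, and that exactly 95% corresponds to about 1.96 SD rather than 2.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {100 * share_within(k):.1f}%")
```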

variance - Correct Answer - not usually reported in descriptive statistics

avg. of the squared deviations around the mean

the squared SD

look at the number and it doesn't mean anything to you

range - Correct Answer - difference btwn minimum and maximum value

report as low and high score


interquartile range - Correct Answer - used if you have skewed data or data you don't really want to
report

based on percentages

75th percentile minus 25th percentile so gives you middle 50%

Q3-Q1

can adjust this how you want

won't see this very often

use the interquartile range if range isn't representative of data
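
A short sketch of the measures of dispersion above, using the standard `statistics` module. The blood pressure values are invented. Two assumptions to flag: `stdev` and `variance` are the sample versions (dividing by n - 1), and `statistics.quantiles` uses one of several quartile rules, so other software may report slightly different Q1 and Q3.

```python
import statistics

# Hypothetical systolic blood pressures (mmHg); 160 is an extreme score
bp = [110, 118, 120, 122, 125, 130, 134, 160]

sd = statistics.stdev(bp)           # sample SD, same unit as the variable
variance = statistics.variance(bp)  # the squared SD
low, high = min(bp), max(bp)        # report range as low and high score
value_range = high - low

# Quartile cut points: Q1 (25th), Q2 (50th = median), Q3 (75th percentile)
q1, q2, q3 = statistics.quantiles(bp, n=4)
iqr = q3 - q1                       # spread of the middle 50% of scores

print(f"SD={sd:.2f} variance={variance:.2f}")
print(f"range={low}-{high} ({value_range})")
print(f"Q1={q1} Q3={q3} IQR={iqr}")
```

The IQR ignores the single extreme value of 160, while the range and SD are both stretched by it.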

symmetry of distribution - Correct Answer - shape: skew, kurtosis

perfect normal distribution: skew=0, kurtosis=0

skew=more important than kurtosis

authors won't report if data aren't skewed but will report if they are

skew: named in direction of tail (positive skew, negative skew)--mean and standard deviation won't be
appropriate

kurtosis: refers to height of curve/distribution--positive=too peaked=leptokurtic--negative=too
flat=platykurtic

inferential statistics - Correct Answer - techniques used to generalize from characteristics of a small group
to a larger group that is unmeasured, i.e. the researcher via inferential statistics is able to infer the
characteristics of the larger group from the measured characteristics of the smaller group

allows researchers to draw conclusions beyond their sample to the larger population

don't allow you to prove anything--can only support findings

statistical inferences are not subjective

there is always the possibility of error


population - Correct Answer - all the possible members of a group defined by the researcher

sample - Correct Answer - any number of cases less than the total number of cases in the population
from which it was drawn

to generalize inferential statistics... - Correct Answer - select a sample from the population

derive data from the sample

generalize from the sample data to the population

H0 - Correct Answer - null hypothesis

there are 2 mutually exclusive outcomes

statistical hypothesis

there is no difference, no relationship, etc.--if there is a difference, it's a random error

inferential statistics ask: if the null hypothesis is true, how probable are results like those observed?

Ha - Correct Answer - alternative hypothesis

the experimental treatment is more effective

laws of probability - Correct Answer - probability of an event = 0-100%

p(event)=# of ways the specific event can occur/total # of possible events

deals with relative frequency of an outcome

alpha - Correct Answer - set at start of study

statistical inference - Correct Answer - a researcher uses probability tables (distributions developed for
particular test statistics) to determine how likely it is an outcome or difference occurred by chance alone
(error)

how are probability tables constructed? - Correct Answer - normally distributed variables

sampling distributions

sampling error

sample statistics mean - Correct Answer - often different from the population mean due to sampling
error

amount of sampling error=the difference between the population mean and the sample mean

unfortunately, we generally don't know the population mean--if we did, why would we bother with a
sample

inferential statistics allows us to estimate sampling error

central limit theorem - Correct Answer - means of repeated samples from a population will form an approximately normal curve when the samples are large enough

in the central limit theorem: the mean of sampling distribution equals... - Correct Answer - the
population mean so the average sampling error would always be zero

because sampling distributions usually approximate a normal curve, probability statements about the
likelihood of finding any given sample mean can be made--first we need to know the SD of the sampling
distribution of the means (SEM)

also, SD of a sampling distribution=SEM or standard error of the mean (standard=average amount of error for
all possible samples--error=various sample means contain some error in estimating the population
mean)
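
The sampling distribution described above can be simulated. This is only a sketch under stated assumptions: the population is an exponential distribution with mean 1 and SD 1 (chosen because it is clearly skewed), and the sample size of 30, the 2000 repetitions, and the seed are arbitrary choices.

```python
import random
import statistics

random.seed(580)  # arbitrary seed so the run is repeatable

SAMPLE_SIZE = 30
NUM_SAMPLES = 2000
POPULATION_MEAN = 1.0  # mean of the assumed exponential population
POPULATION_SD = 1.0    # SD of the assumed exponential population

# Draw many samples and keep only the mean of each one
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.mean(sample))

# Mean of the sampling distribution should sit near the population mean
mean_of_means = statistics.mean(sample_means)

# SD of the sampling distribution is the SEM, roughly SD / sqrt(n)
observed_sem = statistics.stdev(sample_means)
expected_sem = POPULATION_SD / SAMPLE_SIZE ** 0.5

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"observed SEM: {observed_sem:.3f}  expected SEM: {expected_sem:.3f}")
```

Individual sample means miss the population mean (sampling error), but the misses average out close to zero across all samples.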

SEM - Correct Answer - quantifies how precisely you know the true mean of the population--it takes
into account both the value of the SD and the sample size

both SD and SEM are in same units as units of data

SEM gets smaller as your samples get larger

this makes sense, because the mean of a large sample is likely to be closer to the true population mean
than is the mean of a small sample

with a huge sample you'll know the value of the mean with a lot of precision even if the data are very
scattered

using SEM: point estimate vs. interval estimate - Correct Answer - point estimate vs. interval estimate

point estimate=sample mean

interval estimate=confidence interval

95% confidence interval - Correct Answer - sample mean + or - (1.96 x SEM)

99% confidence interval - Correct Answer - sample mean + or - (2.58 x SEM)
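
A minimal sketch of the two interval formulas above. It assumes SEM = SD / sqrt(n), which matches the description of the SEM as depending on both the SD and the sample size. The glucose values are invented, and `confidence_interval` is a hypothetical helper. With a sample this small, a t value would normally replace 1.96 or 2.58; the z values are kept here only to mirror the cards.

```python
import math
import statistics


def confidence_interval(scores, z=1.96):
    """Return (mean, SEM, lower, upper) using mean +/- (z x SEM)."""
    mean = statistics.mean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, sem, mean - z * sem, mean + z * sem


# Hypothetical fasting glucose values (mg/dL)
glucose = [88, 92, 95, 97, 99, 101, 104, 106, 110, 118]

mean, sem, lo95, hi95 = confidence_interval(glucose, z=1.96)
_, _, lo99, hi99 = confidence_interval(glucose, z=2.58)

print(f"point estimate (mean): {mean}")
print(f"SEM: {sem:.2f}")
print(f"95% CI: {lo95:.1f} to {hi95:.1f}")
print(f"99% CI: {lo99:.1f} to {hi99:.1f}")
```

The 99% interval is wider than the 95% interval: more confidence costs precision.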

confidence interval - Correct Answer - range of values around the sample mean that is expected, at the stated level of confidence (e.g., 95%), to contain the population mean

P values and alpha levels - Correct Answer - alpha value is set by researcher prior to data collection

p value: reported by statistical test after the data are collected and analyzed

alpha is usually set at 0.05; in pilot studies the researcher may set the alpha at 0.10 and in
studies where treatment has serious side effects may set the value at 0.01

effect size - Correct Answer - measure that describes the magnitude of the difference between two
groups

in general, effect sizes tell us how many standard deviations difference there is between the means of
the intervention (treatment) and comparison conditions; for example, an effect size of 0.25 indicates
that the treatment group outperformed the comparison group by a quarter of a standard deviation

effect size (Cohen's)

typically, you see this reported as Cohen's d or d

interpretation depends on the research question

meaning of effect size varies by context and statistical test, but standard interpretation offered by Cohen
is:

0.8=large (8/10 of a SD unit)

0.5=moderate (1/2 of a SD)

0.2=small (1/5 of a SD)
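
A sketch of Cohen's d as "difference between means in SD units", with the benchmarks above applied to the result. Using the pooled SD of the two groups as the denominator is an assumption (it is a common choice, but the notes do not say which SD is meant). The scores and the function names are invented for illustration.

```python
import math
import statistics


def cohens_d(treatment, comparison):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(comparison)
    var1 = statistics.variance(treatment)
    var2 = statistics.variance(comparison)
    # Assumption: pooled SD, weighting each group's variance by its df
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(treatment) - statistics.mean(comparison)) / pooled_sd


def label_effect(d):
    """Apply Cohen's benchmarks (0.2, 0.5, 0.8) to the size of d."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "below small"


# Hypothetical outcome scores for two groups
treatment_scores = [12, 14, 15, 16, 18]
comparison_scores = [10, 12, 13, 14, 16]

d = cohens_d(treatment_scores, comparison_scores)
print(f"d = {d:.2f} ({label_effect(d)})")
```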

steps in hypothesis testing - Correct Answer - 1. state the hypothesis (usually Ha not H0)

2. select the alpha or level of significance (most commonly set at 0.05 or 0.01)

3. selecting the appropriate test and computing the calculated value of the test statistic (assumptions of
the statistical test are met)

4. compare the calculated value to the critical value (critical value=value that must be obtained r/t alpha
value selected)--a statistical table with sampling distributions for the statistical test you are using will
give you critical values--and the statistical run will give you the exact statistic value and the resulting p
value

5. reject or fail to reject the null hypothesis

one tail vs. two tail test - Correct Answer - one tail=directional hypothesis

two tail=non-directional hypothesis

possible errors in hypothesis testing - Correct Answer - researcher finds H0 is true and H0 is really
true: correct

researcher finds H0 is false and H0 is really true: Type I error

researcher finds H0 is true and H0 is really false: Type II error

researcher finds H0 is false and H0 is really false: correct

controlling risk of error - Correct Answer - Type I: set a stricter (smaller) alpha--consider alpha=0.10, 0.05,
0.01, 0.001

Type II: beta=the probability of committing a Type II error--can be decreased by increasing the power
(1-beta) of the statistical test--a power of 80% is generally acceptable--easiest way to increase
power=increase sample size

correlation - Correct Answer - relationship that can be measured mathematically

correlation coefficient represents the degree of association between >= 2 variables

partial or semi-partial correlation=3 or more variables

zero order correlation=relationship between 2 variables

what is the strength and direction of relationship? - Correct Answer - descriptive statistics describe
sample (many descriptive research questions don't have hypotheses)

inferential statistics test hypotheses about correlations that exist in larger population

types of data required for correlational analyses - Correct Answer - data are numeric

interval/ratio scales (equal distance between points)

ordinal scales (in most cases--distance between categories is unknown)--ex: military rank,
socioeconomic status, health status

nominal scales (categorical variables) may be recoded, but other factors must be considered when
deciding whether a correlation coefficient is appropriate

positive and negative correlation - Correct Answer - positive: ex: as number of hours of study
increases, final exam grade increases

negative: ex: as number of hours exercised decreases, there is an increase in LDL level

scatter plots do not indicate... - Correct Answer - strength of results

non-linear relationships - Correct Answer - no association/correlation or curvilinear (changes direction--
anxiety and performance have a positive relationship up to a certain point, after which it becomes
negative)

correlation assumptions - Correct Answer - 1. relationship is linear: beware that important non-linear
relationships can be overlooked with simple correlation analysis

2. data are normally distributed

3. variables have roughly equal range of variability (homoscedasticity)

4. sample must be representative of population of interest to which inference is made

correlation as a parametric test - Correct Answer - 1. assumes data are normally distributed

2. continuous (ratio/interval) data are used

3. involves a population characteristic

Pearson's Product Moment Correlation Coefficient - Correct Answer - r

most commonly used parametric test for correlation

non-parametric equivalent - Correct Answer - can be used when correlation assumptions are not met

1. when data are not normally distributed

2. when using ordinal data (e.g., tumor stage)

Kendall's Tau: assign ranks to ordinal levels and then compare relationship between variables

interpreting r - Correct Answer - direction is expressed as positive or negative

values of r ranging from -1.00 to +1.00 with strongest relationships at either end

an r value of 0=no correlation

Cohen defines effects as: small (r=0.10), moderate (r=0.30), and large (r>=0.50)

R^2 represents the coefficient of determination and is a measure of the amount of shared variance
between the two variables--high R^2 signifies much of the variance in the dependent variable is
explained by the independent variable
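
A sketch of Pearson's r computed from its definition (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations), followed by r squared as the shared variance. The study-hours data are invented to echo the positive correlation example above, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation coefficient for paired scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: hours of study and final exam grade
hours_studied = [1, 2, 3, 4, 5, 6]
exam_grade = [62, 65, 71, 74, 80, 85]

r = pearson_r(hours_studied, exam_grade)
r_squared = r ** 2  # coefficient of determination (shared variance)

print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

The sign of r gives the direction and its distance from 0 gives the strength; flipping the sign of one variable flips the sign of r without changing its size.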

regression - Correct Answer - statistical technique that makes use of correlation between variables and
application of a straight line to develop a prediction equation

used to predict the score of a dependent variable using score(s) from the independent variable(s)

research questions r/t regression - Correct Answer - does an independent variable(s) X significantly
predict the scores on a dependent variable Y?

if so, what is the predicted score for a dependent variable based on the score(s) on an independent
variable(s)?

correlation vs. regression - Correct Answer - correlation: used to examine association--variables=most
often 2--statistics=r (+ or -)--tells us direction and strength

regression: prediction--variables=2 or more--statistics=b (+ or -)--tells us direction, strength, and
magnitude

simple linear regression - Correct Answer - uses one independent variable and one dependent variable

uses score on one variable to predict score on the other

multiple linear regression - Correct Answer - more than one IV and one DV

uses the score of one variable to predict the score on the other AND controls for other factors that
might influence or confound the relationship being studied

logistic regression - Correct Answer - uses a binary DV (0, 1) with 1 or more IVs

regression assumptions - Correct Answer - 1. the relationship is linear--beware that important non-
linear relationships can be overlooked with simple regression analysis

2. data are normally distributed

3. variables have roughly equal range of variability (homoscedasticity)

4. sample must be representative of population of interest

type of data for regression - Correct Answer - 1. interval including ordinal

2. nominal data ok if special coding is used (such as numbered)


3. measures of IV and DV for each subject

simple linear regression straight line formula - Correct Answer - y=a+bx

y=DV

x=IV

a=intercept

b=gradient or slope
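
A sketch of how the intercept a and slope b in y=a+bx can be estimated from data. It assumes ordinary least squares, the usual fitting method for simple linear regression, which the notes do not name explicitly. The dose and relief values are invented, and `fit_line` is a hypothetical helper name.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    squares = sum((xi - mean_x) ** 2 for xi in x)
    b = cross / squares        # slope: change in y per unit change in x
    a = mean_y - b * mean_x    # intercept: predicted y when x = 0
    return a, b


# Hypothetical data: dose (x, the IV) and pain relief score (y, the DV)
dose = [1, 2, 3, 4, 5]
relief = [2.1, 3.9, 6.2, 7.8, 10.0]

a, b = fit_line(dose, relief)
predicted_at_6 = a + b * 6     # use the line to predict a new score

print(f"a = {a:.2f}, b = {b:.2f}, predicted y at x=6: {predicted_at_6:.2f}")
```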

simple linear regression model - Correct Answer - regression quantifies a linear relationship that
minimizes variation in predicted vs. observed values around the regression line: y=a+bx+e, with y'=a+bx

y'=predicted value of DV

a=intercept

b=slope (beta coefficient)

x=IV (predictor)

e=error (how much the observed y differs from y')

multiple linear regression model - Correct Answer - possible when there is a measurable multiple
correlation between a group of predictor variables and one DV

prediction equation: y'=a+b1x1+b2x2+...+bkxk

y'=predicted value of DV

a=intercept

b=beta weight (or coefficient)

x=value of IV (predictor)
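
A sketch of using the multiple regression prediction equation once the intercept and beta weights are known. The coefficients below are invented for illustration (they are not estimates from any real study), and `predict` is a hypothetical helper name.

```python
def predict(intercept, betas, predictors):
    """Predicted DV: intercept plus the sum of each beta weight times its IV."""
    if len(betas) != len(predictors):
        raise ValueError("need one beta weight per predictor")
    return intercept + sum(b * x for b, x in zip(betas, predictors))


# Hypothetical model: systolic BP predicted from age (years) and BMI
a = 90.0
b = [0.5, 1.0]

# 90 + 0.5*60 + 1.0*25
y_pred = predict(a, b, [60, 25])
print(y_pred)
```

Each beta weight is read with the other predictors held constant, which is how the model controls for the other factors.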

significance testing - Correct Answer - simple linear regression: correlation is tested (t test statistic)--
R^2 represents meaningfulness (coefficient of determination)

multiple linear regression: multiple correlations tested (F test statistic)--individual beta weights tested (t
test statistic)--testing the beta weights tells us if the IV is contributing significantly to the variance
accounted for in the DV--R^2 significance is tested (F test statistic, ANOVA)

what does a significant t or F statistic tell us? - Correct Answer - IV (X) significantly predicts value of DV
(Y')

a good fit

what if the t or F statistic is not significant? - Correct Answer - X is not associated with Y' or there is a
non-linear relationship (i.e. model is not a good fit for data)

what does a significant beta coefficient tell us? - Correct Answer - amount Y' increases for each unit
change in X

what does R^2 tell us? - Correct Answer - proportion of total variation in Y that is attributable to model terms

parametric vs. non-parametric tests - Correct Answer - parametric: greater power, Type II error is less
likely--data are normally distributed--usually ratio/interval data

non-parametric: lesser power, Type II error made more likely--distribution-free--nominal, ordinal, or
skewed data

Chi squared test - Correct Answer - non-parametric test

used to answer research questions and/or hypotheses

one of the most frequently used non-parametric tests, doesn't have a parametric equivalent

best test for comparing two groups - Correct Answer - t test (parametric)

data are interval or ratio


if data are nominal or ordinal and you are comparing 2 groups - Correct Answer - Chi square (nominal)
or Mann Whitney U (ordinal)

(non-parametric)

best test for comparing 3 or more groups - Correct Answer - ANOVA (parametric)

if data are nominal or ordinal and you are comparing 3 or more groups - Correct Answer - Chi square
(nominal) or Kruskal-Wallis H (ordinal)

(non-parametric)

best test for dependent groups - Correct Answer - paired t tests (parametric)

if data are nominal or ordinal and you are comparing dependent groups - Correct Answer - McNemar
(nominal) or Wilcoxon (ordinal)

(non-parametric)

the goodness of fit test - Correct Answer - a Chi square test

single sample

how well observed values of a single categorical variable match with values expected of a theoretical
model

used as a tool to assess for potential bias in a study sample (such as in table 1 about demographic
characteristics)

Chi squared test of independence - Correct Answer - Pearson's Chi square

used to determine if levels of two categorical variables are independent of one another

McNemar test - Correct Answer - repeated measures


uses an adaptation of Chi squared formula to test direction of change in the same subjects by
administering repeated measures at the nominal level

Chi squared assumptions - Correct Answer - frequency data (representing a count of the number of
study participants that meet a certain condition--expected frequency count of at least 5 participants in
each cell is used)

adequate sample size

measures are independent of each other (mutually exclusive categories--an individual can only be
counted once)

theoretical basis for variable categorization (to ensure analysis will be meaningful)

research question for Chi squared - Correct Answer - does the expected frequency differ significantly
from the actual or observed frequency?

inferential statistics test the null hypotheses that variables are independent of each other

ex: are men more likely than women to quit smoking following their participation in a smoking cessation
intervention? - Correct Answer - variables: gender, smoking status

null hypothesis: gender and smoking status are independent of each other

alternate hypothesis: they are not independent of each other

contingency table - Correct Answer - contains frequency data on the variables that we are studying

interpreting chi squared - Correct Answer - calculate degrees of freedom: (rows-1)*(columns-1)

thus, in a 2x2 table the df=1

look for the p value in the article or computer printout to see if P<0.05
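
A sketch of the Chi squared test of independence on a 2x2 contingency table, using the smoking cessation example above with made-up counts. Expected counts come from (row total x column total) / grand total, and the statistic is the sum of (observed - expected)^2 / expected. No continuity correction is applied, which is an assumption; some software applies one for 2x2 tables. The critical value 3.841 is the standard table value for df=1 at alpha=0.05.

```python
def chi_square(table):
    """Pearson chi-square statistic and df for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)  # (rows-1)*(columns-1)
    return statistic, df


# Hypothetical counts: rows = men, women; columns = quit, still smoking
observed = [[30, 20],
            [20, 30]]

stat, df = chi_square(observed)
CRITICAL_DF1_ALPHA_05 = 3.841  # table value for df=1, alpha=0.05
reject_null = stat > CRITICAL_DF1_ALPHA_05

print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {reject_null}")
```

Every expected count here is 25, so the "at least 5 per cell" assumption is met.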

t test - Correct Answer - also referred to as Student's t test (Gosset)

assess whether the means of two groups are statistically different from each other

appropriate when comparing means of two groups (e.g., posttest only two-group design)

based on the distributions of means of randomly drawn samples from a normally distributed
population (t distributions, which are based on sample size and vary according to degrees of freedom)

types of t tests - Correct Answer - parametric, test of choice when comparing two groups and data are
interval or ratio (continuously scored)

Mann-Whitney U and Wilcoxon are non-parametric equivalents of the t test and are to be used when
data are ordinal or not normally distributed

independent (unpaired)

dependent (paired): repeated measures, matched-pairs sample

independent (unpaired) t test - Correct Answer - used to examine differences between two unrelated
groups that are often formed by the random assignment of subjects to the treatment and control
groups--to qualify for independence: the groups must be selected from two different settings, two
different points in time, or not matched in any way

dependent (paired) - Correct Answer - consists of a sample of matched pairs or one group that has been
tested twice, such as the repeated measures t test--study participants are tested prior to and after a
treatment such as the administration of a medication to treat high BP--each patient is their own control

matched-pairs sample: consists of groups that were matched as part of the study design to ensure
similarities between the two groups--may be used in an observational study as a way to reduce or
eliminate the effects of confounding factors

type of data required for t test - Correct Answer - independent (explanatory) variable: nominal (group) -
2 levels or 2 groups

dependent (response) variable: interval or ratio, ordinal if 11 or more dichotomous levels (must be
continuously scored)

independent t test assumptions - Correct Answer - independent variable is categorical, contains 2
levels: 2 mutually exclusive groups (if violated, a paired t test can be used)

distribution of the dependent variable is normal (if violated, try the non-parametric Mann Whitney U or
Wilcoxon tests)

variances of dependent variable for both groups are similar (homogeneity of variance)--assumption
applied by null hypothesis that groups are from a single population (protects us against type II errors)

research question for application of t test - Correct Answer - what is the probability of getting a
difference this large by chance alone?

inferential statistics test the null hypothesis that any difference that occurs between the two groups is a
difference in the sampling distribution

two-tailed test - Correct Answer - null hypothesis: no difference between population means

alternative hypothesis: there is indeed a difference

an extreme value on either side of the sampling distribution would cause a researcher to reject the null
hypothesis

one-tailed tests - Correct Answer - an extreme value on only one side of the sampling distribution would
cause the researcher to reject the null hypothesis

t test and p value - Correct Answer - null hypothesis can be rejected when p<0.05

considerations of variability in testing difference between means - Correct Answer - the larger the
difference in means, the more likely the t test will be significant

however, variability and sample size also influence the outcome

you would be more likely to find statistically significant differences between groups with the lowest
variability

so need to consider sample size and variability of spread when judging the difference between mean
scores

standard error - Correct Answer - denominator in t test

estimate of the standard deviation of the sampling distribution of the difference between means

represents pooled variance of both groups

larger the standard error, less likely the difference will be significant

a smaller denominator is associated with a larger t statistic so we want our variance to be as small as
possible and our sample size as large as possible

degrees of freedom in the t test - Correct Answer - N - 2

sum of the persons in both groups=N
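
As a rough sketch of how the standard error and degrees of freedom enter the independent t test, the pooled-variance calculation can be written out by hand. The function name `independent_t` and the anxiety scores are hypothetical, and the sketch assumes homogeneity of variance (it pools the two group variances).

```python
import math
import statistics

def independent_t(group1, group2):
    """Return (t, standard_error, df) for a pooled-variance independent t test."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)     # sample variances (n - 1 denominator)
    var2 = statistics.variance(group2)
    df = n1 + n2 - 2                       # N - 2, where N = persons in both groups
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between means: the denominator of t
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    return t, se, df

# Hypothetical anxiety scores (higher = more anxious)
control = [6, 7, 8, 9, 10]
intervention = [3, 4, 5, 6, 7]
t, se, df = independent_t(control, intervention)
print(t, se, df)
```

With the same mean difference, a smaller pooled variance or a larger sample shrinks `se` and so increases `t`.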

research question for t test - Correct Answer - do children who receive a distraction intervention differ
in reported level of anxiety from children who do not receive the intervention following an inoculation
procedure?

power and t - Correct Answer - determine one or two-tailed test

one greater than the other=1 tail

difference between groups=2 tail

decide on alpha or probability level (e.g., 0.05)

Cohen (1987) provides tables for determining sample size based on power and effect size

power analysis - Correct Answer - power=probability of rejecting the null hypothesis when it is false
(i.e., detecting a difference that really exists)

power of 0.80 means that there is an 80% chance of rejecting a false null hypothesis

more subjects are needed to increase power

power of 0.80 is reasonable in nursing and related behavioral health studies
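
For illustration, the link between alpha, power, effect size, and sample size can be sketched with a normal approximation. This is an assumption of the sketch, not the method behind Cohen's tables, which are based on the t distribution and give slightly larger values. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate subjects needed per group for a two-group comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2.
    """
    z = NormalDist()                               # standard normal distribution
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = z.inv_cdf(1 - tail_alpha)            # critical value for chosen alpha
    z_power = z.inv_cdf(power)                     # value corresponding to desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(n_per_group(0.5))                    # moderate effect, two-tailed
print(n_per_group(0.5, two_tailed=False))  # moderate effect, one-tailed
print(n_per_group(0.5, power=0.90))        # more power requires more subjects
```

The one-tailed call needs fewer subjects than the two-tailed call, and raising power from 0.80 to 0.90 raises the required sample size.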

effect size for t test - Correct Answer - should be based on previous work if it exists, rather than relying
on Cohen's tables

if unavailable, consult Cohen 1987 tables for moderate effect

effect size is simply the difference between the two means divided by the SD for the measure

Cohen's moderate effect size=0.5, or half a SD
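
A short sketch of the effect size calculation follows. The notes say only "the SD for the measure"; using the pooled SD of the two groups is an assumption made here, and the function name `cohens_d` and the scores are hypothetical.

```python
import math
import statistics

def cohens_d(group1, group2):
    """Difference between the two means divided by the pooled SD (an assumption)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / (n1 + n2 - 2)
    return (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(pooled_var)

# Hypothetical scores from previous work
d = cohens_d([6, 7, 8, 9, 10], [3, 4, 5, 6, 7])
print(round(d, 2))  # compare with 0.5, Cohen's moderate effect
```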

confidence intervals - Correct Answer - 95% CI means that you can be 95% sure that it includes the true
population mean

calculated using the mean difference, SE, confidence level that we set as researchers

look up t in distribution table
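
The calculation can be sketched as mean difference plus or minus (critical t x SE). The function name `mean_diff_ci` is hypothetical, and the critical value is passed in by hand because it comes from the t table: 2.306 is the two-tailed 0.05 value for df = 8.

```python
def mean_diff_ci(mean_diff, se, t_crit):
    """Confidence interval for a mean difference: mean_diff +/- t_crit * SE."""
    margin = t_crit * se
    return mean_diff - margin, mean_diff + margin

# Hypothetical result: mean difference of 3.0, SE of 1.0, df = 8
lower, upper = mean_diff_ci(3.0, 1.0, t_crit=2.306)
print(round(lower, 3), round(upper, 3))
```

If the interval for the mean difference excludes 0, the corresponding two-tailed t test is significant at the same alpha.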

examples of paired t test analysis and independent t test analysis - Correct Answer - paired:

H0: participation in the BFS intervention will not lead to changes in parental self-efficacy, coping, shared
management, depressive symptoms, and QOL

H1: participation in the BFS intervention will lead to changes in....

independent:

H0: participants who receive the BFS intervention will not differ from participants who do not receive
the BFS intervention in reported scores of parental self-efficacy, coping, shared management, depressive
symptoms, and QOL

H1: participants who receive the BFS intervention will differ from participants who do not receive the
BFS intervention in reported scores of self-efficacy....

questions and hypotheses that ANOVA addresses - Correct Answer - ex: the main hypotheses were
supplementation with gum arabic or psyllium improved stool consistency and decreased the proportion
of incontinent stools compared to placebo (more followed)

grouping variable=mutually exclusive

independent variable is measured at the nominal level (treatment group)

dependent variables: stool consistency and proportion of incontinent stools

ANOVA - Correct Answer - robust test

used to compare response variable by 2 or more groups

minimizes risk of type I error by examining differences across all groups at once

examines variance to determine if group means differ

why would you not just do multiple t tests? (see above)

sample data are used to compute a test statistic - the F ratio

F ratio is compared to scores (F-ratio values) in a sampling distribution (table) developed by
statisticians--these of course will be reported via a statistical package in a data run

degrees of freedom and p value are used to read the appropriate F ratio value

if the sampling distribution F ratio value is larger than your test statistic - the null hypothesis is accepted

conversely, if your test statistic is larger than the value (falls w/in the rejection range of the sampling
distribution) the null hypothesis is rejected and the research hypothesis accepted

the problem with running multiple t tests - Correct Answer - rate of Type I errors increases

the more tests you perform, the more likely it is that some will be significant by chance alone
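
To make the inflation concrete, the chance of at least one false positive across m tests can be approximated as 1 - (1 - alpha)^m. This sketch assumes the tests are independent, which pairwise t tests on the same groups are not, so treat the numbers as a rough guide. The function name `familywise_error` is hypothetical.

```python
from math import comb

def familywise_error(k_groups, alpha=0.05):
    """Return (number of pairwise t tests, approximate chance of >= 1 Type I error)."""
    m = comb(k_groups, 2)                  # every pair of groups gets its own t test
    return m, 1 - (1 - alpha) ** m         # assumes the m tests are independent

for k in (3, 4, 5):
    tests, rate = familywise_error(k)
    print(k, tests, round(rate, 3))
```

With 3 groups there are 3 pairwise tests and the approximate error rate is already about 0.14 rather than 0.05.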

in ANOVA, groups are often called... - Correct Answer - factors

mutually exclusive

you can be in one group and one group only

hypotheses in ANOVA - Correct Answer - H0: u1=u2=u3

HA: u1 doesn't equal u2, u1 doesn't equal u3, u2 doesn't equal u3, or u1 doesn't equal u2 which doesn't
equal u3

what does ANOVA tell us? - Correct Answer - overall, is there a difference between groups? this reduces
the possibility of type I error

doesn't tell us group 1 vs. group 2 and on and on

does this by looking at variance

run a post-hoc test to tell us which groups differ


type of data required for ANOVA - Correct Answer - IV: nominal or ordinal, often called a factor with
different levels

DV: interval or ratio scale/continuous

assumptions for one-way ANOVA - Correct Answer - robust and tends to yield accurate results even if all
of the assumptions aren't met

same as those for the t test

1. appropriate level of measurement - IV and DV

2. IV - groups are mutually exclusive

3. DV has a normal distribution

4. DV - homogeneity of variance - or the groups have equal variances (standard deviation around those
means for each group has to be relatively similar)

if they are significantly violated: you can run the Kruskal-Wallis (compares by rank and not mean scores,
non-parametric--also tells you if there is a statistically significant difference between groups and then
you can run pairwise comparisons)

analysis of variance: the logic - Correct Answer - F statistic=between-group variability (BGV)/within-group
variability (WGV)

BGV=difference between the means of each group

WGV=variability of scores within each group

F=(effect of IV + sampling error)/sampling error

calculating ANOVA - Correct Answer - sums of squares for WGV

sums of squares for BGV

sums of squares for total variation


WG-SS - Correct Answer - each score in each group is used to determine its deviation from the group
mean

the deviation is squared and all deviations from each group are added to get the SS-WG

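BG-SS - Correct Answer - take the deviation of each group mean from the grand mean, square it, multiply
it by the number of subjects in that group, and add the results across groups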

conceptually, ANOVA... - Correct Answer - compares the WG-SS to the BG-SS

by the way - the total SS is simply the WG-SS and the BG-SS added

MS calculation - Correct Answer - mean square (the sum of squares averaged over its degrees of freedom)

SS/df

MS for WG-SS=WG-SS/(N-k), where N=the total number of subjects and k=the number of groups

MS for BG-SS=BG-SS/(k-1)
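
The sums of squares, mean squares, and F ratio described above can be sketched by hand as follows. The function name `one_way_anova` and the scores are hypothetical; a statistical package would also report the p value.

```python
import statistics

def one_way_anova(groups):
    """Compute SS, MS, and F for a one-way ANOVA from a list of score lists."""
    scores = [x for g in groups for x in g]
    n_total, k = len(scores), len(groups)
    grand_mean = statistics.mean(scores)
    # BG-SS: squared deviation of each group mean from the grand mean, weighted by group size
    ss_bg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # WG-SS: squared deviation of each score from its own group mean
    ss_wg = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    ms_bg = ss_bg / (k - 1)                # MS = SS / df
    ms_wg = ss_wg / (n_total - k)
    return {
        "ss_bg": ss_bg,
        "ss_wg": ss_wg,
        "ss_total": ss_bg + ss_wg,         # total SS is WG-SS and BG-SS added
        "ms_bg": ms_bg,
        "ms_wg": ms_wg,
        "F": ms_bg / ms_wg,
    }

# Hypothetical scores for three mutually exclusive groups
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

Here the F ratio would be read against the F table with df = (k - 1, N - k) = (2, 6).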

ANOVA logic - Correct Answer - if group means are equal - there is no between group variability - that
is, the BGV=0 and the F ratio will be 0

if the group mean scores are different, the question is whether the differences are b/c the population
means are different or the difference is due to random chance

if you find overall significance (and only if).... - Correct Answer - you need to conduct a post-hoc test

two commonly used tests are the Scheffe and the Tukey

determining sample size and power - Correct Answer - as with the t test, using expected effect size,
desired power, and the alpha level, sample size (# of subjects per group) can be determined

statistical packages as well as tables developed by Cohen are available

RANOVA - Correct Answer - repeated measures ANOVA

the paired t test extended to more than two measurements (in the paired t test, the same people are
measured twice)

same concept: the same people are measured over time on more than 2 occasions

RANOVA as a "within subjects" design - Correct Answer - 1. repeated measure of same variable
over time (BP response to treatment)

2. exposing all subjects to all treatments - so each subject serves as his/her own control

can be powerful b/c controls extraneous variables

examples of exposing subjects to all treatments - Correct Answer - accurate temps in critically ill
patients (Swan-Ganz, rectal, axillary, tympanic)--run overall test (F ratio), then post hoc tests if
statistically significant

effectiveness of 3 treatments (Tai Chi, relaxation, meditation) in relieving chronic "stable" pain in
patients with osteoarthritis--assign treatment in random order, allow a "wash-out" period (more
important with meds--always possibility of latency effect and what they learned in first treatment they
apply to second treatment, they may know the questions you're asking, etc.), economy of scale (number
of patients)

following patients over time - Correct Answer - used to see how patients do over time following an illness
event

may compare different groups in examining illness trajectory

assumptions of RANOVA - Correct Answer - similar to one-way ANOVA except for compound symmetry

1. correlations between DVs are about the same

2. variances of DVs are equal across measures (essentially homogeneity of variance)

if these assumptions are violated: Friedman test (non-parametric)--looks at rank ordering

mixed design - Correct Answer - combines ANOVA and RANOVA

commonly used in healthcare


two independent variables: (1) 1 variable=mutually exclusive groups, (2) 1 variable - repeated measures
= usually time (i.e., 1 week, 2 weeks, 3 weeks, etc.)

one dependent variable

results (get 3): main effect by treatment, main effect by time, interaction of treatment x time

ANCOVA - Correct Answer - combination of multiple regression and ANOVA to measure differences in
group means

helps reduce error variance

error variance is reduced by controlling for variation in DV that comes from extraneous variable(s)

allows you to remove the influence of that extraneous variable

Stage 1: covariate (regression piece): the variable that you want to control or the variable that you want
to remove in terms of influencing the score of the DV

Stage 2: using an ANOVA approach, the variance that remains in the DV is explained

co-variates - Correct Answer - pre-score

intervening variable

assumptions of RANOVA - Correct Answer -
