N580 Exams With Correct Answer1
statistics vs. parameters - Correct Answer - analysis on a sample vs. analysis on the entire population
inferential: hope to generalize from a sample to a population (based on probability)--we use parametric
and non-parametric statistics
parametric tests and non-parametric tests - Correct Answer - every statistical test has assumptions that
must be met
parametric tests have more stringent assumptions--two of the most critical assumptions: normal
distribution of the data and level of measurement (must be interval-like in nature)
we like parametric tests b/c they are more powerful and more likely to find a difference if there is one
analysis - Correct Answer - clearly the problem is the driving force--the sophistication of analysis will
never compensate for an insignificant problem
always remember the research question or hypothesis always drives the analysis
look at the hypothesis or question--should be able to begin to think about the type of analysis that
would be appropriate
when you are beginning to think about statistical tests... - Correct Answer - you must look at what drives
the tests-->the research question and the hypothesis--what are the variables and how are they
measured? what is the level of measurement?
if the question focuses on describing some phenomena - Correct Answer - N & % (if data are nominal or
categorical) or descriptive statistics (if measures are interval-like in nature--mean, standard deviation,
range)
if the research question or hypothesis is interested in a relationship between two variables - Correct
Answer - correlation
if the research question or hypothesis is interested in a difference between two independent groups -
Correct Answer - the independent t test (parametric) or the Mann Whitney U (if data weren't normally
distributed)
if the research question or hypothesis is interested in a difference between two or more independent
groups - Correct Answer - ANOVA (parametric) or the Kruskal Wallis (if data aren't normally distributed
or if measure is ordinal)
if the research question or hypothesis is interested in differences in two groups that are dependent
(repeated measures) - Correct Answer - the paired t test (parametric) or Wilcoxon (non-parametric)
if the research question or hypothesis is interested in the difference in two or more groups that are
dependent - Correct Answer - RANOVA (parametric) or Friedman (non-parametric)
how can you choose? - Correct Answer - what is the research question or hypothesis asking?
are the assumptions of the statistical test met, particularly that of normal distribution?
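the group-difference cards above can be collapsed into a small lookup. This is only a sketch of the notes' own decision rules: the function name, the table name, and the reduction of "assumptions met" to a single normal-distribution flag are assumptions made for illustration, not a substitute for checking every assumption of a test.

```python
# Hypothetical lookup built from the cards above: (group count, dependent?) ->
# (parametric test, non-parametric alternative). All names are illustrative.
TESTS = {
    ("two", False): ("independent t test", "Mann-Whitney U"),
    ("many", False): ("ANOVA", "Kruskal-Wallis"),
    ("two", True): ("paired t test", "Wilcoxon"),
    ("many", True): ("RANOVA", "Friedman"),
}


def choose_test(groups: int, dependent: bool, normal: bool) -> str:
    """Return the test the notes name for a difference-between-groups question."""
    if groups < 2:
        raise ValueError("need at least two groups to test a difference")
    key = ("two" if groups == 2 else "many", dependent)
    parametric, non_parametric = TESTS[key]
    # Simplification: normality stands in for "assumptions are met".
    return parametric if normal else non_parametric


print(choose_test(groups=2, dependent=False, normal=True))
print(choose_test(groups=3, dependent=True, normal=False))
```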
knowledge of statistics is necessary to all who read or conduct research - Correct Answer - research:
making observations or measurements on people, things, or events to answer a research question
statistics: a set of procedures for describing those measurements--how do we quantify and report these
measurements?
variables are also classified as... - Correct Answer - discrete: finite number of values (almost like you can
count)--obtained by counting
statistical methods are sometimes described by the number of variables in the analysis - Correct Answer
- univariate: average cholesterol
some things are more difficult to measure than others--consider temperature vs. self-efficacy
level of measurement - Correct Answer - influences what statistical tests can be chosen
4 basic levels:
nominal
ordinal
interval
ratio
want to know about it when collecting data, when analyzing data (level of measurement will influence
statistical test you can select)
always measure at the highest level realistically possible: more powerful test, greater flexibility, more
info
nominal - Correct Answer - category: categorical variable (M/F/transgender, race, etc.--can go in one
but not the other)
if these are the dependent variable: they are reported by n and %, mode
also used in: grouping (study has 3 independent interventions [groups] looking at an independent
variable), logistic regression (results in odds ratios--dependent variable is categorical--given a certain
treatment, what are the odds the patient is dead or alive)
non-parametric
how are numbers assigned to nominal level data for statistical analysis? - Correct Answer - blood type
1=O+
2=A+
3=B+
4=AB+
and so forth
think about gender, race, marital status--how would you assign numbers to levels of these variables?
ordinal - Correct Answer - describes an amount of some attribute but there is not an equivalent distance between each number (so
really is ranking)
think about track finishes--not an equivalent amount between 1st place and 2nd place, 2nd place and
3rd place, and so on
academic rank
if there are 11 or so measures and if the scores are well-distributed, then parametric tests can be used
interval - Correct Answer - amount of an attribute, equal distance between each number in terms of
amount (no absolute 0 value)
can be used in parametric tests so dependent variable is interval-like in nature and there is a normal
distribution
ratio data (absolute 0 value) allow comparative statements (twenty pounds is twice as heavy as ten pounds)
don't have to distinguish between ratio and interval data because statistically they are treated the same
identifying the characteristics of data - Correct Answer - can the data be put in order? no=nominal
yes=metric
do the data come from measuring or counting things? if measuring then continuous data, if counting
then discrete data
quantitative research - Correct Answer - descriptive or inferential
what type of research questions are answered by descriptive statistics - Correct Answer - try to describe
an event/phenomenon that we don't know a lot about
what are the most frequently perceived benefits of and barriers to medication and dietary compliance
among two samples of patients with heart failure?
what is the infant mortality rate among US immigrants from Iraq and Afghanistan?
In addition, descriptive statistics are always used in describing the research sample and should always be
presented in regard to outcome variables
when you read results you always want to read... - Correct Answer - what were the values in the
outcome variables
not generalizable
how are descriptive statistics reported? - Correct Answer - nominal data - N, %, and mode
gender
race
N&%
making sense of raw data or how do we first look at data? - Correct Answer - frequency distributions
cumulative percent - percentage for given score combined with all percentages that preceded it
as you collect data, you want to run frequency distribution to get a sense of what data look like
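a frequency distribution with cumulative percent can be built in a few lines. The sketch below uses made-up scores; the variable names are illustrative only.

```python
from collections import Counter

# Made-up scores on a 1-5 scale, for illustration only.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

counts = Counter(scores)
n = len(scores)

table = []
cumulative = 0.0
for value in sorted(counts):
    percent = 100 * counts[value] / n
    # cumulative percent = this score's percent plus all that preceded it
    cumulative += percent
    table.append((value, counts[value], percent, cumulative))

for value, frequency, percent, cum_percent in table:
    print(f"{value:>5} {frequency:>5} {percent:>7.1f} {cum_percent:>7.1f}")
```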
frequency distributions - Correct Answer - can be grouped into class intervals/sets (useful for large
samples)
displayed graphically: bar graph (nominal data), pie chart, histograms (interval/ratio data), frequency
polygon
box and whisker plot could also be used (dark line in middle of block=50th percentile, bottom of
box=25th percentile, top=75th percentile--will show outliers)
measures of dispersion
measures of distribution
sometimes used for ordinal data (controversial)--if there are sufficient number of measures
median=middle value--score wherein 50% of scores are above and 50% are below--can be used when
data are really skewed
mean=avg value--add all values in a distribution and divide by total number of values--used with interval
or ratio data
50th percentile
position of the median in the ordered scores = (N+1)/2
even if you have nominal data you can report the mode
don't usually see it reported in research studies--if it is, it's reported to make a point
perfect normal distribution - Correct Answer - mean, median, mode are the same
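the three measures of central tendency can be compared on one small data set. The scores below are made up and include one extreme value, to show that the mean is pulled toward it while the median is not.

```python
import statistics

# Made-up scores; 25 is a deliberately extreme value.
scores = [2, 3, 3, 4, 5, 7, 25]

mean = statistics.mean(scores)      # add all values, divide by the number of values
median = statistics.median(scores)  # middle value of the ordered scores
mode = statistics.mode(scores)      # most frequently occurring value

# Position of the median in the ordered list: (N + 1) / 2
median_position = (len(scores) + 1) / 2

print(mean, median, mode, median_position)
```

with these skewed scores the mean (7) sits above the median (4); in a perfectly normal distribution the three values would coincide.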
measures of dispersion, variability, or scatter - Correct Answer - nothing in statistics that is more
important than variability
standard deviation - Correct Answer - most commonly reported measure of variability - avg. amount by
which scores vary around the mean
like the mean it is sensitive to extreme values and like the mean it is algebraic
normal distribution - Correct Answer - 68% of scores will fall within 1 SD of mean
variance: the squared SD
based on percentages
interquartile range: Q3-Q1
authors won't report if data aren't skewed but will report if they are
skew: named in direction of tail (positive skew, negative skew)--mean and standard deviation won't be
appropriate
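the measures of variability mentioned in these cards (standard deviation, the squared SD, and Q3-Q1) can be computed with Python's statistics module. The data are made up, and the sample (n-1) formulas are an assumption; quartile values also depend on the interpolation method a given package uses.

```python
import statistics

# Made-up sample of scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

sd = statistics.stdev(scores)           # sample standard deviation (n - 1 denominator)
variance = statistics.variance(scores)  # the squared SD
value_range = max(scores) - min(scores)

# Quartiles: quantiles(n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                           # interquartile range, Q3 - Q1

print(round(sd, 3), round(variance, 3), value_range, iqr)
```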
inferential statistics - Correct Answer - techniques used to generalize from characteristics of a small group
to a larger group that is unmeasured, i.e. the researcher via inferential statistics is able to infer the
characteristics of the larger group from the measured characteristics of the smaller group
allows researchers to draw conclusions beyond their sample to the larger population
sample - Correct Answer - any number of cases less than the total number of cases in the population
from which it was drawn
to generalize inferential statistics... - Correct Answer - select a sample from the population
statistical hypothesis
inferential statistics ask how probable is it that a result like the one observed would occur if the null hypothesis were true?
statistical inference - Correct Answer - a researcher uses probability tables (distributions developed for
particular test statistics) to determine how likely it is an outcome or difference occurred by chance alone
(error)
how are probability tables constructed? - Correct Answer - normally distributed variables
sampling distributions
sampling error
sample statistics mean - Correct Answer - often different from the population mean due to sampling
error
amount of sampling error=the difference between the population mean and the sample mean
unfortunately, we generally don't know the population mean--if we did, why would we bother with a
sample
central limit theorem - Correct Answer - means of repeated samples of sufficient size will form an approximately normal curve, even when the population itself isn't normally distributed
in the central limit theorem: the mean of sampling distribution equals... - Correct Answer - the
population mean so the average sampling error would always be zero
because sampling distributions usually approximate a normal curve, probability statements about the
likelihood of finding any given sample mean can be made--first we need to know the SD of the sampling
distribution of the means (SEM)
also, SD of a sampling distribution=SEM or standard error of the mean (standard=average amount of error for
all possible samples--error=various sample means contain some error in estimating the population
mean)
SEM - Correct Answer - quantifies how precisely you know the true mean of the population--it takes
into account both the value of the SD and the sample size
this makes sense, because the mean of a large sample is likely to be closer to the true population mean
than is the mean of a small sample
with a huge sample you'll know the value of the mean with a lot of precision even if the data are very
scattered
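the SEM is the sample SD divided by the square root of the sample size, which is why a larger sample gives a more precise estimate of the mean. A minimal sketch with made-up data (the function name is illustrative):

```python
import math
import statistics


def sem(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


small = [4, 8, 6, 10]   # made-up scores, n = 4
large = small * 25      # the same values repeated, n = 100

print(round(sem(small), 3))
print(round(sem(large), 3))  # same scatter, much smaller SEM
```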
using SEM: point estimate vs. interval estimate - Correct Answer - point estimate vs. interval estimate
95% confidence interval - Correct Answer - sample mean + or - (1.96 x SEM)
99% confidence interval - Correct Answer - sample mean + or - (2.58 x SEM)
confidence interval - Correct Answer - a range of values around the sample mean that is expected, at the stated level of confidence, to contain the population mean
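the interval formulas above translate directly into code. This sketch uses made-up blood pressure readings and the 1.96 and 2.58 multipliers from the cards; with a sample this small a t-based multiplier would normally be used instead, so treat the numbers as illustration only.

```python
import math
import statistics


def confidence_interval(sample, z=1.96):
    """Return (lower, upper) as sample mean +/- z * SEM."""
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - z * sem, mean + z * sem


# Made-up systolic blood pressure readings.
readings = [120, 125, 130, 135, 140]

lo95, hi95 = confidence_interval(readings, z=1.96)
lo99, hi99 = confidence_interval(readings, z=2.58)

print(round(lo95, 2), round(hi95, 2))
print(round(lo99, 2), round(hi99, 2))  # wider interval for higher confidence
```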
P values and alpha levels - Correct Answer - alpha value is set by researcher prior to data collection
p value: reported by statistical test after the data are collected and analyzed
alpha is usually set at 0.05; in pilot studies the researcher may set the alpha at 0.10 and in
studies where treatment has serious side effects may set the value at 0.01
effect size - Correct Answer - measure that describes the magnitude of the difference between two
groups
in general, effect sizes tell us how many standard deviations difference there is between the means of
the intervention (treatment) and comparison conditions; for example, an effect size of 0.25 indicates
that the treatment group outperformed the comparison group by a quarter of a standard deviation
meaning of effect size varies by context and statistical test, but standard interpretation offered by Cohen
is:
0.2=small (2/10 of a SD unit)
0.5=medium (5/10 of a SD unit)
0.8=large (8/10 of a SD unit)
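an effect size expressed in SD units can be computed as the difference in group means divided by a pooled SD (Cohen's d). The pooled-SD form and the made-up scores below are assumptions; other effect size measures exist for other tests.

```python
import math
import statistics


def cohens_d(treatment, comparison):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(comparison)
    v1 = statistics.variance(treatment)
    v2 = statistics.variance(comparison)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(treatment) - statistics.mean(comparison)) / pooled_sd


# Made-up outcome scores for two groups.
treatment = [12, 14, 16, 18, 20]
comparison = [10, 12, 14, 16, 18]

d = cohens_d(treatment, comparison)
print(round(d, 3))  # treatment mean is about 0.63 SD above the comparison mean
```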
steps in hypothesis testing - Correct Answer - 1. state the hypothesis (usually Ha not H0)
2. select the alpha or level of significance (most commonly set at 0.05 or 0.01)
3. selecting the appropriate test and computing the calculated value of the test statistic (assumptions of
the statistical test are met)
4. compare the calculated value to the critical value (critical value=value that must be obtained r/t alpha
value selected)--a statistical table with sampling distributions for the statistical test you are using will
give you critical values--and the statistical run will give you the exact statistic value and the resulting p
value
one tail vs. two tail test - Correct Answer - one tail=directional hypothesis
possible errors in hypothesis testing - Correct Answer - researcher finds the H0 is: true and H0 is really
true: correct
controlling risk of error - Correct Answer - Type I: controlled by the alpha level the researcher sets--consider alpha=0.10, 0.05,
0.01, 0.001
Type II: beta=the probability of committing a Type II error--can be decreased by increasing the power (1-beta)
of the statistical test--a power of 80% is generally acceptable--easiest way to increase power=increase
sample size
correlation - Correct Answer - relationship that can be measured mathematically
what is the strength and direction of relationship? - Correct Answer - descriptive statistics describe
sample (many descriptive research questions don't have hypotheses)
inferential statistics test hypotheses about correlations that exist in larger population
types of data required for correlational analyses - Correct Answer - data are numeric
ordinal scales (in most cases--distance between categories is unknown)--ex: military rank,
socioeconomic status, health status
nominal scales (categorical variables) may be recoded, but other factors must be considered when
deciding whether a correlation coefficient is appropriate
positive and negative correlation - Correct Answer - positive: ex: as number of hours of study
increases, final exam grade increases
negative: ex: as number of hours exercised decreases, there is an increase in LDL level
correlation assumptions - Correct Answer - 1. relationship is linear: beware that important non-linear
relationships can be overlooked with simple correlation analysis
2. data are normally distributed
correlation as a parametric test - Correct Answer - 1. assumes data are normally distributed
non-parametric equivalent - Correct Answer - can be used when correlation assumptions are not met
Kendall's Tau: assign ranks to ordinal levels and then compare relationship between variables
values of r ranging from -1.00 to +1.00 with strongest relationships at either end
Cohen defines effects as: small (r=0.10), moderate (r=0.30), and large (r>=0.50)
R^2 represents the coefficient of determination and is a measure of the amount of shared variance
between the two variables--high R^2 signifies much of the variance in the dependent variable is
explained by the independent variable
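r and R^2 can be computed by hand from the deviations around each mean. The sketch below uses made-up study-hours and exam-grade data to mirror the positive-correlation example; the function name is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient computed from deviations around the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: hours studied and final exam grade.
hours_studied = [1, 2, 3, 4, 5]
exam_grade = [55, 60, 70, 75, 90]

r = pearson_r(hours_studied, exam_grade)
r_squared = r ** 2  # coefficient of determination: proportion of shared variance

print(round(r, 3), round(r_squared, 3))
```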
regression - Correct Answer - statistical technique that makes use of correlation between variables and
application of a straight line to develop a prediction equation
used to predict the score of a dependent variable using score(s) from the independent variable(s)
research questions r/t regression - Correct Answer - does an independent variable(s) X significantly
predict the scores on a dependent variable Y?
if so, what is the predicted score for a dependent variable based on the score(s) on an independent
variable(s)?
simple linear regression - Correct Answer - uses one independent variable and one dependent variable
multiple linear regression - Correct Answer - more than one IV and one DV
uses the score of one variable to predict the score on the other AND controls for other factors that
might influence or confound the relationship being studied
logistic regression - Correct Answer - uses a binary DV (0, 1) with one or more IVs
regression assumptions - Correct Answer - 1. the relationship is linear--beware that important non-
linear relationships can be overlooked with simple regression analysis
y=DV
x=IV
a=intercept
b=gradient or slope
simple linear regression model - Correct Answer - regression quantifies a linear relationship that
minimizes variation in predicted vs. observed values around the regression line: observed y=a+bx+e, predicted y'=a+bx (e=error or residual)
y'=predicted value of DV
a=intercept
x=IV (predictor)
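the intercept a and slope b of a simple linear regression can be found by ordinary least squares. The sketch below uses made-up data and illustrative function names; it shows the arithmetic only and performs no significance testing.

```python
def fit_line(x, y):
    """Least-squares intercept (a) and slope (b) for y' = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, x_value):
    """Predicted DV score y' for a given IV score."""
    return a + b * x_value


# Made-up data: hours studied (IV) and exam grade (DV).
hours = [1, 2, 3, 4, 5]
grade = [55, 60, 70, 75, 90]

a, b = fit_line(hours, grade)
print(a, b)              # intercept 44.5, slope 8.5
print(predict(a, b, 3))  # predicted grade for 3 hours of study
```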
multiple linear regression model - Correct Answer - possible when there is a measurable multiple
correlation between a group of predictor variables and one DV
y'=predicted value of DV
a=intercept
x=value of IV (predictor)
significance testing - Correct Answer - simple linear regression: correlation is tested (T test statistic)--
R^2 represents meaningfulness (coefficient of determination)
multiple linear regression: multiple correlations tested (F test statistic)--individual beta weights tested (T
test statistic)--testing the beta weights tells us if the IV is contributing significantly to the variance
accounted for in the DV--R^2 significance is tested (F test statistics, ANOVA)
what does a significant t or F statistic tell us? - Correct Answer - IV (X) significantly predicts value of DV
(Y')
a good fit
what if the t or F statistic is not significant? - Correct Answer - X is not associated with Y' or there is a
non-linear relationship (i.e. model is not a good fit for data)
what does a significant beta coefficient tell us? - Correct Answer - amount Y' increases for each unit
change in X
what does R^2 tell us? - Correct Answer - proportion of total variation in Y that is attributable to model terms
parametric vs. non-parametric tests - Correct Answer - parametric: greater power, Type II error is less
likely--data are normally distributed--usually ratio/interval data
one of the most frequently used non-parametric tests; doesn't have a parametric equivalent
best test for comparing two groups - Correct Answer - t test (parametric)
(non-parametric)
best test for comparing 3 or more groups - Correct Answer - ANOVA (parametric)
if data are nominal or ordinal and you are comparing 3 or more groups - Correct Answer - Chi square
(nominal) or Kruskal-Wallis H (ordinal)
(non-parametric)
best test for dependent groups - Correct Answer - paired t tests (parametric)
if data are nominal or ordinal and you are comparing dependent groups - Correct Answer - McNemar
(nominal) or Wilcoxon (ordinal)
(non-parametric)
single sample
how well observed values of a single categorical variable match with values expected of a theoretical
model
used as a tool to assess for potential bias in a study sample (such as in table 1 about demographic
characteristics)
used to determine if levels of two categorical variables are independent of one another
Chi squared assumptions - Correct Answer - frequency data (representing a count of the number of
study participants that meet a certain condition--expected frequency count of at least 5 participants in
each cell is used)
measures are independent of each other (mutually exclusive categories--an individual can only be
counted once)
theoretical basis for variable categorization (to ensure analysis will be meaningful)
research question for Chi squared - Correct Answer - does the expected frequency differ significantly
from the actual or observed frequency?
inferential statistics test the null hypotheses that variables are independent of each other
ex: are men more likely than women to quit smoking following their participation in a smoking cessation
intervention? - Correct Answer - variables: gender, smoking status
null hypothesis: gender and smoking status are independent of each other
contingency table - Correct Answer - contains frequency data on the variables that we are studying
look for the p value in the article or computer printout to see if P<0.05
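the Chi squared statistic compares observed counts in a contingency table with the counts expected if the two variables were independent. The 2x2 table below is made up to echo the smoking cessation example; the sketch applies no continuity correction and compares the result with 3.841, the critical value for df=1 at alpha=0.05.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Made-up counts. Rows: men, women. Columns: quit, did not quit.
observed = [[30, 20],
            [20, 30]]

statistic, df = chi_square(observed)
CRITICAL_DF1_ALPHA_05 = 3.841
print(statistic, df, statistic > CRITICAL_DF1_ALPHA_05)
```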
assess whether the means of two groups are statistically different from each other
appropriate when comparing means of two groups (e.g., posttest only two-group design)
describes a set of distributions of means of randomly drawn samples from a normally distributed
population (t distributions, based on sample size and vary according to degrees of freedom)
types of t tests - Correct Answer - parametric, test of choice when comparing two groups and data are
interval or ratio (continuously scored)
Mann-Whitney U and Wilcoxon are non-parametric equivalents of the t test and are to be used when
data are ordinal or not normally distributed
independent (unpaired)
independent (unpaired) t test - Correct Answer - used to examine differences between two unrelated
groups that are often formed by the random assignment of subjects to the treatment and control
groups--to qualify for independence: the groups must be selected from two different settings, two
different points in time, or not matched in any way
dependent (paired) - Correct Answer - consists of a sample of matched pairs or one group that has been
tested twice, such as the repeated measures t test--study participants are tested prior to and after a
treatment such as the administration of a medication to treat high BP--each patient is their own control
matched-pairs sample: consists of groups that were matched as part of the study design to ensure
similarities between the two groups--may be used in an observational study as a way to reduce or
eliminate the effects of confounding factors
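for a dependent (paired) t test the statistic is built from each participant's difference score: the mean difference divided by its standard error. A minimal sketch with made-up before and after blood pressure readings (names are illustrative):

```python
import math
import statistics


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    se_diff = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / se_diff, n - 1


# Made-up systolic BP for the same five patients before and after treatment.
before = [150, 160, 155, 165, 170]
after = [140, 150, 150, 155, 160]

t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)
```

the resulting t would then be compared with the critical value for df=n-1, or read alongside the p value from a statistical package.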
type of data required for t test - Correct Answer - independent (explanatory) variable: nominal (group) -
2 levels or 2 groups
dependent (response) variable: interval or ratio, ordinal if 11 or more levels (must be
continuously scored)
variances of dependent variable for both groups are similar (homogeneity of variance)--assumption
applied by null hypothesis that groups are from a single population (protects us against type II errors)
research question for application of t test - Correct Answer - what is the probability of getting a
difference this large by chance alone?
inferential statistics test the null hypothesis that any difference that occurs between the two groups is a
difference in the sampling distribution
two-tailed test - Correct Answer - null hypothesis: no difference between population means
an extreme value on either side of the sampling distribution would cause a researcher to reject the null
hypothesis
one-tailed tests - Correct Answer - an extreme value on only one side of the sampling distribution would
cause the researcher to reject the null hypothesis
t test and p value - Correct Answer - null hypothesis can be rejected when p<0.05
considerations of variability in testing difference between means - Correct Answer - the larger the
difference in means, the more likely the t test will be significant
you would be more likely to find statistically significant differences between groups with the lowest
variability
so need to consider sample size and variability of spread when judging the difference between mean
scores
larger the standard error, less likely the difference will be significant
a smaller denominator is associated with a larger t statistic so we want our variance to be as small as
possible and our sample size as large as possible
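the role of variability can be seen by computing an independent t statistic for two pairs of made-up groups with the same mean difference but different spread. The sketch assumes the pooled-variance (equal variances) form of the test.

```python
import math
import statistics


def independent_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # the denominator
    t = (statistics.mean(group1) - statistics.mean(group2)) / standard_error
    return t, n1 + n2 - 2


# Both pairs have means of 16 and 14; only the variability differs (made-up data).
t_wide, df = independent_t([12, 14, 16, 18, 20], [10, 12, 14, 16, 18])
t_narrow, _ = independent_t([15, 16, 16, 16, 17], [13, 14, 14, 14, 15])

print(round(t_wide, 3), round(t_narrow, 3), df)
```

the lower-variability pair yields the larger t, even though the difference in means is identical.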
research question for t test - Correct Answer - do children who receive a distraction intervention differ
in reported level of anxiety from children who do not receive the intervention following an inoculation
procedure?
Cohen (1987) provides tables for determining sample size based on power and effect size
power of 0.80 means that there is an 80% chance of rejecting the null hypothesis when it is actually false
effect size for t test - Correct Answer - should be based on previous work if it exists, rather than relying
on Cohen's tables
confidence intervals - Correct Answer - 95% CI means that you can be 95% sure that it includes the true
population mean
calculated using the mean difference, SE, confidence level that we set as researchers
examples of paired t test analysis and independent t test analysis - Correct Answer - paired:
H0: participation in the BFS intervention will not lead to changes in parental self-efficacy, coping, shared
management, depressive symptoms, and QOL
independent:
H0: participants who receive the BFS intervention will not differ from participants who do not receive
the BFS intervention in reported scores of parental self-efficacy, coping, shared management, depressive
symptoms, and QOL
H1: participants who receive the BFS intervention will differ from participants who do not receive the
BFS intervention in reported scores of self-efficacy....
questions and hypotheses that ANOVA addresses - Correct Answer - ex: the main hypotheses were
supplementation with gum arabic or psyllium improved stool consistency and decreased the proportion
of incontinent stools compared to placebo (more followed)
degrees of freedom and p value are used to read the appropriate F ratio value
if the critical F ratio value from the sampling distribution is larger than your test statistic - the null hypothesis is not rejected (retained)
conversely, if your test statistic is larger than the value (falls w/in the rejection range of the sampling
distribution) the null hypothesis is rejected and the research hypothesis accepted
the problem with running multiple t tests - Correct Answer - rate of Type I errors increases
the more tests you perform, the more likely it is that some will be significant by chance alone
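the inflation of Type I error can be estimated as 1-(1-alpha)^k for k tests. This formula assumes the tests are independent, which pairwise t tests on the same groups are not exactly, so treat it as an approximation; function names are illustrative.

```python
def pairwise_comparisons(n_groups):
    """Number of two-group comparisons possible among n_groups groups."""
    return n_groups * (n_groups - 1) // 2


def familywise_error(alpha, n_tests):
    """Probability of at least one Type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


for k_groups in (3, 4, 5):
    n_tests = pairwise_comparisons(k_groups)
    print(k_groups, n_tests, round(familywise_error(0.05, n_tests), 3))
```

with 3 groups (3 pairwise t tests at alpha=0.05) the chance of at least one false positive is already about 14%.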
mutually exclusive
HA: u1 doesn't equal u2, u1 doesn't equal u3, u2 doesn't equal u3, or u1 doesn't equal u2 which doesn't
equal u3
what does ANOVA tell us? - Correct Answer - overall, is there a difference between groups? this reduces
the possibility of type I error
assumptions for one-way ANOVA - Correct Answer - robust and tends to yield accurate results even if all
of the assumptions aren't met
4. DV - homogeneity of variance - or the groups have equal variances (standard deviation around those
means for each group has to be relatively similar)
if they are significantly violated: you can run the Kruskal-Wallis (compares by rank and not mean scores,
non-parametric--also tells you if there is a statistically significant difference between groups and then
you can run pairwise comparisons)
the deviation is squared and all deviations from each group are added to get the WG-SS
BG-SS - Correct Answer - take deviation of group mean from grand mean
by the way - the total SS is simply the WG-SS and the BG-SS added
SS/df
MS for BG-SS=BG-SS/(k-1)
ANOVA logic - Correct Answer - if group means are equal - there is no between group variability - that
is, the BGV=0 and the F ratio will be 0
if the group mean scores are different, the question is whether the differences are b/c the population
means are different or the difference is due to random chance
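the sums of squares, mean squares, and F ratio can be worked through on a small made-up data set. In this sketch MS=SS/df, with df=k-1 between groups and N-k within groups; the function name is illustrative.

```python
def one_way_anova(groups):
    """Return (BG-SS, WG-SS, F ratio) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    k, n_total = len(groups), len(all_scores)

    bg_ss = 0.0  # deviation of each group mean from the grand mean
    wg_ss = 0.0  # deviation of each score from its own group mean
    for group in groups:
        group_mean = sum(group) / len(group)
        bg_ss += len(group) * (group_mean - grand_mean) ** 2
        wg_ss += sum((score - group_mean) ** 2 for score in group)

    ms_bg = bg_ss / (k - 1)
    ms_wg = wg_ss / (n_total - k)
    return bg_ss, wg_ss, ms_bg / ms_wg


# Made-up scores for three independent groups.
bg_ss, wg_ss, f_ratio = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(bg_ss, wg_ss, f_ratio)  # total SS = BG-SS + WG-SS

# Equal group means: no between-group variability, so F is 0.
_, _, f_equal = one_way_anova([[1, 2, 3], [1, 2, 3]])
print(f_equal)
```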
if you find overall significance (and only if).... - Correct Answer - you need to conduct a post-hoc test
two commonly used tests are the Scheffe and the Tukey
determining sample size and power - Correct Answer - as with the t- test, using expected effect size,
desired power, and the alpha level sample size (# of subjects per group) can be determined
same concept: same people measured over time for more than 2 groups
RANOVA as a "within subjects" design - Correct Answer - 1. repeated measure of same variable
over time (BP response to treatment)
2. exposing all subjects to all treatments - so each subject serves as his/her own control
examples of exposing subjects to all treatments - Correct Answer - accurate temps in critically ill
patients (Swan-Ganz, rectal, axillary, tympanic)--run overall test (F ratio), then post hoc tests if
statistically significant
effectiveness of 3 treatments (Tai Chi, relaxation, meditation) in relieving chronic "stable" pain in
patients with osteoarthritis--assign treatment in random order, allow a "wash-out" period (more
important with meds--always possibility of latency effect and what they learned in first treatment they
apply to second treatment, they may know the questions you're asking, etc.), economy of scale (number
of patients)
following patients over time - Correct Answer - used to see how patients do over time following an illness
event
assumptions of RANOVA - Correct Answer - similar to one-way ANOVA except for compound symmetry
results (get 3): main effect by treatment, main effect by time, interaction of treatment x time
ANCOVA - Correct Answer - combination of multiple regression and ANOVA to measure differences in
group means
error variance is reduced by controlling for variation in DV that comes from extraneous variable(s)
Stage 1: covariate (regression piece): the variable that you want to control or the variable that you want
to remove in terms of influencing the score of the DV
Stage 2: using an ANOVA approach, the variance that remains in the DV is explained
intervening variable