Quiz Solutions
QUIZ 1
1. Turkey Problem = Verifying
Science = falsifying
Data = measuring
2. Induction = uncertainty
Deduction = certainty
Science = data
3. Is medicine a science?
a. No, it is primarily concerned with fixing things that are wrong - specifically with healing
the sick - not with a principled understanding of the natural world
4. Key characteristic of science
a. Involves both induction and deduction in an iterative fashion
5. The black swan problem concerns
a. The fact that induction can never deliver certainty
b. The fact that something that has never happened before is not impossible
6. Deduction is
a. A method to make conclusions about specifics from general principles
7. Someone asserts that “Frogs are green”. How would you test this assertion
a. Looking for a frog that is not green
8. You present Nature with the numbers 3, 6 and 9 and ask whether they are members of the
set. Each time, Nature answers that yes, these numbers are members of the set. You conclude that
the rule is: "Consecutive multiples of 3 are members of the set". Which of these is the best
(maximally informative) number to try next in order to test this rule?
a. 2
9. Why are anecdotes not data?
a. Because they do not result from a formal measurement process
b. It is unclear how representative one's subjective experience is, and science strives for
intersubjective truths
c. People have all kinds of cognitive biases - who knows if the anecdotes are even true
10. Is math a science?
a. No. Math is entirely deductive so it doesn’t qualify as a science
11. μ = mean
Σ= sum
σ = standard deviation
12. The fundamental problem with induction is that
a. The conclusions one draws with inductive methods can be wrong
13. What are some challenges to learning statistics
a. Statistics is not intuitive - the brain is not built to do statistics
b. Most people are exposed to statistical concepts only late in their educational career
c. Most people who are teaching statistics are not optimally suited to communicate the
concepts in an engaging way
14. Induction is
a. A method to derive general principles from specific instances
15. Why is statistical literacy increasingly important
a. We have more and more methods to record data.
b. We have better and better ways to analyze data.
c. Data is becoming ever more ubiquitous.
d. All fields of human endeavor are increasingly relying on data for decision making
QUIZ 2
1. Which of the following is a valid description of what a probability is?
a. Relative frequency
b. Degree of belief
c. A number between 0 and 1
d. Quantified plausibility
2. If events A and B are mutually exclusive, the probability of their union is equal to
a. The probability of A added to the probability of B
3. The probability of the intersection of two independent events A and B can be calculated as
a. p(A)*p(B)
4. The probability of all outcomes in a sample space add up to
a. 1
5. The probability of the intersection of A and B is equal to
a. It can’t be determined from this information; it depends on whether A and B are
independent or not
6. You attend the first lecture of a course and estimate the probability that you will get an "A" in this
course. The next lecture, you adjust your estimate of getting an "A" based on the content of the
2nd lecture. This process implies which interpretation of probability?
a. Bayesian/subjective
7. If events A and B are not mutually exclusive, their combined probability (A happening or B
happening) is given by the
a. Probability of A plus the probability of B minus the probability of A and B
8. In a binary world with only two possible and mutually exclusive outcomes A and B, the
probability of B can be arrived at by calculating
a. p(B)
b. 1-p(A)
c. p(~A)
9. Addition rule = union
Multiplication rule = intersection
Lack of independence = conditional probability
10. An event A is independent of an event B if
a. The probability of A equals the probability of A given B
b. The probability of the intersection of A and B is equal to the probability of A
times the probability of B
11. The probability that someone has blonde hair is 0.2. The probability that someone has an
IQ over 115 is 0.15. Assuming that IQ and hair color are independent, what is the
probability that someone picked at random is both blond and has an IQ over 115?
a. .15*.2 = 0.03
12. At age 60, there is an equal number of men and women in the population. Schizophrenia
is independent of gender* and affects 2% of the population. What is the probability that a
randomly picked person from the population is a 60 year old man with schizophrenia?
a. .5*0.02 = 0.01
13. The probability that first-year graduate students complete a Ph.D. program with a doctorate
within 10 years is 0.48. This implies that XXX fails to do so.
a. 0.52
14. If there are two candidates running in a political primary and they receive an equal number of
votes in a county, the outcome is sometimes decided by a coin toss. What is the chance that one
candidate would win all coin tosses if there are a total of 6 of them - assuming that the coins are
fair?
a. 1/2^6 = 1 in 64 (about 1.6%)
15. The probability that someone is smart is 0.25. The probability that someone is not smart is
a. 0.75
16. 25% of addicts are using both cocaine and heroin. 70% of addicts use cocaine. What is the
probability that an addict is using heroin if they are using cocaine?
a. .25/.7 = 0.357
17. The probability that someone is drunk at a party is 0.7. The probability that someone is getting
into a fight at a party is 0.1. The probability that someone is drunk and getting into a fight is 0.08.
What is the probability that someone is getting into a fight if they are drunk?
a. .08/.7 = 0.114
18. You want to do a psychological experiment. On any given day, there is a 30% chance that your
participant won’t show up. In addition, there is a 10% chance that the stimulus presentation
software won’t work. There is also a 5% chance that there is a problem with saving the data and a
3% chance that there are other problems like experimenter error. For the sake of simplicity, you
can assume that all these events are independent. What is the probability that you will be able to
do a successful experiment (be able to record useful data) on any given day?
a. .7*.9*.95*.97 = 0.58
19. The probability of being born male is 0.52. The probability of being born female is 0.48. Assume
that someone close to you is pregnant. What is the probability that the child will be either male or
female?
a. 1
20. The probability that someone is drunk at a party is 0.7. The probability that someone is getting
into a fight at a party is 0.1. The probability that someone is drunk and getting into a fight is 0.08.
What is the probability that someone was drunk if they got into a fight?
a. 0.08/.1 = 0.8
21. The probability of being depressed is independent of having dark hair. The probability of
being depressed is 0.1. The probability of having dark hair is 0.5. The probability that
someone has dark hair and is depressed is
a. 0.5*0.1 = 0.05
22. The probability that someone's last name starts with a vowel is 0.4. The probability that
someone's last name starts with the letter "S" is 0.1. What is the probability that someone's last
name either starts with a vowel or the letter S?
a. 0.4+0.1 = 0.5
23. From empirical studies, we know that the lifetime prevalence for depression is 10%. We also
know that the probability of someone being both female and depressed at the same time is 0.07.
What is the probability that someone who is depressed is female?
a. 0.07/.1 = 0.7
24. 40% of prisoners have tattoos and get into fights with other inmates. 50% of prisoners get into
fights with other inmates. What is the probability that a prisoner that is getting into a fight has a
tattoo?
a. .4/.5 = 0.8
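The conditional-probability and independence answers above can be reproduced with a few lines of Python. This is a minimal sketch: the function names `conditional` and `all_independent` are illustrative and not part of the quiz, and the second function is only valid under the independence assumption stated in Question 18.

```python
def conditional(p_a_and_b: float, p_b: float) -> float:
    """P(A | B) = P(A and B) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b


def all_independent(*probs: float) -> float:
    """P(all events occur), assuming the events are independent."""
    result = 1.0
    for p in probs:
        result *= p
    return result


# Question 16: heroin given cocaine
print(round(conditional(0.25, 0.70), 3))
# Question 17: fight given drunk
print(round(conditional(0.08, 0.70), 3))
# Question 20: drunk given fight (same joint probability, different condition)
print(round(conditional(0.08, 0.10), 3))
# Question 18: every step of the experiment works
print(round(all_independent(0.7, 0.9, 0.95, 0.97), 2))
```

Questions 17 and 20 use the same joint probability but divide by different marginals, which is why P(fight | drunk) and P(drunk | fight) differ so much.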
QUIZ 3
1. The mean absolute deviation (MAD)
a. Is a more robust measure of dispersion than the standard deviation
2. The principal problem of the range as a measure of dispersion is that it
a. Is extremely sensitive to outliers
3. You measure IQ in a group of 200 people with severe learning disorders. In general, this sample
is described well by a mean of 75 and a standard deviation of 15. However, one person who
absent-mindedly wandered in from the Mensa convention next door was also tested and measured
at an IQ of 142. The measure of central tendency that is most affected by this is likely the
a. Mean
4. Which of these plots is most consistent with the variable on the x-axis
being independent of the variable on the y-axis?
a. A and D
5. Conceptually, a correlation is
a. The covariance normalized by the product of the standard
deviations
6. The two operations that underlie the calculation of the mean and the
median, respectively are
QUIZ 4
1. Nominal scale = counting
Ordinal scale = ordering
Interval scale = adding
2. Interval scales allow us to meaningfully interpret measures in all ways except for
a. Ratios between two measures
3. Given measurements on a nominal scale, numbers should be interpreted as
a. Labels
b. Categories
4. Taking note of someone’s gender is a good example of taking measures on a
a. Nominal scale
5. Given measurements on an ordinal scale, we can
a. Meaningfully interpret the relative magnitude of two numbers
6. Reaction times are a good example of measures used in psychology that can be considered to be
on a
a. Ratio scale
7. What generative process - in nature - yields normal distributions
a. The random combination of many independent factors
8. Given measurements on a ratio scale, we can meaningfully interpret
a. Distance between two numbers
b. The ratio between two numbers
c. The relative magnitude of two numbers
d. Assume the scale has an absolute zero
9. Standardized tests of achievement like the SAT are a good example of measures on a
a. Interval scale
10. Given measures on an ordinal scale, it is meaningful to interpret the
a. Mode
b. Median
11. You roll 10 dice and note the sum of this first roll. You keep the results of 3 dice, but roll the other
7 again, then noting the sum (of all 10 dice) of this second roll. You do this (throwing the dice in
this way) 100 times. The expected correlation between the sum of the first and the sum of the
second rolls is
a. 3/10 = 0.3
12. Asking someone how much pain they are in, on a scale from 0 to 10 is a good example of a
a. Ordinal scale
13. A Likert scale is a good example of measures on a
a. Ordinal scale
14. Why does correlation not imply causality
a. Correlation is just a statistical relation
b. The relation between two variables in a correlation is bidirectional
c. There are other variables that could mediate the observed correlation
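The 3/10 answer to Question 11 can be checked by simulation. This is a minimal sketch, assuming fair six-sided dice and that the same 3 dice are kept on every trial. The function names, the fixed seed and the 20,000 repetitions (instead of the 100 in the question, to reduce sampling noise) are illustrative choices, not part of the quiz.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation computed from its definition."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(trials=20000, n_dice=10, kept=3, seed=0):
    """Correlation between the first sum and the sum after re-rolling."""
    rng = random.Random(seed)
    first_sums, second_sums = [], []
    for _ in range(trials):
        first = [rng.randint(1, 6) for _ in range(n_dice)]
        # Keep `kept` dice unchanged and re-roll the remaining ones.
        rerolled = [rng.randint(1, 6) for _ in range(n_dice - kept)]
        second = first[:kept] + rerolled
        first_sums.append(sum(first))
        second_sums.append(sum(second))
    return pearson_r(first_sums, second_sums)


print(round(simulate(), 2))
```

The 3 shared dice account for 3/10 of the variance of each sum, so the estimate should land close to 0.3.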
QUIZ 5
1. What statement about instruments of measurement is accurate
a. If the instrument is not reliable, it can’t be valid
2. Both reliability and validity are - at their core - based on
a. Correlations
3. Why is the MBTI so unreliable
a. Because the underlying distribution of scores in each dimension is unimodal, not bimodal
b. Because there are 4 independent dimensions and we categorize each one
c. Because the scores of most people fall close to the decision boundary in each dimension
4. What does simple linear regression minimize
a. The sum of the squared differences between predicted values and measurements
5. Regression to the mean is likely to be a problem in
a. Cases with a low correlation between predictor and outcome
b. Repeated measures designs with relatively low reliability
6. A measurement is considered to have no criterion validity if
a. It doesn’t predict anything in the real world
7. A measurement is considered to be not objective if
a. It systematically depends on who makes it
8. A measurement is considered to be not reliable if
a. There is no consistency between repeated measures
9. Which statement about instruments of measurements is most accurate
a. If an instrument is not objective it cannot be reliable or valid
10. What is the fundamental problem with operational definitions of a construct
a. There is always a degree of arbitrariness and it is unclear whether the operational
definition captures the construct fully
b. There are always other operational definitions possible
11. Showing that your experimental result replicates (holds) in a population you haven’t studied yet is
a good example of
a. External validity
12. You want to measure someone’s aggressiveness how?
a. Ask the person how aggressive they are on scale from 1-5
b. Ask someone who knows the person how aggressive they are on scale from 1-5
c. Review criminal record to determine how many prior convictions for violent crimes they
have
QUIZ 6
1. The probability is 0.05 what are the odds
a. 0.05/(1-0.05) = 1:19
2. A natural experiment is most suitable when
a. One would like to do an experiment, but cannot do so, for technical or ethical reasons and
one would like to know causality
3. The logit function is defined as
a. The log of the odds
4. Simple linear regression = line
Multiple linear regression = plane
Logistic regression = “S”
5. If SSexplained is zero you might as well guess
a. The mean of the outcome (measurements)
6. The best way to control for confounds is to do
a. An experiment
7. Probability is 0.5 what are the odds
a. 0.5/(1-0.5) = 1:1
8. Which statement about experiments is most accurate
a. Participants are randomly assigned to conditions
b. Experiments potentially allow establishing causality (in contrast to observational studies)
c. All potential confounds are controlled by randomization
9. In the general linear model, “beta” refers to
a. The weights of the predictors
10. Probability is 0.8 what are the odds?
a. 0.8/(1-0.8) = 4:1
11. Partial correlation allows to
a. Control for confounds by correlating the residuals
b. Bivariate correlation between X and Y:rXY
c. Partial correlation between X and Y while controlling for Z: rXY.Z
d. Correlating residuals of the regression between X and Z and between Y and Z
12. Which of these methods is associated with the goal of controlling for confounds
a. Experiments
b. Natural experiments
c. Multiple regression
d. Partial correlation
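The odds conversions in Questions 1, 7 and 10 and the logit from Question 3 can be sketched as follows. The helper names `odds`, `odds_as_ratio` and `logit` are illustrative; `Fraction` is used only so the ratios print exactly.

```python
import math
from fractions import Fraction


def odds(p: float) -> float:
    """Odds in favor of an event: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)


def odds_as_ratio(p: str) -> str:
    """Exact odds as 'a:b' from a decimal string such as '0.05'."""
    f = Fraction(p)
    o = f / (1 - f)
    return f"{o.numerator}:{o.denominator}"


def logit(p: float) -> float:
    """Logit = natural log of the odds (defined for 0 < p < 1)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(odds(p))


for p in ("0.05", "0.5", "0.8"):
    print(p, "->", odds_as_ratio(p))
print(logit(0.5))
```

A probability of 0.5 gives even odds and therefore a logit of 0, which is the midpoint of the "S" curve in logistic regression.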
A researcher wants to perform a simple linear regression to find out if socio-economic status of a teacher
can predict whether they work at a primary or secondary school. Why can’t this be done?
→ because outcome variable is dichotomous
Which is the most robust measure of dispersion
→ mean absolute deviation
Difference between Pearson’s and Spearman’s correlation
→ Spearman’s correlation does not require the relationship to be linear
What central tendency measure is most robust
→ median
Which of these is essentially a measure of objectivity
→ inter-rater reliability
Regression to the mean is caused by
→ random measurement error
Occurs when unusually small/large measurement values are followed by
values that are closer to the population mean
Only avoided if measurements are perfect and completely determined by predictor
The regression line is drawn based on
→ minimizing the sum of squared residuals
Error types
→ systematic error
Inaccuracies that are reproducible; often due to a problem that persists throughout the experiment
Fix: difficult to detect but can spot and correct through lots of care
→ random error
Random statistical fluctuations in the measured data due to the precision limitations of the
measurement device
Fix: collect more data → can be evaluated through statistical analysis and can be reduced by
averaging over large number of observations
QUIZ 7
1. You calculate the probability of obtaining the observed difference in mean BDI scores is 0.06
with 30 participants
a. Do a new study with a larger number of participants to discern whether the drug has a
modest effect that was unlikely to be detected given small sample size
2. Assuming a large sample size (n>100) and assuming random and independent sampling, the
distribution of sample means
a. Approaches a normal distribution
3. All of these are reasonable synonyms for statistically significant
a. Unlikely, surprising, unexpected, rare
4. You do an online survey and 1 in 7 people respond (350 sent to, 50 responses)
a. Response rate is so low that one might wonder whether the sample of people who
responded to the survey is representative of the population
5. A patient with anxiety believes that if she drinks vodka when she gets constipated it will relieve
her constipation does this mean that vodka is effective in relieving constipation?
a. No, regression to the mean is a real possibility here
b. Not necessarily, as the baseline was not considered – the patient might be
disremembering as she pays more attention to this issue on days when she drinks
6. You want to develop a new drug that improves creativity. Despite your best efforts, you cannot
show that there is a significant difference between the creativity scores of people who do vs. don't
take the drug. In reality, the drug is actually effective in improving creativity. You
a. Committed a type II error
7. You do a study on whether an antidepressant drug is effective and show that there is a significant
difference between people who do and who don't get the drugs in terms of their depressive
symptoms. In reality, the drug does not work. You
a. Committed a type I error
8. You want to develop a drug to increase IQ. So far, you have created 4 candidate substances - A,
B, C and D. They all shifted the group IQ mean (tested on 30 volunteers each) away from the
population mean. You calculated the following parameters A: z-score of 2.5 B: Group mean =
115, SD = 15 C: Group mean = 130, SEM = 2 D: Group mean = 130, SEM = 4 Which of the 4
outcomes is most unlikely - and thus most promising to increase IQ?
a. Substance C
9. As you increase sample size, the standard error of the means (SEM)
a. Decreases as a function of the square root of the sample size
10. You randomly sample 4 attendants of a power posing seminar. You note that 3 of the 4 exhibit
higher levels of confidence than the median of the general population. The probability of
obtaining this result by chance (if those attending power posing seminars do not differ in their
levels of confidence from the general population) is
a. 4*(0.5)^4 = 0.25
b. Number of ways to get 3 above the median * (1/2)^(sample size)
c. xxxo | xxox | xoxx | oxxx → 4 possibilities
11. You randomly sample 5 inmates in a maximum security prison. You note that all 5 exhibit levels
of aggression that are higher than the median of the general population. The probability of
obtaining this result by chance (if inmates in maximum security prisons do not differ in their
levels of aggression from the general population) is
a. 1*(0.5)^5 = 0.03125
b. xxxxx → only one possibility
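The counting argument in Questions 10 and 11 is the binomial formula with p = 0.5. This is a minimal sketch; `prob_exactly` and `prob_at_least` are illustrative names, and p = 0.5 follows from comparing each person against the population median under the null hypothesis.

```python
from math import comb


def prob_exactly(k: int, n: int, p: float = 0.5) -> float:
    """P(exactly k of n observations fall above the median)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """P(k or more of n observations fall above the median)."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


print(prob_exactly(3, 4))   # Question 10: 4 arrangements * (1/2)^4
print(prob_exactly(5, 5))   # Question 11: 1 arrangement * (1/2)^5
print(prob_at_least(3, 4))  # "3 or more" adds the xxxx case
```

The quiz answers count exactly k successes. A one-sided significance test would normally use the "at least" version, which for Question 10 adds the single all-four case.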
QUIZ 8
1. An independent samples t-test with two samples and 41 observations in first sample and 39 in the
second sample have how many degrees of freedom
a. 78
2. What is true about degrees of freedom
a. They denote the number of values in a calculation that are free to vary
b. They need to be adjusted when estimating a sample parameter from the sample itself and
when using other parameters to do so that were also estimated from the sample
c. To avoid “double-dipping” we have to reduce the degrees of freedom for each parameter
we calculate from the sample itself
3. Standard error of the sample mean is derived by taking the standard deviation of the sample and
dividing it by
a. Square root of n
4. You want to know whether people with beards are perceived to be more trustworthy. As a
student researcher, you only have funds to recruit 10 models. The most powerful design to reveal
that there is a difference is to
a. Randomly pick 10 bearded models, ask people to rate them before and after the beard
was shaved off then do a t-test for correlated groups.
5. The t-test is designed to establish differences between sample means for these situations
a. Unknown population variation
b. Known population variation
6. 100 observations in sample and calculate 4 parameters how many dof
a. 96
QUIZ 9
1. You decide to study the effects of smoking, drinking and partying on life satisfaction. To do so,
you assign people randomly to one of two smoking conditions (smoking or not), one of three
drinking conditions (no alcohol, 1 drink per day, several drinks per day) and partying conditions
(no partying, 1 hour of partying per day, 2 hours of partying per day). This design has
a. 3 factors, 18 total conditions
i. Smoking (2) x drinking (3) x partying (3) = 18
2. Idea behind ANOVA is
a. Analyze variance by dividing it in variance between and within groups
3. Only main effects causes factor lines to be
a. Parallel
4. Only interaction and no main effects causes factor lines to be
a. Crossing
5. Number of t-tests if you wanted to test all possible mean differences between 16 groups
a. n*(n-1) / 2 = (16*15) / 2 = 120
6. Advantages and concerns associated with Bonferroni correction
a. Brings the overall alpha level back to the intended level, but increases the risk of missing
real effects for any given comparison
7. F-distribution has this many parameters
a. 2
8. 5 way 5x5x2x2x2 ANOVA assuming there is no effect of any of the treatments, if you compared
all groups with t-tests at alpha 0.05 you expect how many false positives
a. (5x5x2x2x2) = 200 conditions
b. (200*199) / 2 = 19900 pairwise comparisons
c. 19900 * 0.05 = 995 ~ 1000 type 1 errors
d. 2^5 = 32 terms
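The arithmetic behind Questions 5 and 8 can be sketched in Python. The function names are illustrative, and the expected count assumes every null hypothesis is true and every test is run at the same alpha.

```python
from math import comb, prod


def pairwise_tests(groups: int) -> int:
    """Number of distinct pairs among `groups` group means."""
    return comb(groups, 2)


def expected_false_positives(groups: int, alpha: float = 0.05) -> float:
    """Expected type I errors if all pairwise nulls are true."""
    return pairwise_tests(groups) * alpha


def bonferroni_alpha(groups: int, alpha: float = 0.05) -> float:
    """Per-comparison alpha that keeps the overall level near `alpha`."""
    return alpha / pairwise_tests(groups)


print(pairwise_tests(16))                 # Question 5
cells = prod([5, 5, 2, 2, 2])             # conditions in the 5-way design
print(cells, pairwise_tests(cells))       # Question 8, steps a and b
print(expected_false_positives(cells))    # Question 8, step c
print(2 ** 5)                             # terms in the full 5-factor model
print(bonferroni_alpha(16))
```

The last line illustrates the trade-off from Question 6: with 120 comparisons the per-test threshold becomes very strict, which protects the overall alpha but makes real effects harder to detect.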
QUIZ 10
1. What is true about eta squared
a. It is a measure that is analogous to R squared in correlation and regression
b. It can be calculated by dividing the product of F times the between degrees of freedom by
the product of F times the between degrees of freedom plus the within degrees of freedom
c. It can be calculated by dividing sum of squares between by sum of squares total
d. It is a measure of variance explained by ANOVA
2. Plausible way to write 2x3x4 ANOVA
a. Score = Grand mean + Effect Factor 1 + Effect Factor 2 + Effect Factor 3 + Interaction
between Factors 1&2 + Interaction between Factors 1&3 + Interaction between Factors
2&3 + Interaction between Factors 1&2&3 + Error
3. Total number of all possible interaction effects in full general linear model for 5x5x5x5 ANOVA
a. 4 factors, so 2^4 = 16 terms
b. 5 non-interaction terms (intercept, main effect A, main effect B, main effect C, main effect D)
c. So 16 - 5 = 11 interaction effects
4. How many 4-way interaction effects in 2x2x2x2x2 ANOVA
a. 5
b. ABCDx | ABCxE | ABxDE | AxCDE | xBCDE
5. ANOVA values: what proportion of variance can be accounted for by model?
Corrected Model - SS: 1205
Intercept - SS: 3500
Error - SS: 4900
Corrected Total - SS: 6105
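No answer is recorded for Question 5. One way to work it out, assuming the definition from Question 1c (eta squared = sum of squares between / sum of squares total) and reading "Corrected Model" as the between term and "Corrected Total" as the total, is sketched below. The variable names are illustrative.

```python
def eta_squared(ss_model: float, ss_total: float) -> float:
    """Proportion of variance accounted for by the model."""
    return ss_model / ss_total


ss_model = 1205   # Corrected Model
ss_error = 4900   # Error
ss_total = 6105   # Corrected Total

# Sanity check: model and error sums of squares add up to the corrected total.
assert ss_model + ss_error == ss_total

print(round(eta_squared(ss_model, ss_total), 4))
```

The intercept sum of squares does not enter this ratio, because the corrected total already excludes it.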
QUIZ 11
1. Chi-squared test
a. Nonparametric significance test
b. Assumes data to be at least nominal
c. Compares expected and observed frequencies
2. As number of dof increases chi-squared distribution
a. Peak of probability distribution shrinks and moves to the right as the distribution
broadens
3. Why is it not a good idea to run a t-test if one has ordinal data that is not normally distributed
a. t-tests compare means and the mean is not meaningful on the level of ordinal data
b. mean is only good representation of the distribution if it is normal
4. Mann-Whitney U (Wilcoxon rank-sum) test
a. Assumes data to be at least ordinal
b. Compares medians between two samples
c. Basic idea is to arrange all values from both distributions in order, then assign ranks and
calculate the sum of ranks from one sample, then compare it with the ranksum of the
other sample
5. You wonder if weather (rain, cloudy or sunny) makes a difference on mood (happy or not). To
determine this, you sample the moods of 200 people on a sunny day, 200 people on a rainy day
and 200 people on a cloudy day and record whether they were happy or not. You then want to do
a chi square test on this data. What are the degrees of freedom?
a. 2: (3 types of weather - 1) * (2 moods - 1) = 2
QUIZ 12
1. You want to show that there is an effect of positive thinking on task completion. You compare
two groups - one that engages in positive thinking and one that doesn't. You keep adding
participants one by one until the t-test you do at every step yields a significant result. This is
known as
a. P-hacking
2. Recording data of many dependent variables, testing all possible relationships and then reporting
only those that are significant is known as
a. Researcher degrees of freedom
3. Good measure to combat publication bias
a. Pre-registration
4. What is effect size
a. Mean difference divided by the pooled standard deviation
5. Why are significant results easier to publish?
a. Much easier to interpret
b. Much less common - if one compares everything with everything, most things won't be
meaningfully related
c. Inability to find something might reflect low power on behalf of the researchers
d. Inability to find something might reflect inadequate operationalization of IV or DV, not
anything about reality.
6. You are interested in whether NYU students prefer Fresh&Co or Pret A Manger, so you give 100
NYU students a choice between a gift card to either place and record their choices. You then
conduct a chi square test:
Sum of (observed-expected)^2/expected
((18-50)^2/50) + ((82-50)^2/50) = 40.96
n = 100 participants; k = 2 groups
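The chi-square statistic above can be reproduced directly from its definition. This is a minimal sketch, assuming the null hypothesis of no preference (50 expected choices per place); the function name is illustrative, and degrees of freedom are taken as k - 1 for a goodness-of-fit test.

```python
def chi_square_gof(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [18, 82]   # choices recorded for the two places
expected = [50, 50]   # equal split of n = 100 under the null
stat = chi_square_gof(observed, expected)
df = len(observed) - 1

print(round(stat, 2), df)
```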
published papers on this effect and make a histogram of the p-values reported in the literature.
The following distribution of p-values would be most indicative of p-hacking and publication bias
a. A bimodal distribution with one of the peaks just below p = 0.05
11. You compare the number of depressive episodes in a control group vs. a group of participants on
antidepressants. Both groups have the same number of participants. The mean number of
depressive episodes in group 1 is 10. The mean number of depressive episodes in group 2 is 5.
The standard deviation of group 1 is 3. The standard deviation of group 2 is 2. The effect size is
a. (Mean1 - Mean2) / √((SD1^2 + SD2^2) / 2)
b. (10-5) / √((3^2 + 2^2) / 2) = 1.9611
12. You compare the IQ on people on NZT with people not on NZT. Both groups have the same
number of participants. The mean IQ in the NZT group is 125, the mean IQ in the control group is
100. The standard deviation of both groups is 15. The effect size is
a. (125-100) / √((15^2 + 15^2) / 2) = 1.666
13. A study investigated the effects on dreaming of ingesting 240 mg vitamin B6 (pyridoxine
hydrochloride) before bed for five consecutive days. The B6 group reported 5 items of dream
recall on average whereas the placebo-control group reports 3 items of dream recall on average.
The pooled standard deviation is 4 items. The effect size is
a. (Mean1 - Mean2) / pooled SD
b. (5-3) / 4 = 0.5
14. An experimental study investigated the effect of mental imagery training on a ball throwing task.
In the mental imagery group, participants threw the ball 50 yards on average, with a standard
deviation of 50. In the control group, participants threw the ball 45 yards on average, with a
standard error of 5. Each group consists of 100 participants. The effect size is
a. (50-45) / √((50^2 + 45^2) / 2) = 0.105117
b. (50-45) / √((50^2 + 55^2) / 2) = 0.09513
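The effect sizes in Questions 11 to 14 all divide a mean difference by a pooled standard deviation. This is a minimal sketch for equal group sizes; the function names are illustrative. The `sd_from_sem` helper assumes SEM = SD / √n, which is one reading of the standard error given in Question 14 and yields a different value from the figures recorded for that question.

```python
from math import sqrt


def pooled_sd(sd1: float, sd2: float) -> float:
    """Pooled SD for two groups of equal size."""
    return sqrt((sd1**2 + sd2**2) / 2)


def cohens_d(mean1: float, mean2: float, sd1: float, sd2: float) -> float:
    """Mean difference divided by the pooled standard deviation."""
    return (mean1 - mean2) / pooled_sd(sd1, sd2)


def sd_from_sem(sem: float, n: int) -> float:
    """Recover a standard deviation from a standard error (SEM = SD / sqrt(n))."""
    return sem * sqrt(n)


print(round(cohens_d(10, 5, 3, 2), 4))        # Question 11
print(round(cohens_d(125, 100, 15, 15), 4))   # Question 12
print(cohens_d(5, 3, 4, 4))                   # Question 13 (pooled SD given as 4)

# Question 14 under the SEM assumption: control SD = 5 * sqrt(100)
control_sd = sd_from_sem(5, 100)
print(control_sd, round(cohens_d(50, 45, 50, control_sd), 4))
```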
QUIZ 13
1. In principle, this would make a null-result much more easy to publish as it is more informative
a. Higher power
2. Alpha 0.05, R is 0.1, power is 0.95 → what is probability effect reported in study is real
a. (power*R) / ((power*R)+alpha)
b. (0.95*0.1) / ((0.95*0.1)+0.05) = 0.65517
3. Alpha 0.05, R is 0.7, power is 0.8 → probability effect reported in study is real
a. (0.8*0.7) / ((0.8*0.7)+0.05) = 0.91803
4. Alpha 0.05, R is 0.35, power is 0.1→ what is probability effect reported in study is real
a. (0.1*0.35) / ((0.1*0.35)+0.05) = 0.41176
5. Alpha 0.05, R is 0.35, power is 0.9→ what is probability effect reported in study is real
a. (0.9*0.35) / ((0.9*0.35)+0.05) = 0.86301
6. Increasing power usually achieved by
a. Increasing sample size
7. Fundamental relationship between confidence interval and statistical significance
a. The confidence interval is determined by the significance level alpha, so if something is
significant at a certain level, the confidence interval will have a certain width
8. Fundamental relationship between power and confidence intervals
a. Higher power = narrower confidence interval
9. Sample mean 100, SD 10, crit z-value at 95% is 1.96, 20 people in sample
a. Upper bound: sample mean + (z*(SD/√n))
i. 100 + (1.96*(10/√20)) = 104.382
b. Lower bound: sample mean - (z*(SD/√n))
i. 100 - (1.96*(10/√20)) = 95.617
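The formulas used in Questions 2 to 5 and Question 9 can be sketched as two small functions. The names `ppv` and `z_interval` are illustrative. R is taken exactly as it appears in the notes; reading it as the prior odds that a tested effect is real is an assumption, since the notes do not define it.

```python
from math import sqrt


def ppv(alpha: float, r: float, power: float) -> float:
    """Probability that a reported effect is real: power*R / (power*R + alpha)."""
    return (power * r) / (power * r + alpha)


def z_interval(mean: float, sd: float, n: int, z: float = 1.96):
    """Confidence interval: mean +/- z * SD / sqrt(n)."""
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


# Questions 2 to 5: (alpha, R, power)
for alpha, r, power in [(0.05, 0.1, 0.95), (0.05, 0.7, 0.8),
                        (0.05, 0.35, 0.1), (0.05, 0.35, 0.9)]:
    print(round(ppv(alpha, r, power), 5))

# Question 9: mean 100, SD 10, n = 20, z = 1.96
lower, upper = z_interval(100, 10, 20)
print(round(lower, 3), round(upper, 3))
```

Comparing Questions 4 and 5 shows the point of Question 1: with R fixed at 0.35, raising power from 0.1 to 0.9 roughly doubles the chance that a significant result reflects a real effect.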
QUIZ 14
1. When would it be better to use a permutation test instead of a t-test?
a. When distribution of mean differences is not normal
b. When distribution of mean differences is not known
2. Could one use a correlation matrix to do an MDS?
a. Yes, but only if there is no negative correlation present, as distances can’t be negative
3. You do a PCA on 256 factors, 1st-7th eigenvalues are 70, 40, 10, 7, 5 and 3… remaining
eigenvalues are under 1 → how many should you consider as “real”
a. 7
4. Bootstrapping = drawing with replacement
Correlations = PCA
Permutation test = drawing without replacement
Distances = MDS
5. Permutation tests give a better estimate of true p value except when
a. Assumptions about the underlying population distribution are met
6. Central assumption of resampling methods
a. The sample we are resampling from is representative of the population it came from
7. Primary concern about using resampling methods on small samples
a. Rare events will necessarily be under/overrepresented (i.e. be estimated as more or less
common than they actually are)
8. When doing MDS what is interpretable
a. Correlations, distance between individual items, distances on resulting map and ordinal
scaled data
b. Orientation not interpretable
QUIZ 15
1. What does Bayes’ theorem allow you to do?
a. The inversion of conditional probabilities
2. What are the primary applications of Bayes theorem?
a. Updating of beliefs
b. Optimal integration of prior and new information
3. An event A is independent of an event B if
a. The probability of A = probability of A given B
b. The probability of the intersection of A and B is equal to the probability of A times the
probability of B
4. The probability of the intersection of A and B is equal to
a. Unable to be determined from this information (must know if A is independent from B)
5. If events A and B are mutually exclusive, the probability of their union is equal to
an idiot. What is the posterior probability that the class is going to suck (and ought to be
dropped)?
a. Use posterior probability program = 0.88
b. Probability A = 0.25
c. Probability B given A = 0.7
d. Probability B = 0.2
19. You are part of a profiling unit for serious crimes. The probability that a serial killer lives in the
neighborhood is 1 in a million. The probability that there are multiple murders in the
neighborhood if there is a serial killer is 0.99. The lifetime probability of there being one murder
in the neighborhood is one in a thousand and the probability of three murders in a neighborhood
within a year is 1 in a million. What is the probability that there is a serial killer on the loose if
there were three murders within a year in the neighborhood?
a. Use posterior probability program = 0.99
b. Probability A = 1/1,000,000
c. Probability B given A = 0.99
d. Probability B = 1/1,000,000
20. The probability that someone is cheating on their spouse is 0.3. The probability that someone
cheats on their exam if they are cheating on their spouse is 0.9. The probability that someone
cheats on their exam if they are not cheating on their spouse is 0.1. The probability that someone
is cheating on their spouse if they are cheating on the exam is
a. Use Bayes' theorem program = 0.79411
b. Prior probability = 0.3
c. True positive = 0.9
d. False positive = 0.1
21. You are getting married tomorrow, at an outdoor ceremony in the arid steppe. In recent years, it
has rained there only 5 days out of each year. Unfortunately, the forecasters have predicted rain
for tomorrow. When it actually rains, they correctly forecast rain 80% of the time. When it doesn't
rain, they still forecasted rain 20% of the time. The probability that it will rain on your wedding
day is
a. Use Bayes' theorem program = 0.05263
b. Prior probability = 5/365
c. True positive = 0.8
d. False positive = 0.2
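The "posterior probability program" and "Bayes theorem program" mentioned in these answers are not shown in the notes. The sketch below is a stand-in that reproduces the listed results from the listed inputs; it is not the course's program, and both function names are illustrative.

```python
def posterior_from_marginal(prior: float, likelihood: float, marginal: float) -> float:
    """P(A | B) = P(B | A) * P(A) / P(B), with P(B) supplied directly."""
    return likelihood * prior / marginal


def posterior_from_rates(prior: float, true_pos: float, false_pos: float) -> float:
    """P(A | B), with P(B) built from the true and false positive rates."""
    evidence = true_pos * prior + false_pos * (1 - prior)
    return true_pos * prior / evidence


# Inputs "Probability A", "Probability B given A", "Probability B"
print(round(posterior_from_marginal(0.25, 0.7, 0.2), 2))
print(round(posterior_from_marginal(1 / 1_000_000, 0.99, 1 / 1_000_000), 2))

# Inputs "Prior probability", "True positive", "False positive"
print(round(posterior_from_rates(0.3, 0.9, 0.1), 5))        # Question 20
print(round(posterior_from_rates(5 / 365, 0.8, 0.2), 5))    # Question 21
```

In Question 21 the forecast is right 80% of the time, yet the posterior stays near 5% because rain is so rare there that most rain forecasts are false alarms.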
b. This measure is meaningless, as the positive and negative deviations from the sample
mean cancel out, due to how the sample mean is defined
5. Why does correlation not imply causality
a. Correlation is just a statistical relation, the relation between two variables in a correlation
is bidirectional and there are other variables that could mediate the observed correlation
6. A measurement is considered to have no criterion validity if
a. It doesn’t predict anything in the real world
7. The t-distribution is similar to normal distribution but at smaller n’s it has
a. Fatter tails
8. Basic idea behind an ANOVA is
a. Analyze variance by dividing it in variance between and within groups
9. Basic idea behind Mann-Whitney U test
a. Arrange all values from both distributions in order, then assign ranks and calculate the
sum of ranks from one sample
10. Given measurement on ordinal scale we can
a. Meaningfully interpret relative magnitude of two numbers