4310 Exam 2

1. Validity: the extent to which inferences made from a test are appropriate, meaningful, and useful
a. Does the test measure what it is supposed to measure
b. We actually validate the use of an instrument, not the instrument itself
c. Valid=appropriate, meaningful, useful, for the purpose, respondents, circumstances
i. E.g., IQ tests for driver’s license? School placement?
d. Types of Validity
i. Content validity
1. Property of a test such that the test items sample the universe of items for
which the test is designed
2. How to establish
a. Content Expert
b. Do items represent all possible items?
c. How well do the number of items reflect what was taught?
3. How representative is the measure of the entire domain of content that
it is supposed to measure?
4. Usually used in the context of achievement tests for certification and
5. Examples: Driving test, First Aid certification, college quizzes and exams,
ii. Criterion validity
1. Criterion: any variable one wants to “predict” by measuring another
2. E.g., SATs predict college GPA, typing tests predict clerical “competence”
3. How is it established?
a. Concurrent validity
i. How well does my test correlate with the outcomes of
a similar test right now?
b. Predictive validity
i. How well does my test predict performance on a similar
measure in the future?
c. Relationships between an instrument and independent events
such that the test either “predicts” or correlates with the event.
d. “events” can occur before, during, or after the measurement.
i. Neuropsychological tests “predict” brain impairment,
confirmed by MRI
iii. Construct validity
1. Construct: an abstract concept made up of interrelated variables
a. E.g., honesty, intelligence, depression
2. The results of measurement follow from the theory/hypothesis
3. The results correlate with other, related measures
4. How do we determine construct validity?
a. Review the literature on the theoretical construct you are trying
to measure
b. Administer your measure and other measures of the same
construct, plus measures of other unrelated constructs to a
sample of subjects.
c. Compare the correlations between your measure and the other measures
d. Related constructs should have strong correlations. Unrelated
constructs should have weak correlations
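The comparison of correlations described above can be sketched in a few lines of Python. This is only an illustration: the scores, the variable names, and the choice of shoe size as the unrelated construct are all invented for the example and do not come from a real study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for six subjects (made up for illustration only)
new_depression_scale = [10, 14, 18, 22, 25, 30]
established_depression_scale = [12, 15, 17, 24, 26, 29]  # same construct
shoe_size = [9, 7, 10, 8, 9, 8]                           # unrelated construct

convergent_r = pearson_r(new_depression_scale, established_depression_scale)
discriminant_r = pearson_r(new_depression_scale, shoe_size)

print("Related construct r:  ", round(convergent_r, 2))
print("Unrelated construct r:", round(discriminant_r, 2))
```

With these made-up numbers the related measure correlates strongly and the unrelated one weakly, which is the pattern that would support construct validity.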
e. Reliability
i. The extent to which a measurement is free of error
ii. How close the observation is to the true score
f. Validity
i. The extent to which a measurement scale does what it says it does
ii. How meaningful, appropriate and useful is the measurement applied?
1. Examples:
a. Are heart rate monitors a valid instrument for measuring
physical activity?
b. Is the army’s new physical fitness test valid?
g. Validity and Reliability
i. Two different, but related concepts
1. A test can be reliable and not valid
2. A test cannot be valid until it is reliable because…
a. A test cannot do what it is supposed to do (validity) until it does
what it is supposed to do consistently (reliability)
h. Take home messages
i. Science is only as good as its measures
ii. Scale development is hard; so use someone else’s reliable, valid measures to the
extent possible
iii. How can you measure your variables? Outcome? What can you compare it to?
iv. Error is bad, Variance is good
v. Use interval/ratio measures when possible
2. Z-score
a. The number of standard deviations that a given value x is above or below the mean
b. Z-score
i. Commonly used standard score
ii. Allows comparison and interpretation of virtually any distribution
iii. Can be calculated from interval and ratio scores only
iv. Indicates how many standard deviations a score is above or below the mean
v. Communicates a score’s relative location in a distribution
c. Interpreting Z scores
i. Whenever a value is less than the mean, its corresponding z score is negative
ii. Ordinary values: z score between -2 and 2
iii. Unusual values: z score < -2 or z score > 2
d. Normal Distribution
i. Famous bell curve
ii. A very well defined distribution that is common in nature
iii. 95% of data in a normal distribution are in the interval -1.96 < z < 1.96

v. Z score Formulas
vi. Sample: $z = \frac{x - \bar{x}}{s}$
vii. Population: $z = \frac{x - \mu}{\sigma}$
1. Round z to 2 decimal places
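As a small worked example of the z score definition and the "ordinary vs. unusual" rule above, here is a minimal Python sketch. The function names and the example numbers (mean 100, standard deviation 15) are assumptions chosen for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above or below the mean,
    rounded to 2 decimal places as the notes recommend."""
    return round((x - mean) / sd, 2)

def describe(z):
    """Apply the rule of thumb: unusual if z < -2 or z > 2."""
    return "unusual" if z < -2 or z > 2 else "ordinary"

# Hypothetical scores on a scale with mean 100 and standard deviation 15
for score in (85, 100, 135):
    z = z_score(score, mean=100, sd=15)
    print(score, z, describe(z))
```

A score below the mean (85) gives a negative z, a score equal to the mean gives z = 0, and 135 is more than two standard deviations above the mean, so it is flagged as unusual.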
viii. Percentile and Quartile
1. Partition a set of sorted data according to relative number of values
2. Q1(First Quartile) separates the bottom 25% of sorted values from the
top 75%
3. Q2 (Second Quartile) same as the median; separates the bottom 50% of
sorted values from the top 50%
4. Q3 (Third quartile) separates the bottom 75% of sorted values from the
top 25%
5. Quartiles divide ranked scores into four equal parts
6. Percentiles partition data into 100 groups
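Quartiles and percentiles can be computed with Python's standard `statistics` module. The data below are invented for illustration. Note that textbooks and software use slightly different interpolation rules, so the values from `statistics.quantiles` (default "exclusive" method) may differ a little from a hand calculation done with another convention.

```python
from statistics import median, quantiles

# Hypothetical data set of 11 values (made up for illustration)
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18]

# n=4 returns the three cut points that divide the data into four parts
q1, q2, q3 = quantiles(data, n=4)
print("Q1:", q1, "Q2:", q2, "Q3:", q3)
print("Median:", median(data))  # Q2 is the same as the median

# n=100 returns the 99 cut points that divide the data into 100 groups
percentiles = quantiles(data, n=100)
print("Number of percentile cut points:", len(percentiles))
```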
ix. Exploratory Data Analysis (EDA)
1. The process of using statistical tools (such as graphs, measures of
center, and measures of variation) to investigate data sets in order to
understand their important characteristics
x. Outlier
1. A value that is located far away from almost all of the other values
2. An outlier can have a dramatic effect on the mean and standard deviation
3. An outlier can have a dramatic effect on the scale of the histogram so
that the true nature of the distribution is obscured
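To see how much a single outlier can move the mean and standard deviation, here is a short Python sketch. The numbers are made up for illustration; the value 60 plays the role of the outlier.

```python
from statistics import mean, stdev

# Hypothetical data (made up); 60 is far away from all the other values
clean = [4, 5, 5, 6, 6, 7, 7, 8]
with_outlier = clean + [60]

print("Mean without outlier:", mean(clean))
print("Mean with outlier:   ", mean(with_outlier))
print("SD without outlier:  ", round(stdev(clean), 2))
print("SD with outlier:     ", round(stdev(with_outlier), 2))
```

One extreme value doubles the mean here and inflates the sample standard deviation many times over, which is why outliers are worth checking for during exploratory data analysis.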
3. Probability
a. 0 ≤ p ≤ 1
b. Represents how likely a specific event is to occur
c. For random events: # of desired outcomes divided by total # of possible outcomes
i. The number of desired outcomes, divided by the total number of possible outcomes
ii. If you flip a fair coin and you want to get tails, you have one (1) outcome that
will satisfy that event and two (2) possible outcomes, so p = 0.5 or 50%
d. Probability (p) is used to state how confident we can be about the existence of a
particular statistical relationship
e. A smaller probability value means that we can be less confident that the observed
statistical relationship is real
i. Range from certain, likely, 50/50, unlikely, impossible
f. For any unknown event A, A’s probability can be expressed as: 0 ≤ P(A) ≤ 1
g. The probability of two independent events, A and B, both happening is the product of their probabilities
i. P(A & B) = P(A) x P(B)
ii. Example: The probability of drawing a Jack from each of two different decks of
cards: P(J) x P(J) = (1/13)(1/13)=0.0059
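The Jack example can be checked with exact fractions in Python. This sketch assumes standard 52-card decks with 4 Jacks each and that the two draws do not influence each other; the variable names are just for illustration.

```python
from fractions import Fraction

# Probability of drawing a Jack from one standard deck: 4 Jacks out of 52 cards
p_jack = Fraction(4, 52)           # reduces to 1/13

# The two decks are separate, so multiply the two probabilities
p_both = p_jack * p_jack

print(p_jack, p_both, round(float(p_both), 4))
```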
h. Rare Event Rule
i. When data occur that are extremely improbable, we must question the
assumption that they occurred randomly
ii. Example: An all-male jury is selected for a controversial case involving women’s
1. What are the odds of 12 male jurors being randomly selected from the
2. p = (0.5)^12 = 0.00024
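The jury calculation is the product rule applied 12 times. The sketch below assumes, as the example does, that each juror is independently male with probability 0.5; the helper name `prob_all` is hypothetical.

```python
def prob_all(p_single, k):
    """Probability that k independent events, each with probability p_single,
    all happen (product rule applied k times)."""
    return p_single ** k

p_all_male = prob_all(0.5, 12)
print(p_all_male)             # exact value of (0.5)^12
print(round(p_all_male, 5))   # rounded as in the notes
```

A result this small is what the Rare Event Rule is about: if selection really were random, an all-male jury would almost never happen, so the assumption of randomness deserves a second look.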
4. Hypothesis
a. Formal hypotheses
i. Inferential statistics always starts with a claim
ii. Examples:
1. “you can lose weight by switching to a high fat/low carb diet.”
2. “people who attend church/temple/mosque regularly have lower
cancer rates.”
3. “my uncle is psychic”
b. Null hypothesis (H0)
i. Negation of a claim; Skeptical choice
ii. H0 says:
1. The claim is false
2. There is no effect
3. Your uncle is not psychic
c. Research hypothesis (H1)
i. Formal statement of the claim; assertive and positive
ii. H1 says:
1. The claim is true
2. A has an effect on B
3. There is a relationship between A and B
d. Positive Result
i. We assume that H0 is true when we analyze our data
ii. We reject H0 if it is very unlikely to result in what we observed
iii. In research, we are very conservative and skeptical. We do not reject H0 unless
we really have to.
iv. Rejecting H0 is a positive result
v. Not rejecting H0 is a negative result
e. Example
i. Claim: Men are taller than women
ii. Research hypothesis: $\bar{X}_{men} > \bar{X}_{women}$
iii. Null hypothesis: $\bar{X}_{men} \le \bar{X}_{women}$
iv. This calls for a right-tailed test
v. Claim: Hand span is correlated with grip strength
vi. Research hypothesis: $r \ne 0$
vii. Null hypothesis: $r = 0$
viii. This calls for a two-tailed test
f. Test Statistic
i. a value that comes from your sample data
ii. Used to test the null hypothesis
1. E.g., z-score
2. Linear correlation coefficient, r
3. Proportion of successful trials, P
4. Difference of means, $\bar{X}_1 - \bar{X}_2$
g. Research hypothesis
i. The research hypothesis is a formal statement of the claim
ii. H1 says:
1. “Yes, A has an effect on B.”
2. “There is a relationship between A and B.”
3. “There is a difference between A and B.”
4. “A reduces B”
5. “A is greater than B”
6. “A increases as B decreases”
h. Null hypothesis
i. The negation of the research hypothesis
ii. H0 says:
1. “No, there is no effect.”
2. “There is no relationship between A and B.”
3. “There is no difference between A and B.”
i. Scientific Method
i. Assume that H0 is true
ii. Select an appropriate sample
iii. Perform experiment (or make observations)
iv. Collect data
v. Given that H0 is true, is it likely that you would end with the data that you got?
1. Yes: Fail to reject H0
2. No: Reject H0; the evidence is conclusive
j. Test statistic
i. A value that is calculated from your sample data
ii. It describes how extreme your data are
iii. If H0 is true, the test statistic is a random variable from a known distribution
k. Significance Level
i. Denoted by alpha (α); the probability that defines how rare or unusual (or extreme)
a test statistic must be in order to reject the null hypothesis
ii. In this class it’s usually 5% or 0.05
l. Critical Value

ii. Critical value is a value of the test statistic that is used to determine the result of
the hypothesis test
iii. If the test statistic is more extreme than the critical value (that is, it falls in the
critical region), the null hypothesis will be rejected
m. Conclusions in Hypothesis Testing
i. Always test the null hypothesis
ii. The initial conclusion will always be one of the following:
1. Reject the null hypothesis
2. Fail to reject the null hypothesis
n. Decision Criterion
i. Reject H0 if the test statistic falls within the critical region
ii. Fail to reject H0 if the test statistic does not fall within the critical region
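The critical value method can be sketched with the standard library's `NormalDist`. This assumes the test statistic is a z score that follows the standard normal distribution when H0 is true; the function name and the `tail` argument are hypothetical conveniences, not part of the course material.

```python
from statistics import NormalDist

def critical_value_decision(z, alpha=0.05, tail="two"):
    """Return (critical value, reject H0?) for a z test statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # Split alpha between the two tails
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) >= crit
    elif tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z >= crit
    elif tail == "left":
        crit = std_normal.inv_cdf(alpha)
        reject = z <= crit
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return crit, reject

crit, reject = critical_value_decision(2.10, alpha=0.05, tail="two")
print(round(crit, 2), reject)
```

For a two-tailed test at alpha = 0.05 the critical value is about 1.96, the same cutoff that encloses 95% of a normal distribution.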
o. P-value
i. The probability of getting a value at least as extreme as the test statistic by
chance, assuming that the null hypothesis is actually true
ii. If the p-value is less than the level of significance, we reject the null hypothesis.
p. P-value method
i. Reject H0 if the P-value ≤ α (where α is the significance level, such as 0.05).
ii. Fail to reject H0 if the P-value > α.
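Here is a minimal Python sketch of the P-value method for a z test statistic. It assumes the statistic is standard normal under H0; the function names `p_value` and `decide` are made up for this example.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """Probability of a z at least as extreme as the one observed, under H0."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'two', 'right', or 'left'")

def decide(p, alpha=0.05):
    """Reject H0 if the P-value is less than or equal to alpha."""
    return "reject H0" if p <= alpha else "fail to reject H0"

p_two = p_value(2.10, tail="two")
print(round(p_two, 4), decide(p_two))

p_right = p_value(1.50, tail="right")
print(round(p_right, 4), decide(p_right))
```

Reporting the P-value itself, instead of only the decision, also supports the "leave the decision to the reader" option.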
q. Another option
i. Instead of using a significance level such as 0.05, simply identify the p-value and
leave the decision to the reader


u. Type I Error
i. The mistake of rejecting the null hypothesis when it is true
ii. The symbol alpha is used to represent the probability of a type I error
v. Type II Error
i. The mistake of failing to reject the null hypothesis when it is false
ii. The symbol beta is used to represent the probability of a type II error.
w. Controlling Type I and Type II Errors
i. For any fixed alpha, an increase in the sample size, n, will cause a decrease in beta
ii. For any fixed sample size, n, a decrease in alpha will cause an increase in beta.
Conversely, an increase in alpha will cause a decrease in beta
iii. To decrease both alpha and beta, increase the sample size
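The trade-off between alpha, beta, and n can be illustrated numerically. The sketch below makes several assumptions that are not in the notes: a right-tailed one-sample z test, a known population standard deviation, and a specific true effect size (the true mean is 5 points above the null mean, with a standard deviation of 15). Under those assumptions beta has a closed form.

```python
from math import sqrt
from statistics import NormalDist

def beta_right_tailed(alpha, n, effect, sigma):
    """Probability of a type II error for a right-tailed one-sample z test,
    given the true difference (effect) between the real mean and the H0 mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # cutoff for rejecting H0
    shift = effect * sqrt(n) / sigma         # how far the true mean moves z
    return std_normal.cdf(z_crit - shift)    # chance z still falls below cutoff

# Hypothetical scenario: effect of 5 points, sigma of 15
for alpha, n in [(0.05, 25), (0.05, 100), (0.01, 25), (0.01, 100)]:
    b = beta_right_tailed(alpha, n, effect=5, sigma=15)
    print(f"alpha={alpha}, n={n}: beta={b:.3f}")
```

In this scenario, raising n from 25 to 100 lowers beta at a fixed alpha, lowering alpha from 0.05 to 0.01 raises beta at a fixed n, and going from (alpha = 0.05, n = 25) to (alpha = 0.01, n = 100) lowers both.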
x. Hypothesis tests
i. Critical value method
1. Based on a statistical model, find a value that partitions 95% of the
“usual” values from 5% of the “unusual” values
ii. P-value method
1. The probability of getting your test statistic or one more extreme if H0 is true
iii. Type I Errors
1. False positive
2. When you reject a H0 that is true
3. Level of significance, alpha (α)
iv. Type II Errors
1. False negative
2. When you fail to reject a H0 that is false
3. Beta
5. One sample Z test
a. One sample Z test: a special hypothesis test for comparing a sample to a population

b. H1: $\bar{X} > \mu$, $\bar{X} < \mu$, or $\bar{X} \ne \mu$
c. H0: $\bar{X} = \mu$
d. Requires a priori knowledge of the population mean and SD (like census data)
e. Useful for questions like:
i. Do left-handed people have higher IQs than the general population?
ii. Do UH students consume more energy drinks than other college students?
f. Assume H0. The sample was randomly selected from the population in question
g. Therefore, any difference between x(bar) and µ is due to random effects (sampling error)
h. Test statistic: $z = \frac{\bar{X} - \mu}{SEM}$, where $SEM = \frac{\sigma}{\sqrt{n}}$
i. If the null hypothesis is true, z has a normal distribution
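A one-sample z test can be worked through in a few lines of Python. The numbers below are hypothetical (a sample of 36 people with a mean of 104, compared with a population mean of 100 and standard deviation of 15), and the function name is made up for the example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z = (sample mean - population mean) / SEM, where SEM = sigma / sqrt(n)."""
    sem = pop_sd / sqrt(n)  # standard error of the mean
    return round((sample_mean - pop_mean) / sem, 2)

# Hypothetical data: does this sample score higher than the population?
z = one_sample_z(sample_mean=104, pop_mean=100, pop_sd=15, n=36)

# Right-tailed P-value, because H1 says the sample mean is greater
p = 1 - NormalDist().cdf(z)

print("z =", z)
print("p =", round(p, 4))
print("reject H0" if p <= 0.05 else "fail to reject H0")
```

Here the sample mean is 1.6 standard errors above the population mean, which is not quite extreme enough to reject H0 at alpha = 0.05 in a right-tailed test.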
6. The t-test
a. A special hypothesis test that is used to determine if there is significant difference
between two groups

b. Ex:
c. Like all hypothesis tests, a t-test will tell you whether or not you should reject the null hypothesis
d. In other words, are your results statistically significant?
e. Also known as Student’s t-test
f. Step 1: calculate the t value
i. You need the following:
ii. X1: Mean value of group 1
iii. X2: Mean value of group 2
iv. n1: number of subjects in group 1
v. n2: number of subjects in group 2
vi. s1: Standard deviation of group 1
vii. s2: Standard deviation of group 2
g. Step 2: determine the degrees of freedom
i. df = n1 + n2 - 2
ii. Degrees of freedom affect the shape of the t-distribution
iii. The t-distribution looks similar to the standard normal distribution
iv. It is symmetrical
v. Mean: t = 0
h. Step 3: Determine the critical value of t
i. Remember the critical value is the cutoff value of the test statistic that will
cause us to reject H0
i. Step 4: Compare your t-value to the critical value
i. Make a decision:
1. Reject H0
2. Don’t reject H0
3. Student’s t-test is used to determine if there is a significant difference
between two groups
4. It is a quick and easy test that is applicable in many studies
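The four steps above can be sketched in Python. The notes do not show the t formula itself, so this example assumes the pooled-variance (equal variances) form, which is the version that goes with df = n1 + n2 - 2. The group statistics are hypothetical, and the critical value 2.101 is the standard table value for df = 18 in a two-tailed test at alpha = 0.05.

```python
from math import sqrt

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Two-sample t statistic using a pooled variance (assumed form)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two group variances
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / standard_error
    return t, df

# Step 1 and Step 2: t value and degrees of freedom (hypothetical groups)
t, df = pooled_t(mean1=78, mean2=72, s1=6, s2=6, n1=10, n2=10)

# Step 3: critical value looked up in a t table (df = 18, two-tailed, alpha = 0.05)
t_critical = 2.101

# Step 4: compare and decide
reject = abs(t) >= t_critical
print("t =", round(t, 2), "df =", df)
print("Reject H0" if reject else "Don't reject H0")
```

For other sample sizes the critical value has to be looked up again, because the shape of the t-distribution depends on the degrees of freedom.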

