
CHAPTER 1

Overview of Statistics

Descriptive statistics refers to summarizing data rather than generalizing about the population.
t When we do not infer, we are only describing the available sample data.
Estimating parameters and testing hypotheses are important aspects of descriptive statistics. f When
we generalize to a population we are using inferential statistics.

Inconsistent treatment of data by a researcher is a symptom of poor survey or research design. f


Good survey data can still be misused or misinterpreted.

Empirical data are collected through observations and/or experiments. t Empirical data are contrasted
with a priori estimates (e.g., expecting 10 heads in 20 coin flips).

Business intelligence refers to collecting, storing, accessing, and analyzing data on the company's
operations in order to make better business decisions. t See Wikipedia for similar definitions of
business intelligence.

When a statistician omits data contrary to her findings in a study, she is justified as long as the sample
supports her objective. f We do not omit data unless it is proven to be an error.

A strong correlation between A and B would imply that B is caused by A. f Correlation does not
prove causation.

The post hoc fallacy says that when B follows A then B is caused by A. t Temporal sequence does not
prove causation.
1.1 STATISTICS IN BUSINESS
A statistical test may be significant yet have no practical importance. t Large samples sometimes
reveal tiny effects that may not matter very much.
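
A quick calculation shows how a large sample can make a tiny effect "significant." This is only a sketch: the mean difference (0.1), the standard deviation (10), and the group sizes are hypothetical numbers, and the test used is a standard two-sample z statistic.

```python
import math

def two_sample_z(diff, sd, n_per_group):
    """z statistic for a difference in means, equal sd and equal group sizes."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    return diff / standard_error

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical numbers: the same tiny effect, tested with two sample sizes.
z_small = two_sample_z(diff=0.1, sd=10.0, n_per_group=100)
z_large = two_sample_z(diff=0.1, sd=10.0, n_per_group=1_000_000)

print(f"n=100 per group:       z={z_small:.2f}, p={two_sided_p(z_small):.3f}")
print(f"n=1,000,000 per group: z={z_large:.2f}, p={two_sided_p(z_large):.2e}")
```

The effect is identical in both runs; only the sample size changes. Whether a difference of 0.1 matters is a business judgment, not something the p-value can answer.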

Valid statistical inferences cannot be made when sample sizes are small. f Small samples may be all
that we have, and statistics does have rules for them.

Statistics is an essential part of critical thinking because it allows us to transform the empirical
evidence from a sample so it will agree with our preferred conclusions. f Ethical analysts challenge
their beliefs with data rather than forcing data to their beliefs.

Statistical challenges include imperfect data, practical constraints, and ethical dilemmas. t The list
is longer, but these three are big challenges.

A business data analyst needs a PhD in statistics. f Every business person does some statistics.

The science of statistics tells us whether the sample evidence is convincing. t There are clear
scientific rules for statistical inference.

Pitfalls to consider in a statistical test include nonrandom samples, small sample size, and lack of
causal links. t These are among many other pitfalls.

In business communication, a table of numbers is preferred to a graph because it is more able to


convey meaning. f Although tables can show exact numbers, a good graph may be more helpful.

Statistical data analysis can often distinguish between real vs. perceived ethical issues. t Proper
framing of a question may reveal that there is no real ethical issue.

1.2 STATISTICS CHALLENGES:


Excel has limited use in business because advanced statistical software is widely available. f Small
businesses may lack advanced software (and training to use it).

Statistics helps surmount language barriers to solve problems in multinational businesses. t Statistics
is part of the international language of science.
Statistics can help you handle either too little or too much information. t Statistical tasks include
sampling to obtain more information or finding meaning in large piles of data.

Predicting a presidential candidate's percentage of the statewide vote from a sample of 800 voters
would be an example of inferential statistics. t Generalizing from a sample is an inference.
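
The voter example can be made concrete with a short sketch. The sample size of 800 comes from the item above; the 52% sample share is a made-up value, and the margin of error uses the usual normal approximation for a proportion.

```python
import math

def proportion_margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)

n = 800        # sample size from the example above
p_hat = 0.52   # hypothetical sample share for the candidate (assumption)

moe = proportion_margin_of_error(p_hat, n)
print(f"Descriptive: {p_hat:.0%} of the {n} sampled voters favor the candidate.")
print(f"Inferential: statewide share is roughly {p_hat:.1%} +/- {moe:.1%}.")
```

The first line only describes the sample. The second generalizes to the whole state, which is what makes it an inference.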

Surveying electric vehicle owners would provide a representative random sample of Americans' views
on global warming policies. f Not a random sample of all drivers.

An example of descriptive statistics would be reporting the percentage of students in your


accounting class that attended the review session for the last exam. t As long as you don't generalize,
it is a descriptive statistic.

1.3 STATISTICS SKILLS NEEDED:


"Bob must be rich. He's a lawyer, and lawyers make lots of money." This statement best illustrates
which fallacy?: Generalizing from an average to an individual. Many lawyers do not work for big firms.
(Remember My Cousin Vinny?)
which is not an ethical obligation of a statistician: To support client wishes in drawing conclusions from
the data

which of the following statements is correct: Statistics is the science of collecting, organizing,
analyzing, interpreting, and presenting data.

which is least likely to be an application where statistics will be useful: Choosing the wording of a
corporate policy prohibiting smoking.

Because 25 percent of the students in my morning statistics class watch eight or more hours of
television a week, I conclude that 25 percent of all students at the university watch eight or more
hours of television a week. The most important logical weakness of this conclusion would be: using a
sample that may not be representative of all students.

which of the following is not a characteristic of an ideal statistician?: Advocates client's objectives
which of the following statements is not true: Estimating parameters is an important aspect of
descriptive statistics.
which is not a practical constraint facing the business researcher or data analyst?: Survey
respondents usually will tell the truth if well compensated.

which is not an essential characteristic of a good business data analyst: Has a Ph.D. or master's
degree in statistics. No advanced degree is needed for basic statistics, which is why all business
students study it.

An ethical statistical consultant would not always: support management's desired conclusions.

The NASA experiences with the Challenger and Columbia disasters suggest that: limited data may
still contain important clues.

which is not a goal of the ethical data analyst: To learn to downplay inconvenient data
which of the following statements is not true?: For day-to-day business data analysis, most firms rely
on a large staff of expert statisticians.

"Smoking is not harmful. My Aunt Harriet smoked, but lived to age 90." This best illustrates which
fallacy? : Small sample generalization

which best illustrates the distinction between statistical significance and practical importance?: "Our
new manufacturing technique has increased the life of the 80 GB USB AsimoDrive external hard disk
significantly, from 240,000 hours to 250,000 hours."
"Circulation fell in the month after the new editor took over the newspaper Oxnard News Herald. The
new editor should be fired." which is not a serious fallacy in this conclusion?: Using a biased sample

An ethical data analyst would be least likely to: rely on consultants for all calculations. When you
farm out your calculations, you have lost control of your work.

"Tom's SUV rolled over. SUVs are dangerous." This best illustrates which fallacy?: Small sample
generalization

"Bob didn't wear his lucky T-shirt to class, so he failed his chemistry exam." This best illustrates which
fallacy?: Post hoc reasoning

which is not a reason for an average student to study statistics?: Learn stock market strategies. To
learn about the stock market, you should probably study finance.

which is not a likely area of application of statistics in business?: Questioning the executives' strategic
decisions.

which is not a likely task of descriptive statistics?: Estimating unknown parameters

We would associate the term inferential statistics with which task?: Estimating unknown parameters
CHAPTER 2

Data Collection
2.1 VARIABLES AND DATA

Categorical and Numerical Data


A data set may contain a mixture of data types. Two broad categories are categorical data and
numerical data, as shown in Figure 2.1.
Categorical Data Categorical data (also called qualitative data) have values that are described by
words rather than numbers. For example, structural lumber can be classified by the lumber type (e.g.,
fir, hemlock, pine), automobile styles can be classified by size (e.g., full, midsize, compact,
subcompact), and movies can be categorized using common movie classifications (e.g., action and
adventure, children and family, classics, comedy, documentary).

(-) in last year's annual report, Thompson distributors indicated that it had 12 regional warehouses. this
is an example of ordinal level data. f "number of" is a count, which is ratio data because a zero exists
(better than ordinal).
nominal data refer to data that can be ordered in a natural way. f nominal (categorical) data would be
called ordinal only if categories can be ranked
this year, Oxnard university produced two football all-americans. this is an example of continuous
data. f the "number of" anything is discrete.
the type of statistical test that we can perform is independent of the level of measurement of the
variable of interest.f some statistical operations are restricted unless you have ratio or interval data.
your weight recorded at your annual physical would not be ratio data, because you cannot have zero
weight. f zero is only a reference point, not necessarily an observable data value.
the level of measurement for categorical data is nominal.t categorical and nominal are equivalent
terms.
temperature measured in degrees fahrenheit is an example of interval data. t for temperature, scale
distances are meaningful (20 to 25 is the same as 50 to 55 degrees), and 0 degrees fahrenheit does
not mean the absence of heat, so it is not a ratio measurement.
the closing price of a stock is an example of ratio data. t true zero exists as a reference, whether or
not it is observed.
the statistical abstract of the united states is a huge annual compendium of data for the united states,
and it is available online free of charge. t a useful reference for business (e.g., for marketing,
economics, or finance).
ordinal data can be treated as if it were nominal data but not vice versa. t you can always go back to a
lower level of measurement (but not vice versa).
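
The Fahrenheit item above can be checked numerically. This sketch uses the standard unit conversions; the specific temperatures and weights are just illustrative values.

```python
def fahrenheit_to_celsius(f):
    return (f - 32.0) * 5.0 / 9.0

def pounds_to_kilograms(lb):
    return lb * 0.45359237

# Interval data: equal differences stay equal after a change of units.
diff_low_c = fahrenheit_to_celsius(25) - fahrenheit_to_celsius(20)
diff_high_c = fahrenheit_to_celsius(55) - fahrenheit_to_celsius(50)

# But ratios change with the units, because 0 degrees F is not a true zero.
ratio_f = 50 / 25
ratio_c = fahrenheit_to_celsius(50) / fahrenheit_to_celsius(25)

# Ratio data: 200 lb is twice 100 lb in any unit of weight.
ratio_lb = 200 / 100
ratio_kg = pounds_to_kilograms(200) / pounds_to_kilograms(100)

print(f"differences in C: {diff_low_c:.3f} vs {diff_high_c:.3f}")
print(f"temperature ratio: {ratio_f:.2f} in F, {ratio_c:.2f} in C")
print(f"weight ratio: {ratio_lb:.2f} in lb, {ratio_kg:.2f} in kg")
```

A ratio that changes (or even turns negative) when the units change is not meaningful, which is why "50 degrees is twice as hot as 25 degrees" fails for Fahrenheit but "twice as heavy" works for weight.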

Numerical Data Numerical or quantitative data arise from counting, measuring something, or
some kind of mathematical operation. For example, we could count the number of auto insurance
claims filed in March (e.g., 114 claims) or sales for last quarter (e.g., $4,920), or we could measure
the amount of snowfall over the last 24 hours (e.g., 3.4 inches). Most accounting data, economic
indicators, and financial ratios are quantitative, as are physical measurements.

(-) responses on a seven-point likert scale are usually treated as ratio data. f no true zero point exists
on a likert scale.
(+) likert scales are especially important in opinion polls and marketing surveys.t likert scales are
used in all kinds of surveys
ordinal data are data that can be ranked based on some natural characteristic of the items. t for
example, the eras jurassic, paleozoic, and mesozoic can be ranked in time.
(+) ratio data are distinguished from interval data by the presence of a zero reference point. t the
true zero is a reference that need not be observable.
categorical data have values that are described by words rather than numbers. t categories are nominal
data but could also be ranked (e.g., sophomore, junior, senior).
numerical data can be either discrete or continuous. t numerical data can be counts (e.g., cars owned)
or continuous scales (e.g., height).
categorical data are also referred to as nominal or qualitative data. t categories are nominal data (non
numerical), sometimes called qualitative data.
the number of checks processed at a bank in a day is an example of categorical data. f integers are
actually numerical data.
the number of planes per day that land at an airport is an example of discrete data. t integers are
discrete numerical data.
the weight of a bag of dog food is an example of discrete data. f weight is measured on a continuous
scale.
it is better to attempt a census of a large population instead of relying on a sample. f a census may
founder on cost and time, while samples can be quick and accurate.
judgment sampling and convenience sampling are nonrandom sampling techniques. t to be random,
every item must have the same chance of being chosen.

2.2 LEVEL OF MEASUREMENT


Statisticians sometimes refer to four levels of measurement for data: nominal, ordinal, interval, and
ratio. This typology was proposed over 60 years ago by psychologist S. S. Stevens. The allowable
statistical tests depend on the measurement level. The criteria are summarized in Figure 2.3.
Nominal Measurement
a problem with judgment sampling is that the sample may not reflect the population. t while better than
mere convenience, judgment may still have flaws.
when the population is large, a sample estimate is usually preferable to a census. t a census may
founder on cost and time, while samples can be quick and accurate.
sampling error is avoidable by choosing the sample scientifically. f sampling error is unavoidable,
though it can be reduced by careful sampling.
a sampling frame is used to identify the target population in a statistical study.t only some portion of
the population may be targeted (e.g., independent voters)
by taking a systematic sample, in which we select every 50th shopper arriving at a specific store, we
are approximating a random sample of shoppers. t there is no bias if this method is implemented
correctly.
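
A minimal sketch of the every-50th-shopper idea, assuming the shoppers can be numbered in order of arrival. The function name and the list of 1,000 shoppers are hypothetical; the random start within the first k arrivals is a common way to implement the method.

```python
import random

def systematic_sample(items, k, seed=None):
    """Pick a random start among the first k items, then take every k-th item."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return items[start::k]

# Hypothetical arrival list: shoppers numbered 1..1000 in order of arrival.
shoppers = list(range(1, 1001))
sample = systematic_sample(shoppers, k=50, seed=1)
print(len(sample), sample[:3])
```

No full list is needed in practice: an interviewer at the door can simply count arrivals and stop every 50th person.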

a worker collecting data from every other shopper who leaves a store is taking a simple random
sample of customer opinion. f not unless the target population is customers who shopped today (cf.,
all customers).
creating a list of people by taking the third name listed on every 10th page of the phone book is an
example of convenience sampling. f this resembles two-stage cluster sampling combined with
systematic sampling.
internet surveys posted on popular websites have no bias since anyone can reply.f self-selection bias
exists (respondents may be atypical).

2.3 SAMPLING CONCEPTS


analysis of month-by-month changes in stock market prices during the most recent recession would
require the use of time series data. t data collected and recorded over time would be a time series.
a cluster sample is a type of stratified sample that is based on geographical location. t for example,
sampling voters randomly within random zip codes.
an advantage of a systematic sample is that no list of enumerated data items is required. t systematic
sampling works with a list (like random sampling) but also without one.

telephone surveys often have a low response rate and fail to reach the desired population.t phone
surveys are cheaper, but it is hard to avoid these problems.
mail surveys are attractive because of their high response rates. f mail surveys have low response
rates and invite self-selection bias.

a problem with convenience sampling is that the target population is not well defined. t convenience
sampling is quick but not random, and the target population is unclear.
if you randomly sample 50 students about their favorite places to eat, the data collected would be
referred to as cross-sectional data. t data for individuals would be a cross section (not a time series).
the number of fedex shipping centers in each of 50 cities would be ordinal level data. f the "number
of" anything is ratio data because a true zero reference point exists.

2.4 SAMPLING METHODS


There are two main categories of sampling methods. In random sampling items are chosen by
randomization or a chance procedure. The idea of random sampling is to produce a sample that is
representative of the population. Non-random sampling is less scientific but is sometimes used for
expediency.
internet surveys posted on popular websites such as msn.com suffer from nonresponse bias. t
nonresponse or self-selection bias is rampant in such surveys.

different variables are usually shown as columns of a multivariate data set. t it is customary to use a
column for each variable, while each row is an observation.

each row in a multivariate data matrix is an observation (e.g., an individual response). t it is customary
to use a column for each variable, while each row is an observation.
a bivariate data set has only two observations on a variable. f bivariate refers to the number of
variables, not the number of observations.
running times for 3,000 runners in a 5k race would be a multivariate data set. f regardless of the
number of observations, we have only one variable (running time)
running times for 500 runners in a 5k race would be a univariate data set. t regardless of the number
of observations, we have only one variable (running time).

2.5 DATA SOURCES


One goal of a statistics course is to help you learn where to find data that might be
needed. Fortunately, many excellent sources are widely available, either in libraries or
through private purchase. Table 2.10 summarizes a few of them.

a list of the salaries, ages, and years of experience for 50 ceos is a multivariate data set.t we
would have a data matrix with 50 rows and 3 columns.
the daily closing price of apple stock over the past month would be a time series.t data
collected over time is a time series.
the number of words on 50 randomly chosen textbook pages would be cross-sectional data.t
data were not collected over time, so we have cross-sectional data
a likert scale with an even number of scale points between "strongly agree" and "strongly
disagree" is intended to prevent "neutral" choices.t an even number of scale points (e.g., 4)
forces the respondent to "lean" toward one end of the scale or the other
private statistical databases (e.g., CRSP) are usually free. f private research databases
generally require a subscription (often expensive).
an investment firm rates bonds for aardco inc. as "b+," while bonds of deva corp. are rated
"aa." which level of measurement would be appropriate for such data: ordinal
- ranks are clear, but interval would require assumed equal scale distances (doubtful).
which variable is least likely to be regarded as ratio data
+student's evaluation of a professor's teaching (likert scale)
-likert scales have no true zero.
which of the following is numerical data: the fuel economy (mpg) of your car
measurements from a sample are called:statistics.
quantitative variables use which two levels of measurement: interval and ratio

temperature in degrees fahrenheit is an example of a(n) interval variable.

using a sample to make generalizations about an aspect of a population is called: statistical
inference.
- generalizing from a sample to a population is an inference
your telephone area code is an example of a nominal variable.
which is least likely to be regarded as a ratio variable: a critic's rating of a restaurant on a 1
to 4 scale
automobile exhaust emission of co2 (milligrams per mile) is ratio data.

your rating of the food served at a local restaurant using a three-point scale of 0 = gross, 1 =
decent, 2 = yummy is ordinal data.
the number of passengers "bumped" on a particular airline flight is ratio data.
which should not be regarded as a continuous random variable
+ number of personal fouls by the miami heat in a game
- counting things yields integer (discrete) data.
which of the following is not true
+ the number of checks processed at a bank in a day is categorical data.
- the "number of" anything is a discrete numerical variable.
which of the following is true
+ the duration (minutes) of a flight from boston to minneapolis is ratio data.
- true zero exists (not observable, but as a reference point), so ratios have meaning
which statement is correct: cluster sampling is useful when strata characteristics are
unknown.
a likert scale: yields interval data if scale distances are equal.
which is most nearly correct regarding sampling error: it cannot be eliminated by any
statistical sampling method.
statement is false: selection bias means that many respondents dislike the interviewer.
judgment sampling is sometimes preferred over random sampling, for example, when: time
is short and the sampling budget is limited.
an advantage of convenience samples is that: they are often quicker and cheaper.

2.6 SURVEYS
Most survey research follows the same basic steps. These steps may overlap in time:

● Step 1: State the goals of the research.
● Step 2: Develop the budget (time, money, staff).
● Step 3: Create a research design (target population, frame, sample size). Choose a
survey type and method of administration.
● Step 4: Design a data collection instrument (questionnaire).
● Step 5: Pretest the survey instrument and revise as needed.
● Step 6: Administer the survey (follow up if needed).
● Step 7: Code the data and analyze it.

before deciding whether to assess heavy fines against noisy airlines, which sampling
method would the federal aviation administration probably use to measure the peak noise
from departing jets as measured by a ground-level observer at a point one mile from the end
of the departure runway: stratified sample.
professor hardtack chose a sample of 7 students from his statistics class of 35 students by
picking every student who was wearing red that day. which kind of sample is this:
convenience sample
thirty work orders are selected from a filing cabinet containing 500 work order folders by
choosing every 15th folder. which sampling method is this: systematic sample
which of the following is not a likely reason for sampling: the expense of obtaining random
numbers
comparing a census of a large population to a sample drawn from it, we expect that the:
sample is usually a more practical method of obtaining the desired information
Survey Types
Surveys fall into five general categories: mail, telephone, interview, web, and direct observation.
They differ in cost, response rate, data quality, time required, and survey staff training
requirements. Table 2.11 lists some common types of surveys and a few of their salient
strengths/weaknesses.

a stratified sample is sometimes recommended when: distinguishable strata can be
identified in the populations.

a random sample is one in which the: probability that an item is selected for the sample is
the same for all population items.

an advantage of convenience samples over random samples is that: data collection cost is
reduced.

to measure satisfaction with its cell phone service, at&t takes a stratified sample of its
customers by age, gender, and location. which is an advantage of this type of sampling, as
opposed to other sampling methods: it can give more accurate results.

an accounting professor wishing to know how many mba students would take a summer
elective in international accounting did a survey of the class she was teaching. which kind of
sample is this: convenience sample

a binary variable (also called a dichotomous variable or dummy variable) has: only two
possible values.

a population has groups that have a small amount of variation within them, but large
variation among or between the groups themselves. the proper sampling technique is:
stratified
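
Stratified sampling can be sketched as a simple random sample drawn inside each group. The strata names and sizes below are made up, and proportional allocation is only one possible way to split the sample across strata.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Draw a simple random sample within each stratum, proportional to its size."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding may shift the total by one or two
        # units when the shares are not whole numbers.
        n_h = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical strata (customer age groups) with made-up sizes.
strata = {
    "under_30": [f"u{i}" for i in range(600)],
    "30_to_59": [f"m{i}" for i in range(300)],
    "60_plus": [f"s{i}" for i in range(100)],
}
picked = stratified_sample(strata, total_n=50, seed=7)
print({name: len(members) for name, members in picked.items()})
```

Because every stratum is represented in proportion to its size, the sample cannot miss a group by bad luck, which is where the accuracy gain comes from when groups differ a lot from one another.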

a manager chose two people from her team of eight to give an oral presentation because she
felt they were representative of the whole team's views. what sampling technique did she
use in choosing these two people: judgment

sampling bias can best be reduced by: utilizing random sampling.

a sampling technique used when groups are defined by their geographical location is:cluster
sampling.

if we choose 500 random numbers using excel's function =randbetween(1,99), we would
most likely find that: some numbers would occur more than once.
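
The same experiment is easy to try outside Excel. This sketch uses Python's `random.randint`, which includes both endpoints as =RANDBETWEEN does; the seed is arbitrary and only makes the run repeatable.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the run is repeatable
# random.randint(1, 99) includes both endpoints, like =RANDBETWEEN(1,99).
draws = [rng.randint(1, 99) for _ in range(500)]
counts = Counter(draws)

distinct = len(counts)
most_common_value, most_common_count = counts.most_common(1)[0]
print(f"{distinct} distinct values among 500 draws")
print(f"{most_common_value} appeared {most_common_count} times")
```

With only 99 possible values and 500 draws, repeats are in fact guaranteed: at least one value must appear six or more times, since 5 x 99 = 495 is less than 500.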

a problem with nonrandom sampling is that: not every item in the population has the same
chance of being selected, as it should.
from its 32 regions, the faa selects 6 regions, and then randomly audits 25 departing
commercial flights in each region for compliance with legal fuel and weight requirements. this
is an example of: cluster sampling.
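
The FAA example is a two-stage design, which can be sketched as follows. The flight IDs and the 200 flights per region are invented, and the item does not say how the 6 regions are chosen, so random selection of regions is an assumption here.

```python
import random

rng = random.Random(3)

# Hypothetical data: 32 regions, each with 200 departing flights (made-up IDs).
flights_by_region = {
    region: [f"R{region:02d}-F{i:03d}" for i in range(200)]
    for region in range(1, 33)
}

# Stage 1: choose 6 of the 32 regions (the clusters).
# Random selection of regions is assumed; the source does not specify it.
chosen_regions = rng.sample(sorted(flights_by_region), 6)

# Stage 2: randomly audit 25 flights within each chosen region.
audits = {r: rng.sample(flights_by_region[r], 25) for r in chosen_regions}
total_audits = sum(len(v) for v in audits.values())

print(f"regions audited: {sorted(chosen_regions)}")
print(f"total flights audited: {total_audits}")
```

Auditors only travel to 6 regions instead of 32, which is the practical appeal of cluster sampling.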

which of the following is a correct statement: an advantage of a systematic sample is that no
list of enumerated data items is required.

which of the following is false: the target population must first be defined by a full list or data
file of all individuals.

when we are choosing a random sample and we do not place chosen units back into the
population, we are: sampling without replacement.
which method is likely to be used by a journalism student who is casually surveying opinions
of students about the university's cafeteria food for an article that she is writing: convenience
sample
which of the following is false: coverage error is when respondents give untruthful answers.
which is a time series variable: net earnings reported by xena corp. for the last 10 quarters

an observation in a data set would refer to: a single row that contains one or more observed
variables.

a multivariate data set contains: more than two variables

the centers for disease control and prevention (cdc) wants to estimate the average extra
hospital stay that occurs when heart surgery patients experience postoperative atrial
fibrillation. they divide the united states into nine regions. in each region, hospitals are
selected at random within each hospital size group (small, medium, large). in each hospital,
heart surgery patients are sampled according to known percentages by age group (under
50, 50 to 64, 65 and over) and gender (male, female). this procedure combines which
sampling methods: cluster, stratified, and simple random

which statement is correct: selecting every fifth shopper arriving at a store will approximate
a random sample of shoppers.

which is a categorical variable: the brand of jeans you usually wear

which is a discrete variable:
+the number of pairs of jeans that you own
- the "number of" anything is discrete numerical data.

a section of the population we have targeted for analysis is: a frame.

which is not a time series variable: closing checkbook balances of 30 students on december
31 of this year

a good likert scale may not have: unequal distances between scale points.
a likert scale with an odd number of scale points between "strongly agree" and "strongly
disagree": is often used in marketing surveys.

a likert scale with an even number of scale points between "strongly agree" and "strongly
disagree": is intended to prevent "neutral" choices.

which statement is correct: web searches (e.g., google) often yield unverifiable data.

which statement is correct: Government data sources (e.g., www.bls.gov) usually are free.

CHAPTER 3:
Describing data visually

I. STEM-AND-LEAF DISPLAYS AND DOT PLOTS


It is easier to read the data values on a 3D column chart than on a 2D column chart. f
Dot plots are similar to histograms with many bins (classes). t
The column chart should be avoided if you are plotting time series data. f
The line chart is appropriate for categorical (qualitative) data. f
The Pareto chart is used to display the "vital few" causes of problems. t
Excel's pyramid chart is generally preferred to a bar chart. f
Excel's pyramid charts make it easier to read the data values. f

Compared to a dot plot, we lose some detail when we present data in a frequency distribution.
t
Stacked dot plots are useful in understanding the association between two paired quantitative
variables (X, Y). f

II. FREQUENCY DISTRIBUTIONS AND HISTOGRAMS


A frequency distribution is a table formed by classifying n data values into k classes called
bins (we adopt this terminology from Excel). The bin limits define the values to be included
in each bin.
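
A small sketch of the tabulation described above, assuming equal bin widths and half-open bins (each bin includes its lower limit but not its upper limit, except that the very top limit goes in the last bin). The data values and the function name are hypothetical.

```python
def frequency_distribution(values, lower, width, k):
    """Count values in k equal-width bins [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * k
    upper = lower + k * width
    for v in values:
        if v == upper:
            index = k - 1          # put the top limit in the last bin
        else:
            index = int((v - lower) // width)
        if not 0 <= index < k:
            raise ValueError(f"{v} is outside the bin limits")
        counts[index] += 1
    return counts

# Hypothetical data: n = 12 values, k = 5 bins of width 10 starting at 0.
data = [7, 12, 15, 18, 21, 22, 25, 29, 33, 38, 41, 47]
counts = frequency_distribution(data, lower=0, width=10, k=5)
relative = [c / len(data) for c in counts]

for i, (c, r) in enumerate(zip(counts, relative)):
    print(f"{i * 10:>2} to under {(i + 1) * 10:<2}  freq={c}  rel={r:.3f}")
```

The relative frequencies are just the counts divided by n, so a histogram drawn from either column has the same shape and differs only in its Y-axis scale.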
Log scales are common because most people are familiar with them.f
Sturges' Rule should override judgment about the "right" number of histogram bins.f
Sturges' Rule is merely a suggestion, not an ironclad requirement. t
Excel's 3D pie charts are usually clearer than 2D pie charts.f
A common error with pie charts is using too few "slices."f
A pie chart can generally be used instead of a bar chart.f
A bar chart can sometimes be used instead of a line chart for time series data.t
Pie charts are attractive to statisticians, but are rarely used in business or general media.f

Pie charts are useful in displaying frequencies that sum to a total. t


Dot plots may not reveal the shape of a distribution when the sample is small. t
Scatter plots are used to visualize association in samples of paired data (X, Y). t
The zero origin rule may be waived for bar charts if the objective is merely to visualize
relative change over time. f
In a bimodal histogram, the two highest bars will have the same height. f
A frequency distribution is a tabulation of n data values into classes called bins. t
A dot plot would be useful in visualizing scores on an exam in a class of 30 students. t
A frequency distribution usually has equal bin widths. t
Line charts are not used for cross-sectional data. t
Dot Plots
A dot plot is another simple graphical display of n individual values of numerical data. The
basic steps in making a dot plot are to (1) make a scale that covers the data range, (2) mark
axis demarcations and label them, and (3) plot each data value as a dot above the scale at its
approximate location. If more than one data value lies at approximately the same X-axis
location, the dots are piled up vertically. Figure 3.1 shows a dot plot for 44 P/E ratios.

A scatter plot is useful in visualizing trends over time. f


A scatter plot requires two quantitative variables (i.e., not categorical data). t
The number of bins in this histogram (caffeine content in mg/oz for 65 soft drinks) is
consistent with Sturges' Rule. f
Because most data values are on the left, we would say that this dot plot (burglary rates per
100,000 persons in 350 U.S. cities) shows a distribution that is skewed to the left (negatively
skewed). f
It is possible to construct a histogram or frequency polygon with open-ended classes. f
Except for the Y-axis scaling, a histogram will look the same if we use relative frequencies
instead of raw frequencies (with the same bin limits). t
The ____ can be used to differentiate the "vital few" causes of quality problems from the
"trivial many" causes of quality problems.: Pareto chart
not a characteristic of a dot plot: wide bins
display is most likely to reveal association between X and Y: scatter plot

III. Shape

criterion is least likely to be used in choosing bins (classes) in a frequency distribution:


Always starting at zero
the following is true: line charts are not used for cross-sectional data
Histograms generally do not reveal the: exact data range
A column chart would be least suitable to display which data: Annual compensation of 500
company CEOs
A line chart would not be suitable to display which data: Annual compensation of the top 50
CEOs
is not a tip for effective column charts: The nonzero origin rule may be waived for financial
reports
not a tip for effective line charts: Line charts are better than charts to display cross-sectional
data
a reason for using a log scale for time series data: it helps compare growth in time series of
dissimilar magnitude

not a characteristic of pie charts: exploded and 3-D pie charts will allow more slices
Excel's pyramid charts: should be avoided despite their visual appeal
not a reason why pie charts are popular in business: they are more precise than line charts,
despite their low visual impact
data would be suitable for a pie chart: oxnard university student category (undergraduate,
masters, doctoral)
data would be suitable for a pie chart: percent vote in the last election by party (democrat,
republican, other)
data would be suitable for a pie chart: the number of us primary care clinics by type (urban,
suburban, rural)
III. Tips for Effective Frequency Distributions
Here are some general tips to keep in mind when making frequency distributions and
histograms.
1. Check Sturges’ Rule first, but only as a suggestion for the number of bins.
2. Choose an appropriate bin width.
3. Choose bin limits that are multiples of the bin width.
4. Make sure that the range is covered, and add bins if necessary.
5. Skewed data may require more bins to reveal sufficient detail.

IV. Scatter plots


Scatter plots are: often fitted with a linear equation in Excel
not a characteristic of an effective summary table: data to be compared should be displayed in
rows, not columns
Ex: Birth rate and efficiency
Effective summary tables generally: round their data to three or four significant digits
Pivot tables: show cross-tabulations of data
the following is least useful in visualizing categorical data: line chart
is considered a novelty chart in Excel: Pyramid chart
We would use a pivot table to: cross-tabulate frequencies of occurrence of two variables
not considered a deceptive graphical technique: axis demarcations
is not considered a deceptive graphical technique: 2D graphs
is the most serious deceptive graphical technique: nonzero origin
is not a poor graphing technique: labeled axis scale
of these deficiencies would be considered a major graphical deception: bar widths
proportional to bar height
is not a characteristic of a log scale for time series data: log scales are generally familiar to
the average reader
is not a characteristic of using a log scale to display time series data: general business
audiences find it easier to interpret a log scale
This histogram shows Chris's golf scores in his last 77 rounds at Devil's Ridge. Which is not
a correct statement: the number of bins is consistent with Sturges’ Rule
is not revealed on a scatter plot: missing data values due to nonresponses
The distribution pictured below is: bimodal and skewed right
The distribution pictured below is: skewed left
The graph below illustrates which deceptive technique: Area trick

V. Deceptive graph
a characteristic of a histogram's bars: the bar widths show class intervals and their heights
indicate frequencies
Below is a frequency distribution of earnings of 50 contractors in a country: the class interval
limits are ambiguous
Bob found an error in the following frequency distribution. What is it: The classes are not
collectively exhaustive
The point halfway between the bin limits in a frequency distribution is known as the: bin
midpoint
When using a dot plot with a small sample, which is least apparent: the overall shape of the
distribution
If you have 256 data points, how many classes (bins) would Sturges' Rule suggest: 9
If you have 32 data points, how many classes (bins) would Sturges' Rule suggest: 6
statement is not true concerning Sturges' Rule: it proposes adding one class (bin) to the
histogram for each extra observation
To classify prices from 62 recent home sales, Sturges' Rule would recommend: 7 classes
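The three Sturges' Rule answers above (32, 62, and 256 observations) can be checked with a short sketch. It assumes the common textbook form of the rule, k = 1 + 3.3 log10(n), rounded to the nearest whole number; the function name `sturges_bins` is only illustrative.

```python
import math


def sturges_bins(n: int) -> int:
    """Suggested number of bins: k = 1 + 3.3 * log10(n), rounded."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return round(1 + 3.3 * math.log10(n))


# Sample sizes taken from the questions above
for n in (32, 62, 256):
    print(n, "observations ->", sturges_bins(n), "bins")
```

The result is only a starting suggestion. The bin count grows with the logarithm of n, so doubling the sample adds roughly one bin rather than one bin per observation.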
A histogram can be defined as: a chart whose bar widths show class intervals and whose
heights indicate frequencies
An open-ended bin (e.g., "50 and over") might be seen in a frequency distribution when:
extremely large data values exist
The width of a class in a frequency distribution is known as the: class interval
A population is of size 5,500 observations. When the data are represented in a relative
frequency distribution, the relative frequency of a given interval is 0.15. The frequency in this
interval is equal to: 825
A population has 75 observations. One class interval has a frequency of 15 observations. The
relative frequency in this category is: 0.2
Below is a sorted stem-and-leaf diagram for the measured speeds (miles per hour) of 49
randomly chosen vehicles on highway I-80 in Nebraska. How many vehicles were traveling
exactly the speed limit (70 mph): 1

Below is a sorted stem-and-leaf diagram for the measured speeds (miles per hour) of 49
randomly chosen vehicles on highway I-80 in Nebraska. What is the highest observed speed:
92
Below is a sorted stem-and-leaf diagram for the measured speeds (miles per hour) of 49
randomly chosen vehicles on highway I-80 in Nebraska. What is the mode: 65
Below is a sorted stem-and-leaf diagram for the measured speeds (miles per hour) of 49
randomly chosen vehicles on highway I-80 in Nebraska. What is the fourth slowest speed in
the sorted data array: 61
Below is a sorted stem-and-leaf diagram for the measured speeds (miles per hour) of 49
randomly chosen vehicles on highway I-80 in Nebraska. The modal class is: 70 but less than
80
A statistician prepared a bar chart showing, in descending order, the frequency of six
underlying causes of general aviation accidents (pilot error, mechanical problems,
disorientation, miscommunication, controller error, other). What would we call this type of
chart: Pareto chart

CHAPTER 4

Descriptive Statistics
4.1 NUMERICAL DESCRIPTION
For a sample of numerical data, we are interested in three key characteristics: center,
variability, and shape.

Preliminary Analysis
Before calculating any statistics, we consider how the data were collected.
A data set with two values that are tied for the highest number of occurrences is called
bimodal. t Bimodal means two modes.
The midrange is not greatly affected by outliers. f Extremes distort the midrange (average of
highest and lowest data values).
The second quartile is the same as the median. t The second quartile, the median, and the 50th
percentile are the same thing.
A trimmed mean may be preferable to a mean when a data set has extreme values. t
Trimming diminishes the effect of outliers.
One benefit of the box plot is that it clearly displays the standard deviation. f A box plot
shows quartiles.

4.2 MEASURES OF CENTER


Mean
The mean is the sum of the data values divided by the number of data items.
Median
The median (denoted M) is the 50th percentile or midpoint of the sorted sample data set x1, x2, . . . ,
xn. It separates the upper and lower halves of the sorted observations.

It is inappropriate to apply the Empirical Rule to a population that is
right-skewed. t The Empirical Rule applies to normal populations.
Given the data set 10, 5, 2, 6, 3, 4, 20, the median value is 5. t Sort and find the middle value:
2 3 4 5 6 10 20.
Given the data set 2, 5, 10, 6, 3, the median value is 3. f Sort and find the middle value: 2 3 5 6
10, so the median is 5.
When data are right-skewed, we expect the median to be greater than the mean. f It's the other
way around, as the mean will be pulled up by extremes.
The sum of the deviations around the mean is always zero. t The mean is the fulcrum
(balancing point), so deviations must sum to zero.
The midhinge is a robust measure of center when there are outliers. t Outliers have little effect
on the midhinge (average of the 25th and 75th percentiles).

Mode
The mode is the most frequently occurring data value. It may be similar to the mean
and median if data values near the center of the sorted array tend to occur often. But it
may also be quite different from the mean and median. A data set may have multiple
modes or no mode at all.

For example, consider these four students’ scores on five quizzes:

Lee’s scores: 60, 70, 70, 70, 80 Mean = 70, Median = 70, Mode = 70
Pat’s scores: 45, 45, 70, 90, 100 Mean = 70, Median = 70, Mode = 45
Sam’s scores: 50, 60, 70, 80, 90 Mean = 70, Median = 70, Mode = none
Xiao’s scores: 50, 50, 70, 90, 90 Mean = 70, Median = 70, Modes = 50, 90
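The four students' results can be reproduced with a small sketch. It assumes the convention used in the table above: when no value repeats, the data set is reported as having no mode (an empty list here). The helper name `describe` is illustrative only.

```python
from collections import Counter
from statistics import mean, median


def describe(scores):
    """Return (mean, median, modes); modes is empty when no value repeats."""
    counts = Counter(scores)
    top = max(counts.values())
    # Treat "every value occurs once" as having no mode
    modes = [] if top == 1 else sorted(v for v, c in counts.items() if c == top)
    return mean(scores), median(scores), modes


quizzes = {
    "Lee": [60, 70, 70, 70, 80],
    "Pat": [45, 45, 70, 90, 100],
    "Sam": [50, 60, 70, 80, 90],
    "Xiao": [50, 50, 70, 90, 90],
}
for name, scores in quizzes.items():
    print(name, describe(scores))
```

All four students share the same mean and median, so only the mode (and the spread) distinguishes them.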

If there are 19 data values, the median will have 10 values above it and 9 below it since n is
odd. f When n is odd, the median is the middle member of the sorted data set. In this case, the
median is x10 and there will be 9 below x10 (x1,..., x9) and 9 above x10 (x11,..., x19).
If there are 20 data values, the median will be halfway between two data values. t Median is
between two data values when n is even.
If the standard deviations of two samples are the same, so are their coefficients of variation. f
The means may differ, which affects the C.V.
A certain health maintenance organization (HMO) examined the number of office visits by its
members in the last year. This data set would probably be skewed to the left due to low
outliers. f Lower bound is zero, but high extremes are likely for sicker individuals.
A certain health maintenance organization (HMO) examined the number of office visits by its
members in the last year. For this data set, the mean is probably not a very good measure of a
"typical" person's office visits. t Lower bound is zero, but high extremes are likely for sicker
individuals.
In a left-skewed distribution, we expect that the median will be greater than the mean. t Mean
is likely to be pulled down by low extremes.
Referring to this box plot of ice cream fat content, the median seems more "typical" of fat
content than the midrange as a measure of center. t
Referring to this box plot of ice cream fat content, the mean would exceed the median. f
Referring to this box plot of ice cream fat content, the skewness would be negative. t
Referring to this graph of ice cream fat content, the second quartile is about 61. t

The range as a measure of variability is very sensitive to extreme data values. t Range depends
only on highest and lowest data values, so it is easily distorted.
In calculating the sample variance, the sum of the squared deviations around the mean is
divided by n - 1 to avoid underestimating the unknown population variance. t
Outliers are data values that fall beyond ±2 standard deviations from the mean. f Outliers are
more than 3 standard deviations from the mean.
GEOMETRIC MEAN
Kurtosis cannot be judged accurately by looking at a histogram. t Histograms are affected by
scaling, so peakedness is hard to judge.
A platykurtic distribution is more sharply peaked (i.e., thinner tails) than a normal
distribution. f Platykurtic is flatter than a normal distribution (thicker tails).
A leptokurtic distribution is more sharply peaked (i.e., thinner tails) than a normal
distribution. t Leptokurtic is more sharply peaked and has thinner tails.
A positive kurtosis coefficient in Excel indicates a leptokurtic condition in a distribution. t The
sign of Excel's kurtosis coefficient indicates the kurtosis direction relative to a normal
distribution.
GROWTH RATE (take the (n - 1)th root of the ending value divided by the beginning value, where n is the number of years, then subtract 1)
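The growth-rate note above describes the geometric average growth rate, GR = (x_n / x_1)^(1/(n - 1)) - 1. A minimal sketch follows; the function name and the revenue figures are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def average_growth_rate(first: float, last: float, n_periods: int) -> float:
    """Geometric average growth rate: (last / first) ** (1 / (n - 1)) - 1."""
    if n_periods < 2:
        raise ValueError("need at least two periods")
    if first <= 0 or last <= 0:
        raise ValueError("values must be positive")
    return (last / first) ** (1 / (n_periods - 1)) - 1


# Hypothetical data: revenue of 100 in year 1 and 121 in year 3 (n = 3 years)
rate = average_growth_rate(100, 121, 3)
print(f"average annual growth: {rate:.1%}")
```

Only the first and last values enter the formula. With n years of data there are n - 1 growth periods, which is why the root is taken to the power 1/(n - 1).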
A sample consists of the following data: 7, 11, 12, 18, 20, 22, 43. Using the "three standard
deviation" criterion, the last observation (X = 43) would be considered an outlier. f 43 is not
more than three standard deviations above the mean for this data set.
The coefficient of variation is: a unit-free statistic.
Which is not an advantage of the method of medians to find Q1 and Q3: Same method as
Excel's =QUARTILE.EXC function.
Which is a characteristic of the mean as a measure of center: It utilizes all the information in
a sample.
The position of the median is: (n + 1)/2 in any sample.
Which is a characteristic of the trimmed mean as a measure of center: It is similar to the
mean if there are offsetting high and low extremes.
Which is not a characteristic of the geometric mean as a measure of center: It is similar to
the mean if the data are skewed right.
Which is not a characteristic of the standard deviation: It is not applicable when data are
continuous.

MIDRANGE (largest value plus smallest value, divided by 2)


Trimmed Mean (used to exclude the lowest and highest values)
The trimmed mean is calculated like any other mean, except that the highest and
lowest k percent of the observations in the sorted data array are removed. The
trimmed mean mitigates the effects of extreme values on either end.

Example of a Trimmed Mean


For example, suppose a figure skating competition produces the following scores: 6.0, 8.1, 8.3, 9.1, and 9.9.
The mean of the scores equals:

● (6.0 + 8.1 + 8.3 + 9.1 + 9.9) / 5 = 8.28


To trim the mean by a total of 40%, we remove the lowest 20% and the highest 20% of the
values, dropping the scores 6.0 and 9.9.
Next, we calculate the mean of the remaining scores:

● (8.1 + 8.3 + 9.1) / 3 = 8.50


In other words, the 40% trimmed mean is 8.50 compared with the ordinary mean of 8.28. Trimming
reduces the influence of the outlier and here raises the reported average by
0.22 points.
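The figure skating example above (scores 6.0, 8.1, 8.3, 9.1, 9.9 with 20 percent trimmed from each end) can be reproduced with a short sketch. The function name `trimmed_mean` and the truncation of the trim count to a whole number of observations are assumptions made for illustration.

```python
def trimmed_mean(values, trim_each_end: float) -> float:
    """Mean after dropping the lowest and highest fraction of the sorted values."""
    data = sorted(values)
    k = int(len(data) * trim_each_end)  # observations removed from each end
    kept = data[k:len(data) - k]
    if not kept:
        raise ValueError("trim fraction removes all observations")
    return sum(kept) / len(kept)


scores = [6.0, 8.1, 8.3, 9.1, 9.9]
plain = sum(scores) / len(scores)
trimmed = trimmed_mean(scores, 0.20)
print(f"mean = {plain:.2f}, 20% trimmed mean = {trimmed:.2f}")
```

Trimming 20 percent from each end of five scores removes exactly one score per end, which is why 6.0 and 9.9 drop out.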

Which of the following is not a valid description of an outlier: A data value that lies below Q1
or above Q3
If samples are from a normal distribution with μ = 100 and σ = 10, we expect: about 68
percent of the data within 90 to 110.
In a sample of 10,000 observations from a normal population, how many would you expect
to lie beyond three standard deviations of the mean: About 27
The Excel formula for the standard deviation of a sample array named Data is:
=STDEV.S(Data).
Which is not true of an outlier: It is best discarded to get a better mean.
Estimating the mean from grouped data will tend to be most accurate when: observations
are distributed uniformly within classes.
Which is true of the kurtosis of a distribution: It is risky to assess kurtosis if the sample size is
less than 50.
Which is true of skewness: Skewness often is evidenced by one or more outliers.
Which is not true of the Empirical Rule: It applies to any distribution.
EXCEL FOR CHAPTER 4:

Excel function is designed to calculate z = (x - μ)/σ for a column of data: =STANDARDIZE


Excel function would be least useful to calculate the quartiles for a column of data:
=STANDARDIZE
If Excel's sample skewness coefficient is positive, we conclude that: we should consult a
table of percentiles that takes sample size into consideration.
If Excel's sample kurtosis coefficient is negative, we conclude that: we should consult a table
of percentiles that takes sample size into consideration.

4.3 MEASURES OF VARIABILITY


We can use a statistic such as the mean to describe the center of a distribution. But it is just as
important to describe variation around the center. Consider possible sample distributions of
study time spent by several college students taking an economics class:
Clustered at one point versus spread out

Range
The range is the difference between the largest and smallest observations:
Range = xmax - xmin

Variance and Standard Deviation


The mean is the balancing point of the distribution, so if we just sum the deviations of the
data values from the mean and take the average, we will always get zero, which does not give
us a useful measure of variability.
One way to avoid this is to square the deviations before we find the average.

The population variance (denoted σ², where σ is the lowercase Greek letter “sigma”) is
defined as the sum of squared deviations from the mean divided by the population size:
σ² = Σ(xi − μ)² / N

Example:
There are 45 students in a class. Five students were randomly selected from this class and
their heights were recorded:
131 148 139 142 152

Sample size n = 5

Sample mean x̄ = (131 + 148 + 139 + 142 + 152)/5 = 142.4
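The height example stops at the sample mean. As a hedged continuation, the sketch below carries the same five heights through to the sample variance and standard deviation, dividing by n - 1 because the five students are a sample from the class of 45. The variable names are illustrative.

```python
from statistics import mean, variance

heights = [131, 148, 139, 142, 152]
n = len(heights)

x_bar = mean(heights)                            # sample mean
squared_devs = [(x - x_bar) ** 2 for x in heights]
sample_var = sum(squared_devs) / (n - 1)         # divide by n - 1 for a sample
sample_sd = sample_var ** 0.5

print(f"mean = {x_bar:.1f}")
print(f"sum of squared deviations = {sum(squared_devs):.1f}")
print(f"sample variance = {sample_var:.1f}, sample std dev = {sample_sd:.2f}")

# Cross-check against the standard library's sample variance
assert abs(sample_var - variance(heights)) < 1e-9
```

Dividing by N instead of n - 1 would give the population variance, which is the right choice only when the data are the entire population.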

correct statement concerning the median: In a left-skewed distribution, we expect that the
median will exceed the mean.
With nominal data we can find the mode. t
Exam scores in a small class were 10, 10, 20, 20, 40, 60, 80, 80, 90, 100, 100. For this data
set, which statement is incorrect concerning measures of center: The geometric mean is
35.05.
Exam scores in a small class were 0, 50, 50, 70, 70, 80, 90, 90, 100, 100. For this data set,
which statement is incorrect concerning measures of center: The median is 70.
Exam scores in a random sample of students were 0, 50, 50, 70, 70, 80, 90, 90, 90, 100.
Which statement is incorrect: The midrange and mean are almost the same.
For U.S. adult males, the mean height is 178 cm with a standard deviation of 8 cm and the
mean weight is 84 kg with a standard deviation of 8 kg. Elmer is 170 cm tall and weighs 70
kg. It is most nearly correct to say that: Elmer's weight is more unusual than his height.
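Elmer's case can be checked by standardizing each measurement, z = (x - μ)/σ, and comparing the absolute values. This is a minimal sketch; the function name `z_score` is illustrative.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


z_height = z_score(170, 178, 8)   # Elmer's height against adult males
z_weight = z_score(70, 84, 8)     # Elmer's weight against adult males

print(f"height z = {z_height:.2f}, weight z = {z_weight:.2f}")
print("more unusual:", "weight" if abs(z_weight) > abs(z_height) else "height")
```

Because z-scores are unit-free, centimeters and kilograms can be compared directly.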
John scored 85 on Prof. Hardtack's exam (Q1 = 40 and Q3 = 60). Based on the fences,
which is correct: John is not an outlier.
John scored 35 on Prof. Johnson's exam (Q1 = 70 and Q3 = 80). Based on the fences,
which is correct: John is an outlier.
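The two fence questions above can be verified with a short sketch. It assumes the usual box plot definitions: inner fences at 1.5 × IQR beyond the quartiles and outer fences at 3 × IQR. The function names are illustrative.

```python
def fences(q1: float, q3: float) -> dict:
    """Inner (1.5 * IQR) and outer (3 * IQR) fences as (lower, upper) pairs."""
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3.0 * iqr, q3 + 3.0 * iqr),
    }


def classify(x: float, q1: float, q3: float) -> str:
    """Label x by where it falls relative to the fences."""
    f = fences(q1, q3)
    if x < f["outer"][0] or x > f["outer"][1]:
        return "extreme outlier"
    if x < f["inner"][0] or x > f["inner"][1]:
        return "outlier"
    return "not an outlier"


print(classify(85, 40, 60))   # Prof. Hardtack's exam
print(classify(35, 70, 80))   # Prof. Johnson's exam
print(fences(150, 250))
```

On Prof. Hardtack's exam the inner fences are 10 and 90, so a score of 85 is inside them. On Prof. Johnson's exam the lower outer fence is 40, so a score of 35 falls below even the outer fence.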
A population consists of the following data: 7, 11, 12, 18, 20, 22, 25. The population variance
is: 36.82
Consider the following data: 6, 7, 17, 51, 3, 17, 23, and 69. The range and the median are:
66 and 17
When a sample has an odd number of observations, the median is the: observation in the
center of the data array.
As a measure of variability, compared to the range, an advantage of the standard deviation
is: considering all data values.
Which two statistics offer robust measures of center when outliers are present: Median and
trimmed mean.

4.4 STANDARDIZED DATA


The standard deviation is an important measure of variability because of its many
roles in statistics. One of its main uses is to gauge the position of items within a data
array.

Chebyshev’s Theorem
The French mathematician Jules Bienaymé (1796–1878) and the Russian
mathematician Pafnuty Chebyshev (1821–1894) proved that, for any data set, no
matter how it is distributed, the percentage of observations that lie within k standard
deviations of the mean (i.e., within μ ± kσ) must be at least 100[1 − 1/k²]. Commonly
called Chebyshev’s Theorem, it says that for any population with mean μ and
standard deviation σ, at least 100[1 − 1/k²] percent of the observations lie within
μ ± kσ, for k > 1.

Chebyshev's Theorem says that at most 50 percent of the data lie within 2 standard
deviations of the mean. f At least 75 percent by Chebyshev.
Chebyshev's Theorem says that at least 95 percent of the data lie within 2 standard
deviations of the mean. f At least 75 percent by Chebyshev.
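To see where "at least 75 percent" comes from, the sketch below evaluates the Chebyshev bound 100[1 − 1/k²] and, for comparison, the percentage within k standard deviations under a normal curve (the basis of the Empirical Rule). The normal figures use the standard identity P(|Z| < k) = erf(k/√2); the function names are illustrative.

```python
import math


def chebyshev_min_pct(k: float) -> float:
    """Minimum percent within k standard deviations, for any distribution."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 100 * (1 - 1 / k ** 2)


def normal_pct_within(k: float) -> float:
    """Percent within k standard deviations if the data are normal."""
    return 100 * math.erf(k / math.sqrt(2))


for k in (2, 3):
    print(f"k = {k}: Chebyshev >= {chebyshev_min_pct(k):.1f}%, "
          f"normal = {normal_pct_within(k):.2f}%")

# Expected count beyond 3 standard deviations in 10,000 normal observations
beyond_3 = 10_000 * (1 - normal_pct_within(3) / 100)
print(f"expected beyond 3 sd: about {beyond_3:.0f}")
```

The Chebyshev figure is a guaranteed minimum for any distribution, so it is always smaller than the normal-curve percentage for the same k.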

The Empirical Rule assumes that the distribution of data follows a normal curve. t Unlike
Chebyshev, the E.R. assumes a normal population.
The Empirical Rule can be applied to any distribution, unlike Chebyshev's theorem. f The
E.R. assumes a normal population, while Chebyshev applies to any population.

When applying the Empirical Rule to a distribution of grades, if a student scored one
standard deviation below the mean, then she would be at the 25th percentile of the
distribution. f About 15.87 percent (not 25 percent) are less than one standard deviation
below the mean (in a normal distribution).
Chebyshev's Theorem: applies to all samples.

A sample of 50 breakfast customers of McDonald's showed the spending below. Which
statement is least likely to be correct: The mean is a reasonable measure of center.

VenalCo Market Research surveyed 50 individuals who recently purchased a certain CD,
revealing the age distribution shown below. Which statement is least defensible: The mean
age probably exceeds the median age.

Given a sample of three items (X = 4, 6, 5), which statement is incorrect: The geometric
mean is 5.2

A sample of customers from Barnsboro National Bank shows an average account balance of
$315 with a standard deviation of $87. A sample of customers from Wellington Savings and
Loan shows an average account balance of $8350 with a standard deviation of $1800.
Which statement about account balances is correct: Barnsboro Bank has more variation.

Histograms are best used to: assess the shape of the distribution.
The scatter plot shows the relationship between two variables.
If the mean and median of a population are the same, then its distribution is: symmetric
In the following data set {7, 5, 0, 2, 7, 15, 5, 2, 7, 18, 7, 3, 0}, the value 7 is: the mode
The median of 600, 800, 1000, 1200 is: 900

The 25th percentile for waiting time in a doctor's office is 19 minutes. The 75th percentile is
31 minutes. The interquartile range is: 12 minutes.
The 25th percentile for waiting time in a doctor's office is 19 minutes. The 75th percentile is
31 minutes. Which is incorrect regarding the fences: A waiting time of 45 minutes exceeds
the upper inner fence.

When using Chebyshev's Theorem, the minimum percentage of sample observations that
will fall within two standard deviations of the mean will be smaller than the percentage within
two standard deviations if a normal distribution is assumed (Empirical Rule).

Which distribution is least likely to be skewed to the right by high values: Cost of a plain
McDonald's hamburger in n U.S. cities

Based on daily measurements, Bob's weight has a mean of 200 pounds with a standard
deviation of 16 pounds, while Mary's weight has a mean of 125 pounds with a standard
deviation of 15 pounds. Who has the smaller relative variation: Bob

Frieda is 67 inches tall and weighs 135 pounds. Women her age have a mean height of 65
inches with a standard deviation of 2.5 inches and a mean weight of 125 pounds with a
standard deviation of 10 pounds. In relative terms, it is correct to say that: Frieda's weight is
more unusual than her height (z = 1.0 for weight versus z = 0.8 for height).
Which statement is false: The skewness coefficient is zero in a sample from any normal
distribution.

The values of xmin and xmax can be inferred accurately except in a: scatter plot.

Which of the following statements is likely to be true: The interquartile range offers a
measure of income inequality among California residents.
Which statistics offer robust (resistant to outliers) measures of center: Median, midhinge,
trimmed mean

The Empirical Rule says that: about 32 percent of the data are beyond one standard
deviation from the mean.

Three randomly chosen Seattle students were asked how many round trips they made to
Canada last year. Their replies were 3, 4, 5. The geometric mean is: 3.915

Three randomly chosen California students were asked how many times they drove to
Mexico last year. Their replies were 4, 5, 6. The geometric mean is: 4.93
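The two geometric mean answers above can be checked with a small sketch of G = (x1 · x2 · ... · xn)^(1/n). The function name is illustrative; the choice to return zero when any value is zero reflects the fact that the product is then zero.

```python
import math


def geometric_mean(values) -> float:
    """nth root of the product of n nonnegative values."""
    if any(v < 0 for v in values):
        raise ValueError("geometric mean is not defined for negative values")
    # A single zero makes the product, and therefore the result, zero
    return math.prod(values) ** (1 / len(values))


print(f"{geometric_mean([3, 4, 5]):.3f}")   # Seattle students
print(f"{geometric_mean([4, 5, 6]):.2f}")   # California students
```

For positive data that are not all equal, the geometric mean is always somewhat below the arithmetic mean (3.915 versus 4 for the Seattle replies).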

Three randomly chosen Colorado students were asked how many times they went rock
climbing last month. Their replies were 5, 6, 7. The standard deviation is: 1.00

Patient survival times after a certain type of surgery have a very right-skewed distribution
due to a few high outliers. Consequently, which statement is most likely to be correct: Mean
> Trimmed mean

So far this year, stock A has had a mean price of $6.58 per share with a standard deviation
of $1.88, while stock B has had a mean price of $10.57 per share with a standard deviation
of $3.02. Which stock is more volatile: They are the same.
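The stock comparison above relies on the coefficient of variation, CV = 100 × s / x̄, which expresses the standard deviation as a percent of the mean. A minimal sketch with an illustrative function name:

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Standard deviation expressed as a percent of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * sd / mean


cv_a = coefficient_of_variation(6.58, 1.88)    # stock A
cv_b = coefficient_of_variation(10.57, 3.02)   # stock B
print(f"stock A: {cv_a:.2f}%, stock B: {cv_b:.2f}%")
```

Stock B has the larger standard deviation in dollars, yet relative to its higher mean price the two stocks are equally volatile.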

Outliers are indicated using fences on a: box plot.


Which is not a measure of variability: trimmed mean
Twelve randomly chosen students were asked how many times they had missed class
during a certain semester, with this result: 3, 2, 1, 2, 1, 5, 9, 1, 2, 3, 3, 10. The geometric
mean is: 2.604

Twelve randomly chosen students were asked how many times they had missed class
during a certain semester, with this result: 3, 2, 1, 2, 1, 5, 9, 1, 2, 3, 3, 10. The median is: 2.5

One disadvantage of the range is that: only extreme values are used in its calculation.

Which is a characteristic of the standard deviation: It is measured in the same units as the
mean.

Twelve randomly chosen students were asked how many times they had missed class
during a certain semester, with this result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample,
the geometric mean is: 2.376
Twelve randomly chosen students were asked how many times they had missed class
during a certain semester, with this result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample,
the median is: 2.5

Twelve randomly chosen students were asked how many times they had missed class
during a certain semester, with this result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample,
which measure of center is least representative of the "typical" student? Midrange

Here are statistics on order sizes of Megalith Construction Supply's shipments of two kinds
of construction materials last year. Which order sizes have greater variability: Girders

The quartiles of a distribution are most clearly revealed in which display: Box plot
The sum of the deviations around the mean is: always zero

What does the graph below (profit/sales ratios for 25 Fortune 500 companies) reveal: That
the interquartile range is about 8

Find the sample correlation coefficient for the following data: .9556
X: 3 5 7 9 11 13 19 21

A reporter for the campus paper asked five randomly chosen students how many
occupants, including the driver, ride to school in their cars. The responses were 1, 1, 1, 1, 6.
The coefficient of variation is: 112%

A smooth distribution with one mode is negatively skewed (skewed to the left). The median
of the distribution is $65. Which of the following is a reasonable value for the distribution
mean: 54

In a positively skewed distribution, the percentage of observations that fall below the median
is: about 50 percent.

Which is a weakness of the mode: It is inappropriate for continuous data.


The mode is least appropriate for: continuous data.

Craig operates a part-time snow-plowing business using a 2002 GMC 2500 HD extended
cab short box truck. This box plot of Craig's MPG on 195 tanks of gas does not support
which statement: This is a very right-skewed distribution.

Estimate the mean exam score for the 50 students in Prof. Axolotl's class: 62.0
A survey of salary increases received during a recent year by 44 working MBA students is
shown. Find the approximate mean percent raise: 6.39

The following frequency distribution shows the amount earned yesterday by employees of a
large Las Vegas casino. Estimate the mean daily earnings: $117.13
The following table is the frequency distribution of parking fees for a day: The mean parking
fee is: $7.07

Find the standard deviation of this sample: 4, 7, 9, 12, 15: 4.278


The 25th percentile for waiting time in a doctor's office is 10 minutes. The 75th percentile is
30 minutes. Which is incorrect regarding the fences: A waiting time of 45 minutes would be
an outlier.

Five homes were recently sold in Oxnard Acres. Four of the homes sold for $400,000, while
the fifth home sold for $2.5 million. Which measure of central tendency best represents a
typical home price in Oxnard Acres: The median or mode.

In Tokyo, construction workers earn an average of ¥420,000 (yen) per month with a standard
deviation of ¥20,000, while in Hamburg, Germany, construction workers earn an average of
€3,200 (euros) per month with a standard deviation of €57. Who is earning relatively more, a
worker making ¥460,000 per month in Tokyo or one earning €3,300 per month in Hamburg:
The Tokyo worker is relatively better off.

Which statement is false? Explain: If μ = 52 and σ = 15, then X = 81 would be an outlier.

Which is not a measure of variability: Midhinge

If Q1 = 150 and Q3 = 250, the upper fences (inner and outer) are: 400 and 550

Which of the following statements is likely to apply to the incomes of 50 randomly chosen
taxpayers in California: The midhinge would be a robust measure of center.

A certain health maintenance organization (HMO) examined the number of office visits by
each of its members in the last year. For this data set, we would anticipate that the geometric
mean would be: zero because some HMO members would not have an office visit.

Three randomly chosen Colorado students were asked how many times they went rock
climbing last month. Their replies were 5, 6, 7. The coefficient of variation is: 16.7 percent.

The mean of a population is 50 and the median is 40. Which histogram is most likely for
samples from this population: Sample A (skew right)

CHAPTER 5

Events A and B are mutually exclusive when: +their joint probability is zero.

If two events are complementary, then we know that: +the sum of their probabilities is one

Regarding probability, which of the following is correct + When two events A and B are
independent, the joint probability of the events can be found by multiplying the probabilities
of the individual events.
Independent events A and B would be consistent with which of the following statements:+
P(A) = .4, P(B) = .5, P(A∩B) = .2.-For independence, the product P(A)P(B) must equal
P(A∩B).

Find the probability that either event A or B occurs if the chance of A occurring is .5, the
chance of B occurring is .3, and events A and B are independent:+.65-Given that the events
are independent, the product P(A)P(B) must equal P(A∩B). Thus, P(A or B) = P(A) + P(B) -
P(A∩B) = .50 + .30 - (.50)(.30) = .80 - .15 = .65 using the General Law of Addition.

Regarding the rules of probability, which of the following statements is correct +. The
probability of A and its complement will sum to one

Within a given population, 22 percent of the people are smokers, 57 percent of the people
are males, and 12 percent are males who smoke. If a person is chosen at random from the
population, what is the probability that the selected person is either a male or a smoker +.67
-Use the General Law of Addition P(A or B) = P(A) + P(B) - P(A∩B).

Information was collected on those who attended the opening of a new movie. The analysis
found that 56 percent of the moviegoers were female, 26 percent were under age 25, and 17
percent were females under the age of 25. Find the probability that a moviegoer is either
female or under age 25.+ .65-Use the General Law of Addition P(A or B) = P(A) + P(B) -
P(A∩B).
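The General Law of Addition and the independence check used in the answers above can be sketched as two small functions. The function names and the tolerance parameter are illustrative, not part of the source.

```python
def prob_union(p_a: float, p_b: float, p_a_and_b: float) -> float:
    """General Law of Addition: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


def are_independent(p_a: float, p_b: float, p_a_and_b: float,
                    tol: float = 1e-9) -> bool:
    """Independence holds when P(A)P(B) equals P(A and B)."""
    return abs(p_a * p_b - p_a_and_b) <= tol


print(f"{prob_union(0.57, 0.22, 0.12):.2f}")      # male or smoker
print(f"{prob_union(0.56, 0.26, 0.17):.2f}")      # female or under age 25
print(f"{prob_union(0.5, 0.3, 0.5 * 0.3):.2f}")   # independent A and B
print(are_independent(0.4, 0.5, 0.2))
```

Subtracting the joint probability prevents double counting the outcomes that belong to both events. For independent events the joint probability is simply the product of the two marginals.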

Given the contingency table shown here, Find P(V) +.20 -This is a marginal probability P(V)
= 40/200 = .20.

Given the contingency table shown here, find P(V | W). + .2375-This is a conditional
probability P(V|W) = 19/80.

Given the contingency table shown here, find the probability P(V´), that is, the probability of
the complement of V.+.80-Calculate the probability of the complement of V by subtracting
its marginal probability P(V) = 40/200 from 1 to get P(V´) = 1 - P(V) = 1 - 40/200 = .80.
Given the contingency table shown here, find P(W∩S) +.12-This is a joint probability P(W
and S) = 24/200.

Given the contingency table shown here, find P(A or M).+ .6250-Use the General Law of
Addition P(A or M) = 100/200 + 50/200 - 25/200.

Given the contingency table shown here, find P(A2).+ .1842-This is a marginal probability:
P(A2) = 86/467.

Given the contingency table shown here, find P(A3∩B2). +.0942-This is a joint probability:
P(A3 and B2) = 44/467.
Given the contingency table shown here, find P(A2 | B3).+ .1893-This is a conditional
probability: P(A2|B3) = 32/169.

Given the contingency table shown here, find P(A1 or B2).+ .3854-Apply the General Law of
Addition: P(A1 or B2) = 44/467 + 150/467 - 14/467.

Given the contingency table shown here, find P(A1∩A2).+.00-This is a joint probability. The
important thing here is that events A1 and A2 are mutually exclusive and so both events
cannot occur.

Given the contingency table shown here, find the probability that either event A2 or event B2
will occur.+ .4454-Use the General Law of Addition: P(A2 or B2) = 86/467 + 150/467 -
28/467.

Given the contingency table shown here, find P(B).+.45-This is a marginal probability: P(B)
= 90/200.
Given the contingency table shown here, find P(A or B).+.60-Use the General Law of
Addition: P(A or B) = 80/200 + 90/200 - 50/200.

Given the contingency table shown here, find P(B | A).+ .625-This is a conditional
probability: P(B|A) = 50/80.

Given the contingency table shown here, what is the probability that a randomly chosen
employee who is under age 25 would be absent 2 or more days.+.375-This is a conditional
probability: P(B'|A) = 30/80.
Oxnard Casualty wants to ensure that their e-mail server has 99.98 percent reliability. They
will use several independent servers in parallel, each of which is 95 percent reliable. What is
the smallest number of independent file servers that will accomplish the goal?+3-1 -
P(F1∩F2∩F3) = 1 - (.05)(.05)(.05) = 1 - .000125 = .999875, so 3 servers will do.
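
The "how many servers" reasoning can be automated by raising the server count until the parallel system meets the target. This is a rough Python sketch under the same assumption as the answer above (independent servers, system is up if at least one server is up); the function names are hypothetical.

```python
def parallel_availability(reliability, k):
    """Probability that at least one of k independent servers is working."""
    return 1 - (1 - reliability) ** k


def min_parallel_servers(reliability, target):
    """Smallest k with parallel_availability(reliability, k) >= target.

    Assumes 0 < reliability < 1 and target < 1, so the loop terminates.
    """
    k = 1
    while parallel_availability(reliability, k) < target:
        k += 1
    return k


# 95 percent servers, 99.98 percent target
print(min_parallel_servers(0.95, 0.9998))
# 98 percent servers, 99.999 percent target
print(min_parallel_servers(0.98, 0.99999))
```

With two servers the first system reaches only 1 - (.05)(.05) = .9975, which is why a third server is needed.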

Given the contingency table shown here, does the decision to retire appear independent of
the employee type? Survey question: Do you plan on retiring or keep working when you turn
65.+Yes -Does the product of the marginal probabilities equal their joint probability? This can
be checked by asking whether P(M and R) = P(M) P(R). In this example, because
(31/124)(52/124) = 13/124, we can see that M and R are independent events.
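
The independence check above compares a product of marginals with a joint probability, which is easy to do exactly with fractions. A small Python sketch follows; the counts 31, 52, 13, and 124 are taken from the explanation above, while the helper name `is_independent` is an assumption for illustration.

```python
from fractions import Fraction


def is_independent(n_joint, n_a, n_b, n_total):
    """Return True when P(A and B) equals P(A) * P(B) exactly."""
    p_a = Fraction(n_a, n_total)
    p_b = Fraction(n_b, n_total)
    p_joint = Fraction(n_joint, n_total)
    return p_joint == p_a * p_b


# Retirement survey: M total = 31, R total = 52, M and R = 13, n = 124
print(is_independent(13, 31, 52, 124))
```

Using `Fraction` avoids false negatives from floating-point rounding when the two sides are equal only as exact ratios.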

Given the contingency table shown here, find the probability that a randomly chosen
employee is a line worker who plans to retire at age 65. Survey question: Do you plan on
retiring or keep working when you turn 65?+.315-This is a joint probability: P(L and R) = 39/124.

Given the contingency table shown here, find P(R∩L). Survey question: Do you plan on
retiring or keep working when you turn 65?+.315-This is a joint probability: P(R∩L) = 39/124.

Given the contingency table shown here, find P(W | M). Survey question: Do you plan on
retiring or keep working when you turn 65.+.581-This is a conditional probability: P(W | M) =
18/31.

Given the contingency table shown here, find P(L or W). Survey question: Do you plan on
retiring or keep working when you turn 65.+.895-Use the General Law of Addition: P(L or W)
= 93/124 + 72/124 - 54/124.
Ramjac Company wants to set up k independent file servers, each capable of
running the company's intranet. Each server has average "uptime" of 98 percent.
What must k be to achieve 99.999 percent probability that the intranet will be
"up".+3-1 - P(F1∩F2∩F3) = 1 - (.02)(.02)(.02) = 1 - .000008 = .999992, so 3 servers
will do.

Given the contingency table shown here, what is the probability that a mother in the
study smoked during pregnancy.+.2591-This is a marginal probability: P(smoked) =
1122/4331.

Given the contingency table shown here, what is the probability that a mother
smoked during pregnancy if her education level was below high school.+.3804-This
is a conditional probability: P(smoked | below high school) = 393/1033.

Given the contingency table shown here, what is the probability that a mother smoked during
pregnancy and had a college degree.+.0111-This is a joint probability: P(smoked and college)
= 48/4331.

Given the contingency table shown here, what is the probability that a mother smoked during
pregnancy or that she graduated from college.+.3861-Use the General Law of Addition:
1122/4331 + 598/4331 - 48/4331.
Given the contingency table shown here, if a mother attended some college but did not have
a degree, what is the probability that she did not smoke during her pregnancy.+.8399-This is
a conditional probability: 635/756.

Given the contingency table shown here, find the probability that a mother with some college
smoked during pregnancy.+.1601-This is a conditional probability: 121/756.

Given the contingency table shown here, if a survey participant is selected at random, what
is the probability he/she is an undergrad who favors the change to a quarter
system.+.135-This is a joint probability: P(U and S) = 27/200.

Given the contingency table shown here, if a faculty member is chosen at random, what is
the probability he/she opposes the change to a quarter system?+.40-This is a conditional
probability: P(N | F) = 20/50 = .40.
Given the contingency table shown here, what is the probability that a participant selected at
random is a graduate student and opposes the change to a quarter system.+.135-This is a
joint probability: P(G and N) = 27/200.

Given the contingency table shown here, what is the probability that a student attends a
public school in a rural area?+.135-This is a joint probability.

Given the contingency table shown here, if a randomly chosen student attends a religious
school, what is the probability the location is rural.+.167-This is a conditional probability: P(R
| L) = 5/30 = .167.

Given the contingency table shown here, if a randomly chosen student attends school in an
inner-city location, what is the probability that it is a public school.+.500-This is a conditional
probability: P(P | I) = 35/70 = .500.

Given the contingency table shown here, Find P(E). +.300-This is a marginal probability:
P(E) = 300/1000 = .300
Given the contingency table shown here, find P(E | F).+.340-This is a conditional
probability: P(E | F) = 160/470 = .340.

Given the contingency table shown here, find P(A∩M).+.210-This is a joint probability:
P(A∩M) = 210/1000 = .210.

Given the contingency table shown here, find P(F or G).+.650-Use the General Law of
Addition: P(F or G) = 470/1000 + 340/1000 - 160/1000.

Given the contingency table shown here, Find the probability that a randomly chosen
individual is a female and economics major.+.1600-This is a joint probability: P(F and E) =
160/1000 = .16.

Debbie has two stocks, X and Y. Consider the following events: X = the event that the price
of stock X has increased Y = the event that the price of stock Y has increased The event
"the price of stock X has increased and the price of stock Y has not increased" may be
written as.+ X ∩ Y′- This is a joint probability that also entails the notation for an event's
complement.

If P(A | B) = 0.40 and P(B) = 0.30, find P(A∩B).+.120-Use the definition of conditional
probability: P(A∩B) = P(A | B) P(B) = (0.40)(0.30).

A company is producing two types of ski goggles. Thirty percent of the production is of type
A, and the rest is of type B. Five percent of all type A goggles are returned within 10 days
after the sale, whereas only two percent of type B are returned. If a pair of goggles is
returned within the first 10 days after the sale, the probability that the goggles returned are
of type B is.+.483-Review Bayes' Theorem, and perhaps make a table or tree:
P(B | returned) = (.70)(.02) / [(.30)(.05) + (.70)(.02)] = .014/.029.
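
The table-or-tree approach to Bayes' Theorem amounts to multiplying each prior by its conditional probability and then normalizing. Here is a minimal Python sketch of that calculation for the ski goggles question; the function name `bayes_posterior` and the dictionary layout are assumptions made for illustration.

```python
def bayes_posterior(priors, likelihoods, target):
    """Posterior P(target | evidence) from priors and P(evidence | category).

    priors and likelihoods are dicts keyed by the same category labels.
    """
    # Joint probability of each category together with the evidence
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    # Total probability of the evidence (sum over all branches of the tree)
    total = sum(joint.values())
    return joint[target] / total


# Ski goggles: 30% type A (5% returned), 70% type B (2% returned)
p_b_given_return = bayes_posterior(
    priors={"A": 0.30, "B": 0.70},
    likelihoods={"A": 0.05, "B": 0.02},
    target="B",
)
print(round(p_b_given_return, 3))
```

The same function applies to any two-stage problem of this kind, as long as the priors cover all categories and sum to one.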

Given the contingency table shown here, find the joint probability that a call sampled at
random out of this population is local and 2-5 minutes long.+.3125-This is a joint probability.
You must add the column frequencies.

Given the contingency table shown here, if a call is sampled at random, find the marginal
probability that the call is long distance.+.3750-You must first add the column frequencies.

If a call is sampled at random, the conditional probability that the call is not "6+" minutes long
given that it is a long distance call is: +.9667-Calculate the conditional probability 1 - 10/300
= .9667.

The following table gives a classification of the 10,000 shareholders of Oxnard Xylophone
Distributors, Inc. A few numbers are missing from the table. Given that a shareholder holding
500-999 shares is picked, there is a 0.625 probability that the shareholder will be a woman.
Consequently, what is the number of men holding 1000 or more shares?+500-Multiply by the
column total and subtract to fill in the remaining frequencies.
In any sample space P(A | B) and P(B | A):+Are equal only if P(A) = P(B)-Use the definition
of conditional probability.

If P(A∩B) = 0.50, can P(A) = 0.20:+ If P(A) = 0.20, then P(A∩B) cannot equal 0.50-The given
information contains a contradiction, because P(A∩B) cannot exceed P(A)

The following relationship always holds true for events A and B in a sample space.+P(A∩B)
= P(A | B) P(B).-Use the definition of conditional probability: P(A | B) = P(A∩B)/P(B).

The following probabilities are given about events A and B in a sample space: P(A) = 0.30,
P(B) = 0.40, P(A or B) = 0.60. We can say that:+P(A∩B) = 0.10-Apply the General Rule of
Addition: P(A or B) = P(A) + P(B) - P(A∩B).

If P(A) = 0.35, P(B) = 0.60, and P(A or B) = 0.70, then:+P(A∩B) = .25-Apply the General
Rule of Addition: P(A or B) = P(A) + P(B) - P(A∩B).

The following table shows the survival experience of 1,000 males who retire at age 65. Based
on these data, the probability that a 75-year-old male will survive to age 80 is:+.769-Given
that 775 have survived to 75, the probability is 596 divided by 775.

Given the contingency table shown here, Find P(G | M):+.3333-This is a conditional
probability: P(G | M) = 18/54.
Given the contingency table shown here, Find P(V or S).+.3825-Use the General Rule of
Addition: P(V or S) = 72/400 + 100/400 - 19/400.

Given the contingency table shown here, find P(V|S).+.1900

The manager of Ardmore Pharmacy knows that 25 percent of the customers entering the
store buy prescription drugs, 65 percent buy over-the-counter drugs, and 18 percent buy
both types of drugs. What is the probability that a randomly selected customer will buy at
least one of these two types of drugs:+.72-Use the General Rule of Addition: P(A or B) =
P(A) + P(B) - P(A∩B) = .25 + .65 - .18.

Two events are complementary (i.e., they are complements) if: +they are disjoint and their
probabilities sum to one-Review rules of probability.

Which statement is false:+If A and B are mutually exclusive events, then P(A or B) =
0-Review rules of probability and counting rules.

The number of unique orders in which five items (A, B, C, D, E) can be arranged
is:+120-Apply rules of counting: 5 × 4 × 3 × 2 × 1 = 120.

If four items are chosen at random without replacement from seven items, in how many
ways can the four items be arranged, treating each arrangement as a different event (i.e., if
order is important):+840-This is 7P4.
How many ways can we choose three items at random without replacement from five items
(A, B, C, D, E) if the order of the selected items is not important:+10-This is 5C3.
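
The three counting answers above (factorial, permutation, combination) can be verified directly with Python's standard `math` module. This short sketch assumes Python 3.8 or later, where `math.perm` and `math.comb` are available.

```python
import math

# Arrangements of five distinct items: 5! = 5 x 4 x 3 x 2 x 1
orders_of_five = math.factorial(5)

# Ordered selections of 4 items from 7 (order matters): 7P4 = 7!/(7-4)!
arrangements_7p4 = math.perm(7, 4)

# Unordered selections of 3 items from 5 (order ignored): 5C3 = 5!/(3! 2!)
choices_5c3 = math.comb(5, 3)

print(orders_of_five, arrangements_7p4, choices_5c3)
```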

The probability that event A occurs, given that event B has occurred, is an example of:
+conditional probability

If each of two independent file servers has a reliability of 93 percent and either alone can run
the website, then the overall website availability is:+.9951-Follow the textbook example of
reliability for independent events: 1 - (.07)(.07) = .9951.

In a certain city, 5 percent of all drivers have expired licenses, 10 percent have an unpaid
parking ticket, and 1 percent have both an expired license and an unpaid parking ticket. Are
these events independent:+No-For independence we would require P(A)P(B) = P(A∩B).

In a certain city, 5 percent of all drivers have expired licenses and 10 percent have an unpaid
parking ticket. If these events are independent, what is the probability that a driver has both
an expired license and an unpaid parking ticket:+.005-By independence P(A∩B) = P(A)P(B)
= (.05)(.10).

If two events are collectively exhaustive, what is the probability that one or the other will
occur:+1.00-Review definition of probabilities (collectively exhaustive covers all the
possibilities).

Which best exemplifies a subjective probability:+The probability that the summer Olympic
games will be held in Chicago in 2020-Subjective probabilities are not based on empirical
frequencies.

Which best exemplifies the classical definition of probability:+The probability that a pair of
dice will come up 7 when they are rolled-Classical probability is determined a priori by the
nature of the experiment.

Which best exemplifies the empirical definition of probability:+The probability that a
checked bag on Flight 1872 will weigh less than 30 pounds-Empirical probabilities are based
on observed frequencies.

From the following tree, find the probability that a randomly chosen person will get the flu
vaccine and will also get the flu.+.07-Multiply down the branch: .70 × .10 = .07.
From the following tree, find the probability that a randomly chosen person will not get a
vaccination and will not get the flu:+.18-Multiply down the branch: .30 × .60 = .18.

From the following tree, find the probability that a randomly chosen person will get the
flu:+.19-Multiply down two branches and add .07 to .12. That is (.70)(.10) + (.30)(.40).
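
The three tree questions above all follow the same pattern: multiply along a branch for a joint probability, then add branches for a total. A minimal Python sketch, using the branch probabilities quoted in the explanations (.70/.30 for vaccination, .10 and .40 for flu given each); the variable names are illustrative only.

```python
# First-stage branches: vaccinated or not
p_vaccine = 0.70
p_no_vaccine = 0.30

# Second-stage branches: flu conditional on the first stage
p_flu_given_vaccine = 0.10
p_flu_given_no_vaccine = 0.40

# Multiply down a branch to get a joint probability
p_vaccine_and_flu = p_vaccine * p_flu_given_vaccine
p_no_vaccine_and_no_flu = p_no_vaccine * (1 - p_flu_given_no_vaccine)

# Add the two flu branches to get the total probability of flu
p_flu = p_vaccine_and_flu + p_no_vaccine * p_flu_given_no_vaccine

print(round(p_vaccine_and_flu, 2), round(p_no_vaccine_and_no_flu, 2), round(p_flu, 2))
```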

At Joe's Restaurant, 80 percent of the diners are new customers (N), while 20 percent are
returning customers (R). Fifty percent of the new customers pay by credit card, compared
with 70 percent of the regular customers. If a customer pays by credit card, what is the
probability that the customer is a new customer:+.7407-Review Bayes' Theorem, and
perhaps make a table or tree.

At Dolon General Hospital, 30 percent of the patients have Medicare insurance (M) while 70
percent do not have Medicare insurance (M´). Twenty percent of the Medicare patients arrive
by ambulance, compared with 10 percent of the non-Medicare patients. If a patient arrives
by ambulance, what is the probability that the patient has Medicare insurance:+.4615-Review
Bayes' Theorem, and perhaps make a table or tree.
CHAPTER 6
DISCRETE PROBABILITY DISTRIBUTIONS

6.1 DISCRETE PROBABILITY DISTRIBUTIONS

A random variable is a function or rule that assigns a numerical value to each outcome in the
sample space of a stochastic (chance) experiment. t .Review definition of discrete random
variable.

A discrete random variable has a countable number of distinct values. t .Review definition of
discrete random variable. But "countable" does not necessarily imply that we know the upper
limit (e.g., number of computer virus attacks per year).

A/ Random Variables

The expected value of a discrete random variable E(X) is the sum of all X values weighted by
their respective probabilities. t .Review definition of expected value. The mean is a weighted
average.
A discrete distribution can be described by its probability distribution function (PDF) or by
its cumulative distribution function (CDF). t .Review definition of PDF (point
probability) and CDF (cumulative sum of probabilities).
A random variable may be discrete or continuous, but not both. t .Review definition of
discrete and continuous. Discrete implies enumerable.
To describe the number of blemishes per sheet of white bond paper, we would use a
discrete uniform distribution. f . Not all X values would be equally likely and we have
no upper limit (Poisson distribution would be better).
The outcomes for the sum of two dice can be described as a discrete uniform
distribution. f .The sum of two dice follows a triangular distribution, as was shown in
Chapter 5

B/ Probability Distributions
A discrete binomial distribution is skewed right when π > .50. f .Most outcomes would
be on the right, so a longer left tail exists.
When π = .70 the discrete binomial distribution is negatively skewed. t .Most
outcomes would be on the right, so a longer left tail exists.
The Poisson distribution describes the number of occurrences within a randomly chosen
unit of time or space.t .Poisson describes events per unit of time.
The Poisson distribution can be skewed either left or right, depending on λ. f .Poisson is
always right-skewed.

6.2 EXPECTED VALUE AND VARIANCE

Although the shape of the Poisson distribution is positively skewed, it becomes more
nearly symmetric as its mean becomes larger. t . Although always right-skewed, the
Poisson approaches a normal as the mean increases.
As a rule of thumb, the Poisson distribution can be used to approximate a binomial
distribution when n ≥ 20 and π ≤ .05. t .The Poisson is a better approximation to a
binomial when n is large and π is small.

Application: Life insurance: The hypergeometric distribution is skewed right. f .The
hypergeometric is skewed right if s/N < .50 (and conversely).
The hypergeometric distribution assumes that the probability of a success remains the
same from one trial to the next. f .In the hypergeometric, π is not constant, because we
are sampling without replacement.
The hypergeometric distribution is not applicable if sampling is done with replacement. t
The hypergeometric is used when there is no replacement in sampling from a finite
population.

Application: Raffle tickets: As a rule of thumb, the binomial distribution can be used to
approximate the hypergeometric distribution whenever the population is at least 20 times
as large as the sample. t .It is safe to use the binomial-hypergeometric approximation if
n/N < .05.
An example of a geometric random variable is the number of pine trees with pine beetle
infestation in a random sample of 15 pine trees in Colorado. f .This is a binomial
experiment, assuming π is constant.
Calculating the probability of getting three aces in a hand of five cards dealt from a deck
of 52 cards would require the use of a hypergeometric distribution. t .This is a
hypergeometric experiment (sampling without replacement).
The Poisson distribution is appropriate to describe the number of babies born in a small
hospital on a given day. t .Events per unit of time with no clear upper limit suggests a
Poisson event.
The gender (M, F) of a randomly chosen unborn child is a Bernoulli event. t . Bernoulli
events have two outcomes (0 or 1).

The Poisson distribution has only one parameter. t .The one Poisson parameter is its
mean.
The standard deviation of a Poisson random variable is the square root of its mean. t Yes,
because the mean and variance of a Poisson are the same.
Customer arrivals per unit of time would tend to follow a binomial distribution. f .This
would be a Poisson (arrivals per unit of time).
The two outcomes (success, failure) in the Bernoulli model are equally likely. f .No, the
probability of success need not be .50.
The expected value of a random variable is its mean. t .The mean is another name for
expected value.
A discrete probability distribution: assigns a probability to each possible value of the
random variable. A discrete PDF assigns a probability to each X value
The number of male babies in a sample of 10 randomly chosen babies is a: binomial
random variable. Constant probability of success in n trials.

A discrete random variable: can be treated as continuous when it has a large range of
values.For example, the Sunday vehicle count on a freeway is a discrete (but large)
number.
The time until failure of a vehicle headlamp is not a discrete random variable. Time is
continuous.

6.4 BINOMIAL DISTRIBUTION

The hourly earnings of a call center employee in Boston is not a discrete random
variable. Someone's earnings would be more like a continuous measurement.
which statement is incorrect: The hypergeometric distribution is symmetric. A
hypergeometric distribution is symmetric only if s/N = .50.
The random variable X is the number of shots it takes before you make the first free
throw in basketball. Assuming the probability of success (making a free throw) is
constant from trial to trial, what type of distribution does X follow: Geometric.
Bernoulli distribution
Geometric model describes the number of trials until the first success
which probability model is most nearly appropriate to describe the number of burned-out
fluorescent tubes in a classroom with 12 fluorescent tubes, assuming a constant
probability of a burned-out tube?Binomial. n = 12 Bernoulli trials with fixed probability
of success would be a binomial model.
which distribution is most nearly appropriate to describe the number of fatalities in
Texas in a given year due to poisonous snakebites: Poisson. Events per unit of time with
no clear upper limit would resemble a Poisson distribution

which model would you use to describe the probability that a call-center operator will
make the first sale on the third call, assuming a constant probability of making a
sale?Geometric. Geometric describes the number of trials to first success.
Binomial shape
In a randomly chosen week, which probability model would you use to describe the
number of accidents at the intersection of two streets?Poisson. Events per unit of time
with no clear upper limit would resemble a Poisson distribution.
which model best describes the number of nonworking web URLs ("This page cannot be
displayed") you encounter in a randomly chosen minute while surfing websites for
Florida vacation rental condos?Poisson. Events per unit of time with no clear upper limit
would resemble a Poisson distribution.
Using the binomial formula
which probability model would you use to describe the number of damaged printers in a
random sample of 4 printers taken from a shipment of 28 printers that contains 3
damaged printers?Hypergeometric. Sampling (n = 4 printers) without replacement with
known number of "successes" (s = 3 damaged printers) in the population (N = 28
printers).
which model best describes the number of incorrect fare quotations by a well-trained
airline ticket agent between 2 p.m. and 3 p.m. on a particular Thursday?Poisson. Events
per unit of time with no clear upper limit would resemble a Poisson distribution.
which model best describes the number of blemishes per sheet of white bond
paper?Poisson. Events per unit of area with no clear upper limit would resemble a
Poisson distribution.
Compound events
To ensure quality, customer calls for airline fare quotations are monitored at random. On
a particular Thursday afternoon, ticket agent Bob gives 40 fare quotations, of which 4
are incorrect. In a random sample of 8 of these customer calls, which model best
describes the number of incorrect quotations Bob will make?Hypergeometric. Sampling
(n = 8 calls selected) without replacement with known number of "successes" (s = 4
incorrect quotes) in the population (N = 40 quotes).

6.5 POISSON DISTRIBUTION

Poisson Processes
The number of people injured in rafting expeditions on the Colorado River on a
randomly chosen Thursday in August is best described by which model: Poisson.
Independent events per unit of time with no clear upper limit would be Poisson.
On a particular Thursday in August, 40 Grand Canyon tourists enter a drawing for a free
mule ride. Ten of the entrants are European tourists. Five entrants are selected at random
to get the free mule ride. which model best describes the number of European tourists in
the random sample: Hypergeometric. Sampling (n = 5 tourists selected) without
replacement with known number of "successes" (s = 10 Europeans) in the population (N
= 40).
Characteristics of Poisson distribution
which model best describes the number of births in a hospital until the first twins are
delivered?Geometric. Geometric distribution describes the number of trials until the first
success

On a randomly chosen Wednesday, which probability model would you use to describe
the number of convenience store robberies in Los Angeles?Poisson. Events per unit of
time with no clear upper limit would be Poisson.
which probability model would you use to describe the number of customers served at a
certain California Pizza Kitchen until the first customer orders split pea soup?Geometric.
Geometric distribution describes the number of trials until the first success.
Using the Poisson formula
which distribution has a mean of 5?Hypergeometric with N = 100, n = 10, s = 50.
Review model parameters. The hypergeometric mean is ns/N = (10)(50)/100 = 5.
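
The hypergeometric mean ns/N can be confirmed by summing x times P(x) over the whole distribution. This is a hedged Python sketch using the standard counting form of the hypergeometric PDF; the function names are hypothetical, and the parameters N = 100, n = 10, s = 50 come from the answer above.

```python
from math import comb


def hypergeom_pmf(x, N, n, s):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items that contains s successes."""
    return comb(s, x) * comb(N - s, n - x) / comb(N, n)


def hypergeom_mean(N, n, s):
    """Expected number of successes in the sample: n * s / N."""
    return n * s / N


N, n, s = 100, 10, 50
mean_by_formula = hypergeom_mean(N, n, s)
mean_by_summation = sum(x * hypergeom_pmf(x, N, n, s) for x in range(n + 1))

print(mean_by_formula, round(mean_by_summation, 6))
```

The same `hypergeom_pmf` function would apply to the printer, fare-quotation, and mule-ride sampling questions, which all sample without replacement from a finite population.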
Of the following, the one that most resembles a Poisson random variable is the number
of: annual power failures at your residence. Independent arrivals per unit of time with no
clear upper limit would be Poisson.
(+) A charity raffle prize is $1,000. The charity sells 4,000 raffle tickets. One winner will
be selected at random. At what ticket price would a ticket buyer expect to break even:
$0.25. Expected winning is (1/4000) × $1000 = $0.25.
A die is rolled. If it rolls to a 1, 2, or 3, you win $2. If it rolls to a 4, 5, or 6, you lose $1.
Calculate the expected winnings.$0.50, E(X) = (3/6) × $2 + (3/6) × (-$1) = $0.50.
(+) A fair die is rolled. If it comes up 1 or 2 you win $2. If it comes up 3, 4, 5, or 6, you
lose $1. Calculate the expected winnings. $0, E(X) = (2/6) × $2 + (4/6) × (-$1) =
$0.6667 - $0.6667 = 0.
(+) A carnival has a game of chance: a fair coin is tossed. If it lands heads you win
$1.00, and if it lands tails you lose $0.50. How much should a ticket to play this game
cost if the carnival wants to break even?$0.25. E(X) = (.5) × $1 + (.5) × (-$.50) = $0.50 -
$0.25 = $0.25.
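
Each of the break-even questions above is an expected value: a sum of payoffs weighted by their probabilities. The following Python sketch recomputes them with exact fractions; the helper `expected_value` and the (probability, payoff) pair layout are assumptions for illustration.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum of probability * payoff over (probability, payoff) pairs."""
    return sum(p * x for p, x in outcomes)


# Raffle: one $1,000 prize among 4,000 tickets
raffle = expected_value([(Fraction(1, 4000), 1000)])

# Die game 1: win $2 on 1, 2, 3; lose $1 on 4, 5, 6
die_game_1 = expected_value([(Fraction(3, 6), 2), (Fraction(3, 6), -1)])

# Die game 2: win $2 on 1 or 2; lose $1 on 3, 4, 5, 6
die_game_2 = expected_value([(Fraction(2, 6), 2), (Fraction(4, 6), -1)])

# Coin game: win $1.00 on heads; lose $0.50 on tails
coin_game = expected_value([(Fraction(1, 2), Fraction(1)), (Fraction(1, 2), Fraction(-1, 2))])

print(float(raffle), float(die_game_1), float(die_game_2), float(coin_game))
```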

EXAMPLE
1. Ephemeral Services Corporation (ESCO) knows that nine other companies besides
ESCO are bidding for a $900,000 government contract. Each company has an equal
chance of being awarded the contract. If ESCO has already spent $100,000 in
developing its bidding proposal, what is its expected net profit? -$10,000, E(X) = (1/10) ×
$900,000 = $90,000. With ten bidders in all, the expected award falls $10,000 short of
ESCO's $100,000 sunk cost.
2. The discrete random variable X is the number of students that show up for Professor
Smith's office hours on Monday afternoons. The table below shows the probability
distribution for X. What is the expected value E(X) for this distribution: 1.0, For each X,
multiply X times P(X) and sum the values.

X 0 1 2 3 Total

P(X) 0.40 0.30 0.20 0.10 1.00

3. The discrete random variable X is the number of students that show up for Professor
Smith's office hours on Monday afternoons. The table below shows the probability distribution
for X. What is the probability that at least 1 student comes to office hours on any given
Monday? .60, P(X≥1)=1-P(X=0)=1-.40=.60.
X 0 1 2 3 Total

P(X) 0.40 0.30 0.20 0.10 1.00

4. The discrete random variable X is the number of students that show up for Professor
Smith's office hours on Monday afternoons. The table below shows the probability distribution
for X. What is the probability that fewer than 2 students come to office hours on any given
Monday? 0.70, P(X<2)=P(X=0)+P(X=1)=.40+.30=.70.
X 0 1 2 3 Total

P(X) 0.40 0.30 0.20 0.10 1.00


5. The discrete random variable X is the number of passengers waiting at a bus stop. The
table below shows the probability distribution for X. what is the expected value E(X) for this
distribution? 1.3 . For each X, multiply X times P(X) and sum the values.
X 0 1 2 3 Total

P(X) 0.40 0.30 0.20 0.10 1.00

6. Given the following probability distribution, what is the expected value of the random
variable X? 205, For each X, multiply X times P(X) and sum the values.
X P(X)

100 .10

150 .20

200 .30

250 .30

300 .10

Sum 1.00

6.6 HYPERGEOMETRIC DISTRIBUTION

Characteristics of Hypergeometric Distribution


which of the following characterizes a Bernoulli process: A random experiment that has only
two outcomes.Review characteristics of the Bernoulli (binary) process.

The binomial distribution describes the number of: "successes" in n Bernoulli trials.Review
characteristics of the binomial distribution (repeated binary trials).

Which of the following is not a requirement of a binomial distribution: Equally likely
outcomes, Review characteristics of the binomial distribution (repeated binary trials).
The binomial distribution is symmetrical when: π = ½ and 1 - π = ½. The binomial
distribution is skewed unless π = .50.

The variance will reach a maximum in a binomial distribution when: π = ½ and 1 - π = ½.
Review formula for the binomial distribution standard deviation.

which distribution is most strongly right-skewed?Binomial with n = 50, π = .10, The binomial
is right-skewed when π < .50.

A random variable is binomially distributed with n = 16 and π = .40. The expected value and
standard deviation of the variables are: 6.40 and 1.96, Review formulas for the binomial
distribution mean and standard deviation.
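
The binomial mean and standard deviation formulas, nπ and the square root of nπ(1 - π), are simple enough to check numerically. A small Python sketch under those standard formulas; the function names are hypothetical.

```python
from math import sqrt


def binomial_mean(n, pi):
    """Expected number of successes in n trials: n * pi."""
    return n * pi


def binomial_sd(n, pi):
    """Standard deviation of the number of successes: sqrt(n * pi * (1 - pi))."""
    return sqrt(n * pi * (1 - pi))


# n = 16 trials with success probability .40
print(round(binomial_mean(16, 0.40), 2), round(binomial_sd(16, 0.40), 2))
```

Because π(1 - π) is largest at π = .50, the same formula also shows why the binomial variance peaks there for a fixed n.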

Using the Hypergeometric formula


The expected value (mean) of a binomial variable is 15. The number of trials is 20. The
probability of "success" is .75, Set E(X) = nπ = (20)π = 15 and solve for π.

If 90 percent of automobiles in Orange County have both headlights working, what is the
probability that in a sample of eight automobiles, at least seven will have both headlights
working? .8131, Use Appendix A with n = 8 and π = .90 to find P(X ≥ 7) or else use the
Excel function =1-BINOM.DIST(6,8,.90,1) = .8131.
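
For readers without Excel or Appendix A, the same binomial tail probability can be computed from the binomial formula. This is a minimal Python sketch; `binom_pmf` and `binom_cdf` are hypothetical helper names, with `binom_cdf(x, n, pi)` playing the role of the cumulative call BINOM.DIST(x, n, pi, 1).

```python
from math import comb


def binom_pmf(x, n, pi):
    """P(X = x) for a binomial with n trials and success probability pi."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)


def binom_cdf(x, n, pi):
    """P(X <= x): cumulative sum of the binomial PDF from 0 through x."""
    return sum(binom_pmf(k, n, pi) for k in range(x + 1))


# P(X >= 7) with n = 8, pi = .90 is the complement of P(X <= 6)
p_at_least_7 = 1 - binom_cdf(6, 8, 0.90)
print(round(p_at_least_7, 4))
```

"At least" and "more than" questions are handled by subtracting the appropriate cumulative probability from 1, as in the Excel formula above.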

EXAMPLE
1. In Quebec, 90 percent of the population subscribes to the Roman Catholic religion. In a
random sample of eight Quebecois, find the probability that the sample contains at least five
Roman Catholics. .9950, Use Appendix A with n = 8 and π = .90 to find P(X ≥ 5) or else use
the Excel function =1-BINOM.DIST(4,8,.90,1) = .99498.

2. Hardluck Harry has a batting average of .200 (i.e., a 20 percent chance of a hit each time
he's at bat). Scouts for a rival baseball club secretly observe Harry's performance 12 random
times at bat. What is the probability that Harry will get more than 2 hits? .4417, Use
Appendix A with n = 12 and π = .20 to find P(X ≥ 3) or else use the Excel function
=1-BINOM.DIST(2,12,.20,1) = .44165.

3. The probability that a visitor to an animal shelter will adopt a dog is .20. Out of nine visits,
what is the probability that at least one dog will be adopted? .8658, Use Appendix A with n
= 9 and π = .20 to find P(X ≥ 1) or else use the Excel function =1-BINOM.DIST(0,9,.20,1) =
.865782.
4. Based on experience, 60 percent of the women who request a pregnancy test at a certain
clinic are actually pregnant. In a random sample of 12 women, what is the probability that at
least 10 are pregnant? .0834, Use Appendix A with n = 12 and π = .60 to find P(X ≥ 10) or
else use the Excel function =1-BINOM.DIST(9,12,.60,1) = .08344.

5. If 5 percent of automobiles in Oakland County have one burned-out headlight, what is the
probability that, in a sample of 10 automobiles, none will have a burned-out headlight?
.5987, Use Appendix A with n = 10 and π = .05 to find P(X = 0) or else use the Excel function
=BINOM.DIST(0,10,.05,0) = .59874.

6.7 GEOMETRIC DISTRIBUTION


Jankord Jewelers permits the return of their diamond wedding rings, provided the return
occurs within two weeks. Typically, 10 percent are returned. If eight rings are sold today,
what is the probability that fewer than three will be returned: .9619

The probability that an Oxnard University student is carrying a backpack is .70. If 10 students
are observed at random, what is the probability that fewer than 7 will be carrying backpacks:
.3504

An insurance company is issuing 16 car insurance policies. Suppose the probability for a
claim during a year is 15 percent. If the binomial probability distribution is applicable, then
the probability that there will be at least two claims during the year is equal to: .7161, Use
Appendix A with n = 16 and π = .15 to find P(X ≥ 2) or else use the Excel function
=1-BINOM.DIST(1,16,.15,1) = .7161.

A random variable X is distributed binomially with n = 8 and π = 0.70. The standard
deviation of the variable X is approximately: 1.296, Use the formula for the binomial standard
deviation.
Suppose X is binomially distributed with n = 12 and π = .20. The probability that X will be
less than or equal to 3 is .7946, Use Appendix A with n = 12 and π = .20 to find P(X ≤ 3) or
else use the Excel function =BINOM.DIST(3,12,.2,1) = .79457.

6.8 TRANSFORMATIONS OF RANDOM VARIABLES


which Excel function would generate a single random X value for a binomial random variable
with parameters n = 16 and π = .25?=BINOM.INV(16,.25,RAND()), This is the Excel 2010
function for the inverse of a binomial.

Linear transformation
A network has three independent file servers, each with 90 percent reliability. The probability
that the network will be functioning correctly (at least one server is working) at a given time
is: 99.9 percent. Use Appendix A with n = 3 and π = .90.

which statement concerning the binomial distribution is correct? Its PDF covers all integer
values of X from 0 to n.Review definitions of the binomial distribution. The binomial domain
is X = 0, 1,…, n.

Historically, 2 percent of the stray dogs in Southfield are unlicensed. On a randomly chosen
day, the Southfield city animal control officer picks up seven stray dogs. What is the probability
that fewer than two will be unlicensed? .9921, Use Appendix A with n = 7 and π = .02.

The domain of X in a Poisson probability distribution is discrete and can include: any
nonnegative integer X value.For a Poisson random variable, X = 0, 1, 2, . . . (no upper limit)

On Saturday morning, calls arrive at TicketMaster at a rate of 108 calls per hour. What is the
probability of fewer than three calls in a randomly chosen minute? .7306. Use Appendix B
with λ = 108/60 = 1.8.
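
As a cross-check on Appendix B, the Poisson probabilities can be computed from the PDF. This is a small Python sketch under the assumption that the rate is first rescaled to the interval in question (here, calls per minute); the function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), the counterpart of Excel's =POISSON.DIST(x, lam, 1)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# 108 calls per hour is 1.8 calls per minute; fewer than three means X <= 2
lam = 108 / 60
print(round(poisson_cdf(2, lam), 4))  # 0.7306
```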

Sums of random variables


On average, a major earthquake (Richter scale 6.0 or above) occurs three times a decade in a
certain California county. Find the probability that at least one major earthquake will occur
within the next decade. .9502. Use Appendix B with λ = 3.0.

On average, an IRS auditor discovers 4.7 fraudulent income tax returns per day. On a
randomly chosen day, what is the probability that she discovers fewer than two? .0518. Use
Appendix B with λ = 4.7.

On a Sunday in April, dog bite victims arrive at Carver Memorial Hospital at a historical rate
of 0.6 victims per day. On a given Sunday in April, what is the probability that exactly two
dog bite victims will arrive? .0988. Use Appendix B with λ = 0.6.

If tubing averages 16 defects per 100 meters, what is the probability of finding exactly 2
defects in a randomly chosen 10-meter piece of tubing? .2584. Use Appendix B with λ =
16(10/100) = 1.6.

Cars are arriving at a toll booth at a rate of four per minute. What is the probability that
exactly eight cars will arrive in the next two minutes? 0.1396. Use Appendix B with λ = (4)(2) =
8.0, the mean number of arrivals in a two-minute interval.

Arrival of cars per minute at a toll booth may be characterized by the Poisson distribution if
the arrivals are independent. Events per unit of time with no clear upper limit.
Covariance
The coefficient of variation for a Poisson distribution with λ = 5 is 44.7 percent. Use the
coefficient of variation with standard deviation equal to the square root of the mean:
CV = (5)^(1/2)/5 = .447.

The coefficient of variation for a Poisson distribution with λ = 4 is 50.0 percent. The Poisson
standard deviation is the square root of the mean.

For which binomial distribution would a Poisson approximation be unacceptable? n = 200, π
= 0.10. We want n ≥ 20 and π ≤ .05.

For which binomial distribution would a Poisson approximation be acceptable? n = 40, π =
0.03. We want n ≥ 20 and π ≤ .05 for an acceptable Poisson approximation.

For which binomial distribution would a Poisson approximation not be acceptable? n = 35, π
= 0.07. We want n ≥ 20 and π ≤ .05 for an acceptable Poisson approximation.

PRACTICE
1. The true proportion of accounts receivable with some kind of error is .02 for Venal
Enterprises. If an auditor randomly samples 200 accounts receivable, what is the approximate
Poisson probability that fewer than two will contain errors? .0916. Since n ≥ 20 and π ≤ .05
we can set λ = nπ = (200)(.02) = 4.0 and use Appendix B to find P(X ≤ 1), or else use the
Excel cumulative distribution function =POISSON.DIST(1,4.0,1) = .09158.

2. The probability that a rental car will be stolen is 0.0004. If 3500 cars are rented, what is the
approximate Poisson probability that 2 or fewer will be stolen? .8335. Since n ≥ 20 and π ≤
.05 we can set λ = nπ = (3500)(.0004) = 1.4 and use Appendix B to find P(X ≤ 2), or else use
the Excel cumulative distribution function =POISSON.DIST(2,1.4,1) = .8335.

3. The probability that a customer will use a stolen credit card to make a purchase at a certain
Target store is 0.003. If 400 purchases are made in a given day, what is the approximate
Poisson probability that 4 or fewer will be with stolen cards? .9923. Since n ≥ 20 and π ≤ .05
we can set λ = nπ = (400)(.003) = 1.2 and use Appendix B, or else use the Excel cumulative
distribution function =POISSON.DIST(4,.003*400,1) = .9923.

4. The probability that a ticket holder will miss a flight is .005. If 180 passengers take the
flight, what is the approximate Poisson probability that at least 2 will miss the flight? .2275.
Since n ≥ 20 and π ≤ .05 we can set λ = nπ = (.005)(180) = 0.9 and use Appendix B to find
P(X ≥ 2), or else use the Excel cumulative distribution function =1-POISSON.DIST(1,0.9,1)
= .2275.

5. The probability that a certain daily flight's departure from ORD to LAX is delayed is .02.
Over six months, this flight departs 180 times. What is the approximate Poisson probability
that it will be delayed fewer than 2 times? .1257. Since n ≥ 20 and π ≤ .05 we can set λ = nπ
= (180)(.02) = 3.6 and use Appendix B to find P(X ≤ 1) or else use the Excel cumulative
distribution function =POISSON.DIST(1,3.6,1) = .12569.
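
To see how close the Poisson approximation in these practice problems is to the exact binomial, the two can be compared side by side. This is a hedged Python sketch using practice problem 1 (n = 200, π = .02); the function names and the rule-of-thumb helper are illustrative, and the exact binomial figure is not from the original notes.

```python
from math import comb, exp, factorial

def binom_cdf(x, n, p):
    """Exact binomial P(X <= x)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def poisson_cdf(x, lam):
    """Poisson P(X <= x) with mean lam."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

def poisson_approx_ok(n, p):
    """Rule of thumb quoted in the notes: n >= 20 and pi <= .05."""
    return n >= 20 and p <= 0.05

n, p = 200, 0.02
lam = n * p                            # lambda = n * pi = 4.0
approx = poisson_cdf(1, lam)           # Poisson approximation of P(X <= 1)
exact = binom_cdf(1, n, p)             # exact binomial P(X <= 1)
print(poisson_approx_ok(n, p))         # True
print(round(approx, 5))                # 0.09158
print(round(abs(approx - exact), 4))   # size of the approximation error
```
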
(+) If X is a discrete uniform random variable ranging from 0 to 12, find P(X ≥ 10). .2308. 3
out of 13 outcomes (don't forget to count 0 as an outcome).

(+) If X is a discrete uniform random variable ranging from one to eight, find P(X < 6). .6250.
We count five out of eight outcomes that meet this requirement.

(+) If X is a discrete uniform random variable ranging from one to eight, its mean is 4.5. The
mean is halfway between the lower and upper limits 1 and 8.

(+) If X is a discrete uniform random variable ranging from 12 to 24, its mean is 18.0. The
mean is halfway between the lower and upper limits 12 and 24.
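
The discrete uniform answers above are just counts of equally likely outcomes, which can be sketched in Python as follows. The helper names uniform_prob and uniform_mean are illustrative, not from the original notes.

```python
from fractions import Fraction

def uniform_prob(a, b, event):
    """P(event) for a discrete uniform X on the integers a, a+1, ..., b."""
    outcomes = range(a, b + 1)
    favorable = sum(1 for x in outcomes if event(x))
    return Fraction(favorable, len(outcomes))

def uniform_mean(a, b):
    """Mean of a discrete uniform X: halfway between the limits."""
    return (a + b) / 2

print(uniform_prob(0, 12, lambda x: x >= 10))  # 3/13
print(uniform_prob(1, 8, lambda x: x < 6))     # 5/8
print(uniform_mean(1, 8))                      # 4.5
print(uniform_mean(12, 24))                    # 18.0
```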

→ At Ersatz University, the graduating class of 480 includes 96 guest students from Latvia.
A sample of 10 students is selected at random to attend a dinner with the Board of Governors.
Use the binomial model to obtain the approximate hypergeometric probability that the sample
contains at least three Latvian students. .3222. Since n/N < .05 we can use Appendix A with n
= 10 and π = 96/480 = .20 to find P(X ≥ 3).

There are 90 passengers on a commuter flight from SFO to LAX, of whom 27 are traveling
on business. In a random sample of five passengers, use the binomial model to find the
approximate hypergeometric probability that there is at least one business passenger. .8319.
Here n/N = 5/90 = .056 is slightly above the usual .05 guideline, so the approximation is rougher
than usual. Use Appendix A with n = 5 and π = 27/90 = .30 to find P(X ≥ 1).

Use the binomial model to find the approximate hypergeometric probability of at least two
damaged flash drives in a sample of five taken from a shipment of 150 that contains 30
damaged flash drives. 0.2627. Since n/N < .05 we can use Appendix A with n = 5 and π =
30/150 = .20 to find P(X ≥ 2).

On a particular day, 112 of 280 passengers on a particular DTW-LAX flight used the e-ticket
check-in kiosk to obtain boarding passes. In a random sample of eight passengers, use the
binomial model to find the approximate hypergeometric probability that four will have used
the e-ticket check-in kiosk to obtain boarding passes. .2322. Since n/N < .05 we can use
Appendix A with n = 8 and π = 112/280 = .40 to find P(X = 4).

A clinic employs nine physicians. Five of the physicians are female. Four patients arrive at
once. Assuming the doctors are assigned randomly to patients, what is the probability that all
of the assigned physicians are female? .0397. You can't use the binomial approximation,
because we have sampled more than 5% of the population (n/N = 4/9 = .444) so we use the
hypergeometric formula with x = 4, n = 4, s = 5, N = 9 or use the Excel function
=HYPGEOM.DIST(4,4,5,9,0) = .03968.
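
The hypergeometric formula mentioned above can be written out with combinations. This is a minimal Python sketch; the function name is illustrative, and the argument order is chosen to mirror the Excel call quoted in the notes.

```python
from math import comb

def hypergeom_pmf(x, n, s, N):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items that contains s successes."""
    return comb(s, x) * comb(N - s, n - x) / comb(N, n)

# Four patients, nine physicians, five of them female: all four assigned are female
prob = hypergeom_pmf(4, 4, 5, 9)
print(round(prob, 4))  # 0.0397  (exactly 5/126)
```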

There is a .02 probability that a customer's Visa charge will be rejected at a certain Target
store because the transaction exceeds the customer's credit limit. What is the probability that
the first such rejection occurs on the third Visa transaction? .0192. Use the formula for the
geometric PDF (not the CDF) with π = .02 to find P(X = 3) = .02(1 - .02)^(3-1) = .02(.98)^2 =
.02(.9604) = .019208.

Ten percent of the corporate managers at Axolotl Industries majored in humanities. If you
start interviewing Axolotl managers, what is the probability that the first humanities major is
the fifth manager that you interview? .0656. Use the formula for the geometric PDF (not the
CDF) with π = .10 to find P(X = 5) = .10(1 - .10)^(5-1) = .10(.90)^4 = .10(.6561) = .06561.

Ten percent of the corporate managers at Axolotl Industries majored in humanities. What is
the expected number of managers to be interviewed until finding the first one with a
humanities major? 10. The mean of the geometric distribution is 1/π = 1/(.10) = 10.

When you send out a resume, the probability of being called for an interview is .20. What is
the probability that the first interview occurs on the fourth resume that you send out? .1024.
Use the formula for the geometric PDF (not the CDF) with π = .20 to find P(X = 4) = .20(1 -
.20)^(4-1) = .20(.80)^3 = .20(.512) = .1024.

When you send out a resume, the probability of being called for an interview is .20. What is
the expected number of resumes you send out until you get the first interview? 5. The
mean of the geometric distribution is 1/π = 1/(.20) = 5.

When you send out a resume, the probability of being called for an interview is .20. What is
the probability that you get your first interview within the first five resumes that you send
out? .6723. Use the formula for the geometric CDF (not the PDF) with π = .20 to find P(X ≤
5) = 1 - (1 - .20)^5 = 1 - (.80)^5 = 1 - .32768 = .67232.

There is a .02 probability that a customer's Visa charge will be rejected at a certain Target
store because the transaction exceeds the customer's credit limit. What is the probability that
the first such rejection occurs within the first 20 Visa transactions? .3324. Use the formula
for the geometric CDF (not the PDF) with π = .02 to find P(X ≤ 20) = 1 - (1 - .02)^20 = 1 -
(.98)^20 = 1 - .6676 = .3324.
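
The geometric PDF, CDF, and mean used in these questions are short enough to sketch in Python. The function names are illustrative; the inputs are the Visa rejection example with π = .02.

```python
def geometric_pmf(x, p):
    """P(X = x): the first success occurs on trial x."""
    return p * (1 - p) ** (x - 1)

def geometric_cdf(x, p):
    """P(X <= x): the first success occurs within the first x trials."""
    return 1 - (1 - p) ** x

def geometric_mean(p):
    """Expected number of trials until the first success."""
    return 1 / p

print(round(geometric_pmf(3, 0.02), 6))   # 0.019208
print(round(geometric_cdf(20, 0.02), 4))  # 0.3324
print(geometric_mean(0.02))               # 50.0
```

The PDF answers "exactly on trial x" while the CDF answers "on or before trial x", which is the distinction the notes stress.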

There is a .02 probability that a customer's Visa charge will be rejected at a certain Target
store because the transaction exceeds the customer's credit limit. What is the expected number
of Visa transactions until the first one is rejected? 50. The mean of the geometric distribution
is 1/π = 1/(.02) = 50.

The geometric distribution best describes the number of trials until the first success. Review
the definition of geometric distribution.

The CDF for the geometric distribution shows the probability that the first success will occur
within a given number of trials. Review the definition of geometric distribution.

If the probability of success is .25, what is the probability of obtaining the first success
within the first three trials? .5781. Use the formula for the geometric CDF (not the PDF)
with π = .25 to find P(X ≤ 3) = 1 - (1 - .25)^3 = 1 - (.75)^3 = 1 - .421875 = .578125.

If the probability of success is .30, what is the probability of obtaining the first success
within the first five trials? .8319. Use the formula for the geometric CDF (not the PDF) with
π = .30 to find P(X ≤ 5) = 1 - (1 - .30)^5 = 1 - (.70)^5 = 1 - .16807 = .83193.

A project has three independent stages that must be completed in sequence. The time to
complete each stage is a random variable. The expected times to complete the stages are μ1 =
23, μ2 = 11, μ3 = 17. The expected project completion time is 51. The means can always be
summed (23 + 11 + 17 = 51); independence is needed only when summing variances.

A project has 3 independent stages that must be completed in sequence. The time to complete
each stage is a random variable. The standard deviations of the completion times for the
stages are σ1 = 5, σ2 = 4, σ3 = 6. The standard deviation of the overall project completion
time is: 8.77. The variances can be summed because the stages are independent (Rule 4). You
have to square the standard deviations to get the variances σ1^2 = 25, σ2^2 = 16, σ3^2 = 36,
then add them and take the square root of the sum: (25 + 16 + 36)^(1/2) = (77)^(1/2) = 8.77.
Be careful: the standard deviations cannot be summed.

A stock portfolio consists of two stocks X and Y. Their daily closing prices are independent
random variables with standard deviations σX = 2.51 and σY = 5.22. What is the standard
deviation of the sum of the closing prices of these two stocks? 5.79. The variances can be
summed because the prices are independent (Rule 4). You have to square the standard
deviations to get the variances σX^2 = 6.3001 and σY^2 = 27.2484, then add them and take the
square root of the sum. Be careful: the standard deviations cannot be summed.
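
The "square, add, take the square root" recipe for independent random variables can be sketched in Python as follows. The function name is illustrative, and the sketch assumes independence, as in the project-stage and two-stock questions above.

```python
from math import sqrt

def sd_of_independent_sum(*sds):
    """Standard deviation of a sum of independent random variables:
    square each standard deviation, add the variances, take the root."""
    return sqrt(sum(sd ** 2 for sd in sds))

print(round(sd_of_independent_sum(5, 4, 6), 2))      # 8.77 (three project stages)
print(round(sd_of_independent_sum(2.51, 5.22), 2))   # 5.79 (two independent stocks)
print(5 + 4 + 6)                                     # 15, the wrong answer from adding SDs
```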

A stock portfolio consists of two stocks X and Y. Their daily closing prices are correlated
random variables with variances σX^2 = 3.51 and σY^2 = 5.22, and covariance σXY = -1.55. What
is the standard deviation of the sum of the closing prices of these two stocks? 2.37. Use the
formula for the variance of a sum of correlated (nonindependent) random variables. We sum the
variances and twice the covariance, and then take the square root: σX+Y = [σX^2 + σY^2 +
2σXY]^(1/2) = [3.51 + 5.22 + 2(-1.55)]^(1/2) = [5.63]^(1/2) = 2.3728.

The expected value of a random variable X is 140 and the standard deviation is 14. The
standard deviation of the random variable Y = 3X - 10 is 42. Use the rule for functions of a
random variable (Rule 2) to get σY = 3σX = (3)(14) = 42. The constant -10 merely shifts the
distribution and has no effect on the standard deviation. The mean of Y is not requested.

The expected value of a random variable X is 10 and the standard deviation is 2. The standard
deviation of the random variable Y = 2X - 10 is 4. Use the rule for functions of a random
variable (Rule 2) to get σY = 2σX = (2)(2) = 4. The constant -10 merely shifts the distribution
and has no effect on the standard deviation. The mean of Y is not requested.
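
The rule for a linear transformation Y = aX + b can be sketched in Python. The function name is illustrative; the mean is included only for completeness, using the standard result that the mean of aX + b is a times the mean of X plus b.

```python
def linear_transform(mean_x, sd_x, a, b):
    """Mean and standard deviation of Y = a*X + b.
    The constant b shifts the mean but does not affect the spread."""
    return a * mean_x + b, abs(a) * sd_x

mean_y, sd_y = linear_transform(140, 14, 3, -10)
print(mean_y, sd_y)  # 410 42

mean_y2, sd_y2 = linear_transform(10, 2, 2, -10)
print(mean_y2, sd_y2)  # 10 4
```
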

CHAPTER 8

Sampling Distributions and Estimation

8.1 SAMPLING AND ESTIMATION


A sample statistic is a random variable whose value depends on which population items
happen to be included in the random sample. Some samples may represent the population well,
while other samples could differ greatly from the population (particularly if the sample size is
small). To illustrate sampling variation, let’s draw some random samples from a large
population of GMAT scores for MBA applicants.
• Sampling variation (uncontrollable).
• Population variation (uncontrollable).
• Sample size (controllable).
• Desired confidence in the estimate (controllable).

(-) The expected value of an unbiased estimator is equal to the parameter whose value is
being estimated. t. An unbiased estimator's expected value is the true parameter value.
All estimators are biased since sampling errors always exist to some extent. f. Some
estimators are systematically biased, regardless of sampling error.
An estimator must be unbiased if you are to use it for statistical analysis. f. An estimator can
be useful as long as its bias is known.
The efficiency of an estimator depends on the variance of the estimator's sampling
distribution. t. Efficiency is measured by the variance of the estimator's sampling distribution.
In comparing estimators, the more efficient estimator will have a smaller standard error. t.
Efficiency is measured by the variance of the estimator's sampling distribution.
A 90 percent confidence interval will be wider than a 95 percent confidence interval, ceteris
paribus. f. We can make a more precise statement about the true parameter if we are willing
to sacrifice some confidence. For example, z.025 = 1.960 (for 95 percent confidence) gives a
wider interval than z.05 = 1.645 (for 90 percent confidence). The same comparison holds for
the Student's t distribution.
In constructing a confidence interval for the mean, the z distribution provides a result nearly
identical to the t distribution when n is large. t. Student's t approaches z as sample size
increases.
The Central Limit Theorem says that, if n exceeds 30, the population will be normal. f. The
population cannot be changed.
The Central Limit Theorem says that a histogram of the sample means will have a bell shape,
even if the population is skewed and the sample is small. f. A large sample size may be
required if the population is skewed.

Estimators
An estimator is a statistic derived from a sample to infer the value of a population
parameter. An estimate is the value of the estimator in a particular sample.

(-) The confidence level refers to the procedure used to construct the confidence interval,
rather than to the particular confidence interval we have constructed. t. A particular interval
either does or does not contain the true parameter.
The Central Limit Theorem guarantees an approximately normal sampling distribution when
n is sufficiently large. t. Yes, although a large sample size may be required if the population is
skewed.
A sample of size 5 shows a mean of 45.2 and a sample standard deviation of 6.4. The
standard error of the sample mean is approximately 2.86. t. The standard error is the standard
deviation divided by the square root of the sample size.
As n increases, the width of the confidence interval will decrease, ceteris paribus. t. The
standard error is the standard deviation divided by the square root of the sample size.
As n increases, the standard error decreases. t. The standard error is the standard deviation
divided by the square root of the sample size.
A higher confidence level leads to a narrower confidence interval, ceteris paribus. f. Higher
confidence requires more uncertainty (a wider interval). For example, z.025 = 1.960 (for 95
percent confidence) gives a wider interval than z.05 = 1.645 (for 90 percent confidence). The
proffered statement would also hold true for the Student's t distribution.
When the sample standard deviation is used to construct a confidence interval for the mean,
we would use the Student's t distribution instead of the normal distribution. t. We should use t
when the population variance is unknown.
As long as the sample is more than one item, the standard error of the sample mean will be
smaller than the standard deviation of the population. t. The standard error is the standard
deviation divided by the square root of the sample size.
For a sample size of 20, a 95 percent confidence interval using the t distribution would be
wider than one constructed using the z distribution. t. Student's t is always larger than z for
the same level of confidence.
In constructing a confidence interval for a mean, the width of the interval is dependent on the
sample size, the confidence level, and the population standard deviation. t. The confidence
interval depends on all of these.
In constructing confidence intervals, it is conservative to use the z distribution when n ≥ 30. f.
While t and z may be similar for large samples, it is more conservative to use t.

Sampling Error
Random samples vary, so an estimator is a random variable. The sampling error is the
difference between an estimate and the corresponding population parameter.

(-) The Central Limit Theorem can be applied to the sample proportion. t. We are sampling a
Bernoulli population, but the CLT still applies.
The distribution of the sample proportion p = x/n is normal when n ≥ 30. f. We want at least
10 successes and 10 failures to assume that p is normally distributed.
The standard deviation of the sample proportion p = x/n increases as n increases. f. The
proffered statement is backwards because n is in the denominator of [p(1 - p)/n]^(1/2).
A 95 percent confidence interval constructed around p will be wider than a 90 percent
confidence interval. t. Higher confidence requires more uncertainty (a wider interval).
The sample proportion is always the midpoint of a confidence interval for the population
proportion. t. The interval is p ± z[p(1 - p)/n]^(1/2).
The standard error of the sample proportion is largest when π = .50. t. The value of [π(1 -
π)/n]^(1/2) is smaller for any value of π other than .50.
The standard error of the sample proportion does not depend on the confidence level. t. The
standard error of p is [π(1 - π)/n]^(1/2).
To narrow the confidence interval for π, we can either increase n or decrease the level of
confidence. t. The interval is p ± z[p(1 - p)/n]^(1/2).
Ceteris paribus, the narrowest confidence interval for π is achieved when p = .50. f. The value
of [p(1 - p)/n]^(1/2) is largest at p = .50, so that is where the interval is widest.
The statistic p = x/n may be assumed normally distributed when np ≥ 10 and n(1 - p) ≥ 10. t.
We want at least 10 successes and 10 failures in the sample to assume normality of p.
The Student's t distribution is always symmetric and bell-shaped, but its tails lie above the
normal. t. Student's t resembles a normal, but its PDF is above the normal PDF in the tails.
Example:
Sampling error exists because different samples will yield different values for x̄, depending
on which population items happen to be included in the sample. For example, we happen to
know the true mean GMAT score, so we could calculate the sampling error for any given
sample on the previous page.

(-) The confidence interval half-width when π = .50 is called the margin of error. t. Pollsters
use this definition.
Based on the Rule of Three, if no events occur in n independent trials we can set the upper 95
percent confidence bound at 3/n. t. We need a special rule because when p = 0 we can't apply
the usual formula p ± z[p(1 - p)/n]^(1/2).
The sample standard deviation s is halfway between the lower and upper confidence limits
for the population σ (i.e., the confidence interval is symmetric around s). f. The chi-square
distribution is not symmetric.
In a sample size calculation, if the confidence level decreases, the size of the sample needed
will increase. f. Reduced confidence allows a smaller sample.
To calculate the sample size needed for a survey to estimate a proportion, the population
standard deviation σ must be known. f. For a proportion, the sample size formula requires π
not σ.
Assuming that π = .50 is a quick and conservative approach to use in a sample size
calculation for a proportion. t. Assuming that π = .50 is quick and safe (but may give a larger
sample than is needed).
To estimate the required sample size for a proportion, one method is to take a small pilot
sample to estimate π and then apply the sample size formula. t. This is a common method, but
assuming that π = .50 is quicker and safer.
To estimate π, you typically need a sample size equal to at least 5 percent of your population.
f. The sample size n bears no necessary relation to N.
To estimate a proportion with a 4 percent margin of error and a 95 percent confidence level,
the required sample size is over 800. f. n = (z/E)^2(π)(1 - π) = (1.96/.04)^2(.50)(1 - .50) =
600.25, which rounds up to 601.

Properties of Estimators
Bias The bias is the difference between the expected value (i.e., the average value) of the
estimator and the true parameter.

Efficiency Efficiency refers to the variance of the estimator’s sampling distribution. Smaller
variance means a more efficient estimator.
Consistency A consistent estimator converges toward the parameter being estimated as the
sample size increases.

(-) Approximately 95 percent of the population X values will lie within the 95 percent
confidence interval for the mean. f. The confidence interval is for the true mean, not for
individual X values.
A 99 percent confidence interval has more confidence but less precision than a 95 percent
confidence interval. t. The higher confidence level widens the interval so it is less precise.
Sampling variation is not controllable by the statistician. t. Sampling variation is inevitable.
The sample mean is not a random variable when the population parameters are known. f. The
sample mean is a random variable regardless of what we know about the population.
The finite population correction factor (FPCF) can be ignored if n = 7 and N = 700. t. The
FPCF has a negligible effect when the sample is less than 5 percent of the population.
In constructing a confidence interval, the finite population correction factor (FPCF) can be
ignored if samples of 12 items are drawn from a population of 300 items. t. The FPCF has a
negligible effect when the sample is less than 5 percent of the population.
The finite population correction factor (FPCF) can be ignored when the sample size is large
relative to the population size. f. The FPCF has a negligible effect when n is small relative to
N.

8.2 CENTRAL LIMIT THEOREM


The sampling distribution of an estimator is the probability distribution of all possible
values the statistic may assume when a random sample of size n is taken. We can use one of
the most fundamental laws of statistics, the Central Limit Theorem.

(-) A sampling distribution describes the distribution of: a statistic. A statistic has a sampling
distribution.
As the sample size increases, the standard error of the mean: decreases. The standard error of
the mean is σ/(n)^(1/2).
Which statement is most nearly correct, other things being equal? Quadrupling the sample
size roughly halves the standard error of the mean. The standard error of the mean is σ/(n)^(1/2)
so replacing n by 4n would cut the SEM in half.
The width of a confidence interval for μ is not affected by: the sample mean. The mean is not
used in calculating the width of the confidence interval zσ/(n)^(1/2).
The Central Limit Theorem (CLT) implies that: the distribution of the mean is approximately
normal for large n. The sampling distribution of the mean is asymptotically normal for any
population.
The owner of Limp Pines Resort wanted to know the average age of its clients. A random
sample of 25 tourists is taken. It shows a mean age of 46 years with a standard deviation of 5
years. The width of a 98 percent CI for the true mean client age is approximately: ± 2.492
years. The width is ts/(n)^(1/2) = (2.492)(5)/(25)^(1/2) = 2.492, using d.f. = 24.
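
The standard error formula used throughout this section can be sketched in Python. The function name is illustrative, and the value 10 used in the quadrupling check is a hypothetical standard deviation chosen only to show the ratio.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the sample mean: standard deviation over root n."""
    return sd / sqrt(n)

# Sample of size 5 with s = 6.4 (from the true/false item earlier in the chapter)
print(round(standard_error(6.4, 5), 2))  # 2.86

# Quadrupling n halves the standard error (hypothetical sd of 10)
ratio = standard_error(10, 4 * 25) / standard_error(10, 25)
print(ratio)  # 0.5
```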

8.3 SAMPLE SIZE AND STANDARD ERROR


Standard Error Declines as n Increases
Even if the population standard deviation σ is large, the sample means will fall within a
narrow interval as long as n is large. The key is the standard error of the mean: σx̄ = σ/√n. The
standard error decreases as n increases.

(-) In constructing a confidence interval for a mean with unknown variance with a sample of
25 items, Bob used z instead of t. "Well, at least my interval will be wider than necessary, so
it was a conservative error," said he. Is Bob's statement correct? No. z is always smaller than t
(ceteris paribus) so the interval would be narrower than is justified.
A random sample of 16 ATM transactions at the Last National Bank of Flat Rock revealed a
mean transaction time of 2.8 minutes with a standard deviation of 1.2 minutes. The width (in
minutes) of the 95 percent confidence interval for the true mean transaction time is: ± 0.639.
The width is ts/(n)^(1/2) = (2.131)(1.2)/(16)^(1/2) = 0.639, using d.f. = 15.
We could narrow a 95 percent confidence interval by: using a larger sample. A larger sample
would narrow the interval width zσ/(n)^(1/2).
The owner of Torpid Oaks B&B wanted to know the average distance its guests had traveled.
A random sample of 16 guests showed a mean distance of 85 miles with a standard deviation
of 32 miles. The 90 percent confidence interval (in miles) for the mean is approximately:
(71.0, 99.0). The interval is 85 ± ts/(n)^(1/2) or 85 ± (1.753)(32)/(16)^(1/2) with d.f. = 15
(don't use z).
A highway inspector needs an estimate of the mean weight of trucks crossing a bridge on the
interstate highway system. She selects a random sample of 49 trucks and finds a mean of 15.8
tons with a sample standard deviation of 3.85 tons. The 90 percent confidence interval for the
population mean is: 14.88 to 16.72 tons. The interval is 15.8 ± ts/(n)^(1/2) or 15.8 ±
(1.677)(3.85)/(49)^(1/2) using d.f. = 48 (don't use z).
To determine a 72 percent level of confidence for a proportion, the value of z is
approximately: ± 1.08. Look up the z value that puts 14 percent in each tail.
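
The interval calculations in this block can be sketched in Python. Because the standard library has no Student's t quantile function, this sketch assumes the critical value is read from a t table and passed in (1.753 for d.f. = 15 at 90 percent, as in the notes); the z value for 72 percent confidence is computed with statistics.NormalDist. The function name mean_ci is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(xbar, s, n, crit):
    """Confidence interval xbar +/- crit * s / sqrt(n).
    crit is a t (or z) critical value supplied by the caller."""
    half_width = crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Torpid Oaks B&B: n = 16, mean 85, s = 32, t = 1.753 (table value, d.f. = 15)
low, high = mean_ci(85, 32, 16, 1.753)
print(round(low, 1), round(high, 1))  # 71.0 99.0

# z for 72 percent confidence: 14 percent in each tail
z_72 = NormalDist().inv_cdf(1 - 0.14)
print(round(z_72, 2))  # 1.08
```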

8.4 CONFIDENCE INTERVAL FOR A MEAN (μ) WITH KNOWN σ

What Is a Confidence Interval?


A sample mean x̄ calculated from a random sample x1, x2, . . . , xn is a point estimate of
the unknown population mean μ. Because samples vary, we need to indicate our uncertainty
about the true value of μ. Based on our knowledge of the sampling distribution of X̄, we can
create an interval estimate for μ. We construct a confidence interval for the unknown mean μ
by adding and subtracting a margin of error from x̄, the mean of our random sample. The
confidence level for this interval is expressed as a percentage such as 90, 95, or 99 percent.
(-) Professor York randomly surveyed 240 students at Oxnard University and found that 150
of the students surveyed watch more than 10 hours of television weekly. How many
additional students would Professor York have to sample to estimate the proportion of all
Oxnard University students who watch more than 10 hours of television each week within ±
3 percent with 99 percent confidence? 1489. Using p = 150/240 = .625 we get n = (z/E)^2(π)(1 - π) =
(2.576/.03)^2(.625)(.375) = 1728.06, which rounds up to 1729. Since 240 students were already
surveyed, 1729 - 240 = 1489 additional students are needed.
In constructing a 95 percent confidence interval, if you increase n to 4n, the width of your
confidence interval will (assuming other things remain the same) be: about 50 percent of its
former width. The standard error of the mean is σ/(n)^(1/2) so replacing n by 4n would cut the
SEM in half.
Which of the following is not a characteristic of the t distribution? It approaches z as degrees
of freedom decrease. In fact, it approaches z as degrees of freedom increase.
To estimate the average annual expenses of students on books and class materials a sample of
size 36 is taken. The sample mean is $850 and the sample standard deviation is $54. A 99
percent confidence interval for the population mean is: $825.48 to $874.52. The interval is
850 ± ts/(n)^(1/2) or 850 ± (2.724)(54)/(36)^(1/2) with d.f. = 35 (don't use z).

Choosing a Confidence Level


You might be tempted to assume that a higher confidence level gives a “better” estimate.
However, confidence is not free: there is a trade-off that must be made. A higher confidence
level leads to a wider confidence interval, as illustrated in Example 8.2. The 95
percent confidence interval is wider than the 90 percent confidence interval. In order to gain
confidence, we must accept a wider range of possible values for μ. Greater confidence
implies loss of precision (i.e., a greater margin of error).
(-) Which statement is incorrect? Explain. "If n = 250 and p = .06, we cannot assume normality
in a confidence interval for π." Normality of p may be assumed because np = 15 and n(1 - p) =
235 are both at least 10.
What is the approximate width of a 90 percent confidence interval for the true population
proportion if there are 12 successes in a sample of 25? ± .164. The interval width is ± z[p(1 -
p)/n]^(1/2) = ± (1.645)[(.48)(.52)/25]^(1/2).
A poll showed that 48 out of 120 randomly chosen graduates of California medical schools
last year intended to specialize in family practice. What is the width of a 90 percent
confidence interval for the proportion that plan to specialize in family practice? ± .0736. The
interval width is ± z[p(1 - p)/n]^(1/2) = ± (1.645)[(.40)(.60)/120]^(1/2).
What is the approximate width of an 80 percent confidence interval for the true population
proportion if there are 12 successes in a sample of 80? ± .051. The interval width is ± z[p(1 -
p)/n]^(1/2) = ± (1.282)[(.15)(.85)/80]^(1/2).
A random sample of 160 commercial customers of PayMor Lumber revealed that 32 had paid
their accounts within a month of billing. The 95 percent confidence interval for the true
proportion of customers who pay within a month would be: 0.138 to 0.262. The interval is p
± z[p(1 - p)/n]^(1/2) = .20 ± (1.960)[(.20)(.80)/160]^(1/2).
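
The proportion intervals above all follow the same pattern, which can be sketched in Python. The function names are illustrative; z is computed with statistics.NormalDist rather than read from a table, so results may differ from table-based answers in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence):
    """Confidence interval p +/- z * sqrt(p(1 - p)/n) for a proportion,
    where p = x/n is the sample proportion."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

def normality_ok(x, n):
    """Rule from the notes: at least 10 successes and 10 failures."""
    return x >= 10 and (n - x) >= 10

# PayMor Lumber: 32 of 160 customers paid within a month, 95 percent confidence
low, high = proportion_ci(32, 160, 0.95)
print(normality_ok(32, 160))          # True
print(round(low, 3), round(high, 3))  # 0.138 0.262
```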

Interpretation
Before actually taking a random sample we can think of the confidence level 1 − α as a
probability on the procedure used to calculate the confidence interval.

(-) A random sample of 160 commercial customers of PayMor Lumber revealed that 32 had
paid their accounts within a month of billing. Can normality be assumed for the sample
proportion? Yes. Because there were at least 10 "successes" and at least 10 "failures" in the
sample.
The conservative sample size required for a 95 percent confidence interval for π with an error
of ± 0.04 is: 601. n = (z/E)^2(π)(1 - π) = (1.96/.04)^2(.50)(1 - .50) = 600.25 (round up).
Last week, 108 cars received parking violations in the main university parking lot. Of these,
27 had unpaid parking tickets from a previous violation. Assuming that last week was a
random sample of all parking violators, find the 95 percent confidence interval for the
percentage of parking violators that have prior unpaid parking tickets. 16.8 to 33.2 percent.
The interval is p ± z[p(1 - p)/n]^(1/2) = .25 ± (1.960)[(.25)(.75)/108]^(1/2).
In a random sample of 810 women employees, it is found that 81 would prefer working for a
female boss. The width of the 95 percent confidence interval for the proportion of women
who prefer a female boss is: ± .0207. The width is ± z[p(1 - p)/n]^(1/2) or ±
(1.960)[(.10)(.90)/810]^(1/2) or ± .0207.
Jolly Blue Giant Health Insurance (JBGHI) is concerned about rising lab test costs and would
like to know what proportion of the positive lab tests for prostate cancer are actually proven
correct through subsequent biopsy. JBGHI demands a sample large enough to ensure an error
of ± 2 percent with 90 percent confidence. What is the necessary sample size? 1,692. n =
(z/E)^2(π)(1 - π) = (1.645/.02)^2(.50)(1 - .50) = 1691.3 (round up).
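
The sample size rule n = (z/E)^2 π(1 - π) can be sketched as a small helper. This is an illustration only, with a hypothetical function name. It rounds z to three decimals to mimic the printed table values (1.645, 1.960, 2.576) used in these answers; an unrounded z can change the result by one unit.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(error, confidence, pi=0.5):
    """Sample size for estimating a proportion; pi = .50 is the conservative choice."""
    # z rounded to 3 decimals (assumption: match the textbook table values)
    z = round(NormalDist().inv_cdf(1 - (1 - confidence) / 2), 3)
    return ceil((z / error) ** 2 * pi * (1 - pi))   # always round up


print(sample_size_proportion(0.04, 0.95))
print(sample_size_proportion(0.02, 0.90))
```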

8.5 CONFIDENCE INTERVAL FOR A MEAN (μ) WITH UNKNOWN σ


Student’s t Distribution
In situations where the population is normal but its standard deviation σ is unknown, the
Student’s t distribution should be used instead of the normal z distribution. This is
particularly important when the sample size is small. When σ is unknown, the formula for a
confidence interval resembles the formula for known σ except that t replaces z and s replaces
σ.
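
As a reminder of what that substitution looks like, the interval for a mean with unknown σ can be written as follows (standard notation, with d.f. = n − 1 as described in the next paragraph):

```latex
\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}, \qquad \text{d.f.} = n - 1
```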

Degrees of Freedom
Knowing the sample size allows us to calculate a parameter called degrees of freedom
(sometimes abbreviated d.f.). This parameter is used to determine the value of the t statistic used in
the confidence interval formula. The degrees of freedom tell us how many observations we
used to calculate s, the sample standard deviation, less the number of intermediate estimates
we used in our calculation.
(-) A university wants to estimate the average distance that commuter students travel to get to
class with an error of ± 3 miles and 90 percent confidence. What sample size would be
needed, assuming that travel distances are normally distributed with a range of X = 0 to X =
50 miles, using the Empirical Rule μ ± 3σ to estimate σ? About 21 students. Using σ = (50 -
0)/6 = 8.333, we get n = [zσ/E]^2 = [(1.645)(8.333)/3]^2 = 20.9 (round up).
A financial institution wishes to estimate the mean balances owed by its credit card
customers. The population standard deviation is $300. If a 99 percent confidence interval is
used and an interval of ± $75 is desired, how many cardholders should be sampled? 107. n =
[zσ/E]^2 = [(2.576)(300)/75]^2 = 106.2 (round up).
A company wants to estimate the time its trucks take to drive from city A to city B. The
standard deviation is known to be 12 minutes. What sample size is required in order that error
will not exceed ± 2 minutes, with 95 percent confidence? 139 observations. n = [zσ/E]^2 =
[(1.960)(12)/2]^2 = 138.3 (round up).
Using the Empirical Rule μ ± 3σ to estimate σ, how many students would you need to sample
to estimate the true mean score for the class with 90 percent confidence and an error of ± 2?
About 25 students. Using σ = (87 - 51)/6 = 6, we get n = [zσ/E]^2 = [(1.645)(6)/2]^2 = 24.35
(round up).
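
The rule n = [zσ/E]^2 used in the four answers above can be sketched the same way. The helper name is hypothetical, and z is again rounded to three decimals on the assumption that the answers were computed from table values.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(sigma, error, confidence):
    """Sample size needed to estimate a mean within ± error when sigma is known."""
    # z rounded to 3 decimals (assumption: match the textbook table values)
    z = round(NormalDist().inv_cdf(1 - (1 - confidence) / 2), 3)
    return ceil((z * sigma / error) ** 2)   # always round up


print(sample_size_mean(300, 75, 0.99))      # credit card balances
print(sample_size_mean(12, 2, 0.95))        # truck travel time
print(sample_size_mean(50 / 6, 3, 0.90))    # commuter distance, sigma from range/6
```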

Comparison of z and t
Table 8.7 (taken from Appendix D) shows that for very small samples the t-values differ
substantially from the normal z values. But for a given confidence level, as degrees of
freedom increase, the t-values approach the familiar normal z-values (shown at the bottom of
each column corresponding to an infinitely large sample).

(-) Using the conventional polling definition, find the margin of error for a customer
satisfaction survey of 225 customers who have recently dined at Applebee's. ± 6.5 percent.
The margin of error is ± z[π(1 - π)/n]^(1/2) or ± (1.960)[(.50)(.50)/225]^(1/2) or ± .065.
A marketing firm is asked to estimate the percent of existing customers who would purchase
a "digital upgrade" to their basic cable TV service. The firm wants 99 percent confidence and
an error of ± 5 percent. What is the required sample size (to the next higher integer)? 664. n =
(z/E)^2(π)(1 - π) = (2.576/.05)^2(.50)(1 - .50) = 663.6 (round up).
An airport traffic analyst wants to estimate the proportion of daily takeoffs by small business
jets (as opposed to commercial passenger jets or other aircraft) with an error of ± 4 percent
with 90 percent confidence. What sample size should the analyst use? 423. n = (z/E)^2(π)(1 -
π) = (1.645/.04)^2(.50)(1 - .50) = 422.8 (round up).
Ersatz Beneficial Insurance wants to estimate the cost of damage to cars due to accidents. The
standard deviation of the cost is known to be $200. They want to estimate the mean cost
using a 95 percent confidence interval within ± $10. What is the minimum sample size n?
1537. n = [zσ/E]^2 = [(1.960)(200)/10]^2 = 1536.6 (round up).
Professor York randomly surveyed 240 students at Oxnard University and found that 150 of
the students surveyed watch more than 10 hours of television weekly. Develop a 95 percent
confidence interval to estimate the true proportion of students who watch more than 10 hours
of television each week. The confidence interval is: .564 to .686. The interval is p ± z[p(1 -
p)/n]^(1/2) = .625 ± (1.960)[(.625)(.375)/240]^(1/2).

Outliers and Messy Data


Outliers and messy data are common. Managers often encounter large databases containing
unruly data and they must decide how to treat atypical observations.

(-) The sample proportion is in the middle of the confidence interval for the population
proportion: in any sample. The interval is p ± z[p(1 - p)/n]^(1/2).
For a sample of size 16, the critical values of chi-square for a 95 percent confidence interval
for the population variance are: 6.262, 27.49. Using d.f. = n - 1 = 15, we get χ²_L = 6.262 and
χ²_U = 27.49 from Appendix E.
For a sample of size 11, the critical values of chi-square for a 90 percent confidence interval
for the population variance are: 3.940, 18.31. Using d.f. = n - 1 = 10, we get χ²_L = 3.940 and
χ²_U = 18.31 from Appendix E.
For a sample of size 18, the critical values of chi-square for a 99 percent confidence interval
for the population variance are: 5.697, 35.72. Using d.f. = n - 1 = 17, we get χ²_L = 5.697 and
χ²_U = 35.72 from Appendix E.
Which of the following statements is most nearly correct, other things being equal? For a
given confidence level, the z value is always smaller than the t value. As n increases, t
approaches z, but t is always larger than z.
The Central Limit Theorem (CLT): applies to any population. The appeal of the CLT is that it
applies to populations of any shape.
In which situation may the sample proportion safely be assumed to follow a normal
distribution? 12 successes in a sample of 72 items. We prefer at least 10 "successes" and at
least 10 "failures" to assume that p is normal.

Using Appendix D
Beyond d.f. = 50, Appendix D shows d.f. in steps of 5 or 10. If Appendix D does not show
the exact degrees of freedom that you want, use the t-value for the next lower d.f.
(-) In which situation may the sample proportion safely be assumed to follow a normal
distribution? n = 30, π = .50. We want nπ ≥ 10 and n(1 - π) ≥ 10 to assume that p is normal.
If σ = 12, find the sample size to estimate the mean with an error of ± 4 and 95 percent
confidence (rounded to the next higher integer). 35. n = [zσ/E]^2 = [(1.960)(12)/4]^2 = 34.6
(round up).
If σ = 25, find the sample size to estimate the mean with an error of ± 3 and 90 percent
confidence (rounded to the next higher integer). 188. n = [zσ/E]^2 = [(1.645)(25)/3]^2 = 187.9
(round up).
Sampling error can be avoided: by no method under the statistician's control. Sampling error
occurs in any random sample used to estimate an unknown parameter.
A consistent estimator for the mean: converges on the true parameter μ as the sample size
increases. The variance becomes smaller and the estimator approaches the parameter as n
increases.
Concerning confidence intervals, which statement is most nearly correct? We use the
Student's t distribution when σ is unknown. Student's t distribution widens the confidence
interval when σ is unknown.
The standard error of the mean decreases when: the standard deviation decreases or n
increases. The standard error of the mean σ/n^(1/2) depends on n and σ.
For a given sample size, the higher the confidence level, the: greater the interval width. To
have more confidence, we must widen the interval. For example, z.025 = 1.960 (for 95
percent confidence) gives a wider interval than z.05 = 1.645 (for 90 percent confidence). The
proffered statement would also be true for the Student's t distribution.
A sample is taken and a confidence interval is constructed for the mean of the distribution. At
the center of the interval is always which value? The sample mean.
If a normal population has parameters μ = 40 and σ = 8, then for a sample size n = 4: the
standard error of the sample mean is approximately 4. The standard error is σ/n^(1/2) =
8/4^(1/2) = 4.

CHAPTER 9
One-sample Hypothesis Testing
9.1 LOGIC OF HYPOTHESIS TESTING
The power curve plots β on the Y axis and the test statistic on the X axis. f
A smaller probability of Type II error implies higher power of the test. t
Varying the true mean is a movement along the power curve, not a shift in the curve. t
Increasing the sample size shifts the power curve upward, ceteris paribus. t
Increasing the level of significance shifts the power curve upward, ceteris paribus. t

A power curve for a mean is at its lowest point when the true μ is very near μ0. t
Larger samples lead to increased power, ceteris paribus. t
In graphing power curves, there is a different power curve for each sample size n. t
In hypothesis testing, we are trying to reject the alternative hypothesis. f
In hypothesis testing, we are trying to prove the null hypothesis. f
When σ is unknown, it is more conservative to use z instead of t for the critical value. f

Steps in Hypothesis testing


Step 1: State the hypothesis.
Use: z test when σ is known
t test when σ is unknown
⍺ (level of significance): probability of rejecting the null hypothesis when it is true
Step 2: Specify the decision rules (what level of inconsistency of data will lead to the
rejection of the hypothesis)
Step 3: Data collection
Step 4: Decision making
Step 5: Take actions based on the decision made

EXAMPLE
1. The process that produces Sonora Bars (a type of candy) is intended to produce bars with a
mean weight of 56 gm. The process standard deviation is known to be 0.77 gm. A random
sample of 49 candy bars yields a mean weight of 55.82 gm.
(+) Which are the hypotheses: H0: μ ≥ 56, H1: μ < 56
(+) Find the test statistic: -1.636
(+) Find the p-value: Between .05 and .10
2. A sample of 16 ATM transactions shows a mean transaction time of 67 seconds with a
standard deviation of 12 seconds.
(+) Find the test statistic: 2.333
(+) State the hypotheses: H0: μ ≤ 60, H1: μ > 60
(+) Find the critical value at ⍺ = .01: 2.602
3. For a right-tailed test of a hypothesis for a single population mean with n = 10, the value
of the test statistic was t = 1.411. The p-value is: between .10 and .05.
Last year, 10 percent of all teenagers purchased a new iPhone. This year, a sample of 260
randomly chosen teenagers showed that 39 had purchased a new iPhone.
(+) test statistic: 2.687
(+) critical value at α = .05: 1.645
(+) p-value is approximately: .0036
4. The Melodic Kortholt Company will change its current health plan if at least half the
employees are dissatisfied with it. A trial sample of 25 employees shows that 16 are
dissatisfied.
(+) normality of the sample proportion can be assumed.
(+) The p-value for a right-tailed test is .0808
(+) For a right-tailed test, the test statistic would be 1.400
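
The z test statistics and p-values in the examples above can be reproduced with a short script. This is a sketch using the Python standard library; the function names are illustrative. The t-based example (ATM transactions) is left out because the standard library has no Student's t distribution.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def z_mean(xbar, mu0, sigma, n):
    """z statistic for a mean with known sigma."""
    return (xbar - mu0) / (sigma / sqrt(n))


def z_proportion(x, n, pi0):
    """z statistic for a proportion; the standard error uses pi0."""
    return (x / n - pi0) / sqrt(pi0 * (1 - pi0) / n)


# Sonora Bars, left-tailed test of H0: mu >= 56
z1 = z_mean(55.82, 56, 0.77, 49)
print(round(z1, 3), round(Z.cdf(z1), 4))

# iPhone purchases, right-tailed test against pi0 = .10
z2 = z_proportion(39, 260, 0.10)
print(round(z2, 3), round(1 - Z.cdf(z2), 4))

# Health plan, right-tailed test against pi0 = .50
z3 = z_proportion(16, 25, 0.50)
print(round(z3, 3), round(1 - Z.cdf(z3), 4))
```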

9.2 TYPE I AND TYPE II ERROR


Type I error (⍺) also called false positive: Rejecting null hypothesis when it is true
Type II error (β) also called false negative: Not rejecting the null hypothesis when it is false
For a given sample size, when we increase the probability of Type I error, the probability of a
Type II error: decreases.
After testing a hypothesis, we decided to reject the null hypothesis. Thus, we are exposed to:
Type I error.
Which statement about α is not correct: It is equal to 1 - β.
Which of the following is correct: When sample size increases, both α and β may decrease.
Which of the following is incorrect: The probability of rejecting a true null hypothesis
increases as n increases.
John rejected his null hypothesis in a right-tailed test for a mean at ⍺ = .025 because his
critical t value was 2.000 and his calculated t value was 2.345. We can be sure that: John did
not commit Type II error.
"My careful physical examination shows no evidence of any serious problem," said
Doctor Morpheus. (long question): You will waste money on an unnecessary lab test.
Which of the following statements is correct: Increasing α will make it more likely that we
will reject H0, ceteris paribus.
"I believe your airplane's engine is sound," states the mechanic. (long question): The pilot will be out
$1,500 unnecessarily.

A study over a 10-year period showed that a certain mammogram test had a 50 percent
rate of false positives. This indicates that: about half the tests showed a cancer that didn't
exist.
You are driving a van packed with camping gear (total weight 3,500 pounds including
yourself and family) into a northern wilderness area. (long question): Your kids will think you're a
chicken.
After lowering the landing gear, the pilot notices that the "gear down and locked" light is
not illuminated: The landing is delayed unnecessarily while the bulb and gear are checked.
As you are crossing a field at the farm, your country cousin Jake assures you, "Don't
worry about that old bull coming toward us. He’s harmless.": You will run away for no good
reason.

Relationship between ⍺ and β


We want to avoid false positive → small ⍺
We also want to avoid false negative → small β
BUT they have an inverse relationship
Example: More sensitive airport weapon detector (reduced β) means more inconvenience for
safe passengers (increased ⍺)
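
The trade-off can be made concrete for a right-tailed z test. The sketch below computes β for several α levels; the numbers (μ0 = 100, true mean 103, σ = 10, n = 25) are hypothetical values chosen for illustration and do not come from the notes.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def beta_right_tailed(mu0, mu1, sigma, n, alpha):
    """P(fail to reject H0: mu <= mu0) when the true mean is mu1 (right-tailed z test)."""
    z_crit = Z.inv_cdf(1 - alpha)               # critical value for the chosen alpha
    shift = (mu1 - mu0) / (sigma / sqrt(n))     # true mean in standard-error units
    return Z.cdf(z_crit - shift)


# Hypothetical setting: as alpha rises, beta falls (and power = 1 - beta rises)
for alpha in (0.01, 0.05, 0.10):
    beta = beta_right_tailed(100, 103, 10, 25, alpha)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")
```

Raising n in the same call lowers β without changing α, which is why a simultaneous reduction in both errors requires a larger sample.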

EXAMPLE:
1. The level of significance refers to the probability of making a Type II error. f
2. The level of significance refers to the probability of making a Type I error. t
3. A simultaneous reduction in both ⍺ and β will require a larger sample size. t
4. The probability of rejecting a false null hypothesis increases as the sample size increases,
other things being equal. t
5. When the probability of a Type I error increases, the probability of a Type II error must
decrease, ceteris paribus. t
6. A false positive in a drug test for steroids is a Type II error. f
8. If a judge acquits every defendant, the judge will never commit a Type I error (H0 is the
hypothesis of innocence). t
9. When your sample size increases, the chance of both Type I and Type II error will increase. f
10. A Type II error can only occur when you fail to reject H0. t
11. A Type I error can only occur if you reject H0. t
12. John rejected H0 so we know definitely that he did not commit Type II error. t
13. In hypothesis testing we cannot prove a null hypothesis is true. t
14. For a given level of significance (⍺), increasing the sample size will increase the
probability of Type II error because there are more ways to make an incorrect decision. f
15. For a given sample size, reducing the level of significance will decrease the probability of
making a Type II error. f
16. The probability of a false positive is decreased if we reduce ⍺. t
17. A hypothesis test may be statistically significant, yet have little practical importance. t
18. Compared to using ⍺ = .01, choosing ⍺ = .001 will make it less likely that a true null
hypothesis will be rejected. t
19. A two-tailed hypothesis test for H0: μ = 15 at ⍺ = .10 is analogous to asking if a 90
percent confidence interval for μ contains 15. t

9.3 DECISION RULES AND CRITICAL VALUES


A/ Decision rules
For a given sample size and ⍺ level, the Student's t value always exceeds the z value. t
For a given level of significance, the critical value of Student's t increases as n increases. f
For a sample of 9 items, the critical value of Student's t for a left-tailed test of a mean at ⍺ =
.05 is -1.860. t
Holding other factors constant, it is harder to reject the null hypothesis for a mean when
conducting a two-tailed test rather than a one-tailed test. t
If we desire ⍺ = .10, then a p-value of 0.13 would lead us to reject the null hypothesis. f
The p-value is the probability of the sample result (or one more extreme) assuming H0 is true. t
The probability of rejecting a true null hypothesis is the significance level of the test. t
A null hypothesis is rejected when the calculated p-value is less than the critical value of the
test statistic. f
In a right-tailed test, the null hypothesis is rejected when the value of the test statistic exceeds
the critical value. t
The critical value of a hypothesis test is based on the researcher's selected level of
significance. t
If the null and alternative hypotheses are H0: μ ≤ 100 and H1: μ > 100, the test is right-tailed. t
The null hypothesis is rejected when the p-value exceeds the level of significance. f

B/ One-tailed and Two-tailed test


In the nation of Gondor, the EPA requires that half the new cars sold will meet a certain
particulate emission standard a year later. A sample of 64 one-year-old cars revealed
that only 24 met the particulate emission standard. The test statistic to see whether the
proportion is below the requirement is: -2.000
The hypotheses H0: π ≥ .40, H1: π < .40 would require: a left-tailed test.
The direction of the test is indicated by which way the symbol points in H1.

For a given null hypothesis and level of significance, the critical value for a two-tailed test is
greater than the critical value for a one-tailed test. t
For a given H0 and level of significance, if you reject the H0 for a one-tailed test, you would
also reject H0 for a two-tailed test. f
If the hypothesized proportion is π0 = .025 in a sample of size 120, it is safe to assume
normality of the sample proportion p. f
For a mean, we would expect the test statistic to be near zero if the null hypothesis is true. t
In the hypothesis H0: π = π0, the value of π0 is derived from the sample. f
In testing the hypothesis H0: π ≤ π0, H1: π > π0, we would use a right-tailed test. t
To test the hypothesis H0: π = .0125 using n = 160, it is safe to assume normality of p. f
In testing a proportion, normality of p can be assumed if nπ0 ≥ 10 and n(1 - π0) ≥ 10. t
Power is the probability of rejecting the null hypothesis when it is false and is equal to 1 - β. t
Other things being equal, a smaller standard deviation implies higher power. t
The power of a test is the probability that the test will reject a false null hypothesis. t
The height of the power curve shows the probability of accepting a true null hypothesis. f

C/ Critical Values
● t-test (when σ is unknown and sample size < 30)
Find df = n-1 ; ⍺ will be given
Look up the t distribution table (pay attention to one-tailed or two-tailed)
t Test statistic:
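
The formula that belongs here is not reproduced in these notes. The standard form, consistent with the worked answers below (for example (40 - 30)/(20/16^(1/2)) = 2.000), is:

```latex
t_{\text{calc}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{d.f.} = n - 1
```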

For a test of a mean, which of the following is incorrect: H0 is rejected when the calculated
p-value is less than the critical value of the test statistic.
Guidelines for the Jolly Blue Giant Health Insurance Company say that the average
hospitalization for a triple hernia operation should not exceed 30 hours. A diligent
auditor studied records of 16 randomly chosen triple hernia operations at Hackmore
Hospital and found a mean hospital stay of 40 hours with a standard deviation of 20
hours. "Aha!" she cried, "the average stay exceeds the guideline." At α = .025.
(+) The critical value for a right-tailed test of her hypothesis is 2.131.
(+) The value of the test statistic for her hypothesis is 2.000.
(+) The p-value for a right-tailed test of her hypothesis is between .025 and .05.
● z-test (when σ is known and sample size > 30)
One-tailed test: ⍺
Two-tailed test: ⍺/2
Look up the number in the table
Add the values of intersecting row and column to get the z value
For a left-tailed test, a negative sign needs to be added at the end of the calculation
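
The table lookup described above can be cross-checked in code. A minimal sketch, assuming the Python standard library's `statistics.NormalDist`; the helper name `z_critical` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()


def z_critical(alpha, tail):
    """Critical z value: alpha in one tail, or alpha/2 in each tail for a two-tailed test."""
    if tail == "left":
        return Z.inv_cdf(alpha)          # negative value
    if tail == "right":
        return Z.inv_cdf(1 - alpha)
    if tail == "two":
        return Z.inv_cdf(1 - alpha / 2)  # use ± this value
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(round(z_critical(0.05, "left"), 3))
print(round(z_critical(0.05, "right"), 3))
print(round(z_critical(0.05, "two"), 3))
```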

z Test statistic:
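
The formula is not reproduced in these notes. The standard form for a mean with known σ, consistent with the worked answers in this chapter, is:

```latex
z_{\text{calc}} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
```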
For a right-tailed test of a hypothesis for a population mean with n = 14, the value of the
test statistic was t = 1.863. The p-value is: between .05 and .025.
Hypothesis tests for a mean using the critical value method require: specifying α in advance.
The level of significance is not: the chance of accepting a true null hypothesis.
The critical value in a hypothesis test: separates the acceptance and rejection regions.
Which is not a likely reason to choose the z distribution for a hypothesis test of a mean: The
value of σ is very large.
DullCo Manufacturing claims that its alkaline batteries last at least 40 hours on average in
a certain type of portable CD player. But tests on a random sample of 18 batteries from a
day's large production run showed a mean battery life of 37.8 hours with a standard
deviation of 5.4 hours.
(+) To test DullCo's hypothesis, the test statistic is -1.728
(+) In a left-tailed test at α = .05, which is the most accurate statement: We would face a
rather close decision.
(+) To test DullCo's hypothesis, the p-value is slightly greater than .05.
P-value
If you use the p-value method, you do not need to compute the critical value. Compute the z
test statistic and look it up in the table to get the p-value.
If p-value < ⍺, reject H0; otherwise, fail to reject H0.
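
The p-value method can be sketched in a few lines. The helper name `p_value` is illustrative; it covers only z statistics, since the standard library has no Student's t distribution.

```python
from statistics import NormalDist

Z = NormalDist()


def p_value(z, tail):
    """p-value for a z test statistic under H0."""
    if tail == "left":
        return Z.cdf(z)
    if tail == "right":
        return 1 - Z.cdf(z)
    if tail == "two":
        return 2 * (1 - Z.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


alpha = 0.05
p = p_value(1.79, "right")
print(round(p, 4), "reject H0" if p < alpha else "fail to reject H0")
```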
Which of the following is not a valid null hypothesis: H0: μ ≠ 0
Given that in a one-tailed test you cannot reject H0, can you reject H0 in a two-tailed test at
the same ⍺: no
Given H0: μ ≥ 18 and H1: μ < 18, we would commit Type I error if we: conclude that μ < 18
when the truth is that μ ≥ 18.
Ajax Peanut Butter's quality control allows 2 percent of the jars to exceed the quality
standard for insect fragments. A sample of 150 jars from the current day's production
reveals that 30 exceed the quality standard for insect fragments. Which is incorrect: Normality
of p may safely be assumed in the hypothesis test.

Which is not true of p-values: They measure the probability of an incorrect decision.
For tests of a mean, if other factors are held constant, which statement is correct: It is harder
to reject the null hypothesis in a two-tailed test rather than a one-tailed test.
For a sample size of n = 100, and σ = 10, we want to test the hypothesis H0: μ = 100. The
sample mean is 103. The test statistic is: 3.000
When testing the hypothesis H0: μ = 100 with n = 100 and σ² = 100, we find that the
sample mean is 97. The test statistic is: -3.000
Given a normal distribution with σ = 3, we want to test the hypothesis H0: μ = 20. We find
that the sample mean is 21. The test statistic is: impossible to find without more information.
In testing a proportion, which of the following statements is incorrect: When the sample
proportion is p = .02 and n = 150, it is safe to assume normality.
Which of the following is not a characteristic of the t distribution: It is similar to the z
distribution when n is small.

PRACTICE
1. At α = .05, the critical value to test the hypothesis H0: π ≥ .40, H1: π < .40 would be: -1.645
2. In a test of a mean, the reported p-value is .025. Using α =.05 the conclusion would be to:
reject the null hypothesis.
3. Which of the following decisions could result in a Type II error for a test: Fail to reject the
null hypothesis
4. If sample size increases from 25 to 100 and the level of significance stays the same,
then: the risk of Type II error would decrease.
5. "Currently, only 20 percent of arrested drug pushers are convicted," cried candidate
Courageous Calvin in a campaign speech: 0.0668
6. In a right-tailed test, a statistician got a z test statistic of 1.47. What is the p-value: .0708
In a left-tailed test, a statistician got a z test statistic of -1.720. What is the p-value: .0427
In a two-tailed test, a statistician got a z test statistic of 1.47. What is the p-value: .1416
Which of the following statements is true: Power of the test rises if the true mean is farther
from the hypothesized mean.
High power in a hypothesis test about 1 sample mean is likely to be associated with: small σ.
The power of a test is the probability of: concluding H1 when H1 is true.
Which is not a step in hypothesis testing: Find the test statistic from a table.
Which is an invalid alternative hypothesis: H1: μ = 18
Which is a valid null hypothesis: H0: μ = 18
7. A two-tailed hypothesis test for H0: π = .30 at α = .05 is analogous to: asking if the 95
percent confidence interval for π contains .30.
For a right-tailed test of hypothesis for a population mean with known σ, the test statistic was
z = 1.79. The p-value is: .0367
If n = 25 and α = .05 in a right-tailed test of a mean with unknown σ, the critical value
is: 1.711
8. The researcher's null hypothesis is H0: σ² ≤ 22. A sample of n = 25 items yields a sample
variance of s² = 28.5.
(+) The critical value of chi-square for a right-tailed test at α = .05 is: 36.42
(+) The test statistic is: 31.09
The researcher's null hypothesis is H0: σ² = 420. A sample of n = 18 items yields a sample
variance of s² = 512.
(+) The critical values of chi-square for a two-tailed test at α = .05 are: 7.564 and 30.19
(+) The test statistic is: 20.72
In hypothesis testing, Type I error is: the probability of rejecting H0 when H0 is true.
In hypothesis testing, the value of β is: the probability of concluding H0 when H1 is true.
9. Regarding the probability of Type I error (α) and Type II error (β), which statement is
true: Power = 1 - β.
In the hypothesis H0: μ = μ0, the value of μ0 is not derived from: the sample.
In testing the hypotheses H0: π ≤ π0, H1: π > π0, we would use a: right-tailed test.
10. We can assume that the sample proportion is normally distributed if: we have both 10
successes and 10 failures in the sample.
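
The chi-square test statistics in practice item 8 follow from (n - 1)s²/σ0². A minimal sketch with an illustrative function name; the critical values are copied from the answers above (Appendix E) rather than computed, because the standard library has no chi-square distribution.

```python
def variance_test_statistic(n, s2, sigma0_sq):
    """Chi-square statistic for a test of one variance: (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * s2 / sigma0_sq


# H0: sigma^2 <= 22, right-tailed, critical value 36.42 (d.f. = 24)
chi1 = variance_test_statistic(25, 28.5, 22)
print(round(chi1, 2), "reject H0" if chi1 > 36.42 else "fail to reject H0")

# H0: sigma^2 = 420, two-tailed, critical values 7.564 and 30.19 (d.f. = 17)
chi2 = variance_test_statistic(18, 512, 420)
print(round(chi2, 2), "reject H0" if not 7.564 <= chi2 <= 30.19 else "fail to reject H0")
```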
CHAPTER 10
Two-Sample Hypothesis Tests
10.1 TWO-SAMPLE TESTS

- Automotive
- Marketing
- Environment
- Safety
- Education

10.2 COMPARING TWO MEANS: INDEPENDENT SAMPLES


Format of Hypotheses

Test Statistic

For the case where we know the values of the population variances, σ1² and σ2², the test
statistic is a z-score. We would use the standard normal distribution to find p-values or
critical values of z_α.
A new policy of "flex hours" is proposed. Random sampling showed that 28 of 50 female
workers favored the change, while 22 of 50 male workers favored the change. Management
wonders if there is a difference between the two groups. What is the p-value for a two-tailed
test: .2301. Combined proportion is pc = (28 + 22)/(50 + 50) = .50, so zcalc = (.56 -
.44)/[.50(1 - .50)/50 + .50(1 - .50)/50]^(1/2) = 1.20 and 2 × P(Z > 1.20) = 2 × .1151 = .2302 (or
.2301 using Excel).
Two well-known aviation training schools are being compared using random samples of their
graduates. It is found that 70 of 140 graduates of Fly-More Academy passed their FAA
exams on the first try, compared with 104 of 260 graduates of Blue Yonder Institute. To
compare the pass rates, the pooled proportion would be: .435. Combined proportion is pc =
(70 + 104)/(140 + 260) = .435.
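
The pooled-proportion test used in the two answers above can be sketched as follows. The function name is illustrative; the calculation pools the two samples to estimate the standard error, as the answers do.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled proportion, z statistic, two-tailed p-value) for H0: pi1 = pi2."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)                      # pooled (combined) proportion
    se = sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))    # standard error under H0
    z = (p1 - p2) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return pc, z, p_two_tailed


# Flex hours: 28 of 50 women versus 22 of 50 men
pc, z, p = two_proportion_z(28, 50, 22, 50)
print(round(pc, 3), round(z, 2), round(p, 4))

# Aviation schools: pooled proportion only
print(round(two_proportion_z(70, 140, 104, 260)[0], 3))
```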
We would need to rely on sample estimates s1² and s2² for the population variances, σ1² and σ2².
By assuming that the population variances are equal, we are allowed to pool the sample
variances by taking a weighted average of s1² and s2² to calculate an estimate of the common
population variance. Weights are assigned to s1² and s2² based on their respective degrees of
freedom (n1 - 1) and (n2 - 1). Because we are pooling the sample variances, the common
variance estimate is called the pooled variance and is denoted sp².

Of 200 youthful gamers (under 18) who tried the new Z-Box-Plus game, 160 rated it
"excellent," compared with only 144 of 200 adult gamers (18 or over). The 95 percent
confidence interval for the difference of proportions would be approximately: [-.003, +.163].
Do not pool the proportions when you calculate the standard error of p1 - p2.
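
The unpooled interval in the answer above can be checked with a short sketch. The function name is illustrative; note that each sample keeps its own proportion in the standard error.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # no pooling here
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Z-Box-Plus: 160 of 200 youthful gamers versus 144 of 200 adult gamers
lower, upper = diff_proportion_ci(160, 200, 144, 200)
print(f"[{lower:+.3f}, {upper:+.3f}]")
```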
Carver Memorial Hospital's surgeons have a new procedure that they think will decrease the
time to perform an appendectomy. A sample of 8 appendectomies using the old method had
a mean of 38 minutes with a variance of 36 minutes, while a sample of 10 appendectomies
using the experimental method had a mean of 29 minutes with a variance of 16 minutes. For
a right-tail test for equal means (assume equal variances), the critical value at α = .10 is:
1.337. For d.f. = (n1 - 1) + (n2 - 1) = 7 + 9 = 16, we get t.10 = 1.337.
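
The pooled variance and the equal-variance t statistic can be sketched for the appendectomy data above. The function name is illustrative, and the test statistic itself is not quoted in the notes, so treat the computed value as a check on the method rather than a quoted answer.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled variance, t statistic, d.f.) assuming equal population variances."""
    df = (n1 - 1) + (n2 - 1)
    # weighted average of the sample variances, weights = degrees of freedom
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t, df


# Old method: n = 8, mean 38, variance 36. New method: n = 10, mean 29, variance 16.
sp2, t, df = pooled_t(38, 36, 8, 29, 16, 10)
print(sp2, round(t, 3), df)
print("reject H0" if t > 1.337 else "fail to reject H0")   # critical value from the answer above
```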
If the unknown variances σ1² and σ2² are assumed unequal, we do not pool the variances.
Instead, we replace σ1² and σ2² with the sample variances s1² and s2². This is a more
conservative assumption than Case 2. Under these conditions the distribution of the random
variable X̄1 - X̄2 is no longer certain, a difficulty known as the Behrens-Fisher problem.
However, the comparison of means can reliably be performed using a Student’s t test with
Welch’s adjusted degrees of freedom.
A medical researcher wondered if there is a significant difference between the mean birth
weight of boy and girl babies. Random samples of 5 babies' weights (pounds) for each
gender showed the following:
Boys: 8.0|4.7|7.3|6.2|3.4
Girls: 5.3|2.8|6.4|6.8|7.4
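
Welch's adjusted degrees of freedom can be illustrated with the birth weight data above. This is a sketch with an illustrative function name; the notes do not show which test or answer the original question expected, so the output is only a demonstration of the unequal-variance calculation.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t statistic, Welch's adjusted d.f.) without pooling the variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1      # s1^2 / n1
    v2 = variance(sample2) / n2      # s2^2 / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


boys = [8.0, 4.7, 7.3, 6.2, 3.4]
girls = [5.3, 2.8, 6.4, 6.8, 7.4]
t, df = welch_t(boys, girls)
print(round(t, 3), round(df, 2))
```

Welch's d.f. is usually not an integer; when using a printed t table it is conventional to round it down.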

In a left-tailed test comparing two means with unknown variances assumed to be equal, the
test statistic was t = -1.81 with sample sizes of n1 = 8 and n2 = 12. The p-value would be:
between .025 and .05. For d.f. = 18, Appendix D gives t.05 = 1.734 and t.025 = 2.101, or for
an exact answer you can use the Excel function =T.DIST(-1.81,8+12-2,1) = .04351.
In a left-tailed test comparing two means with variances unknown but assumed to be equal,
the sample sizes were n1 = 8 and n2 = 12. At α = .05, the critical value would be: -1.734. For
d.f. = 18, Appendix D gives t.05 = -1.734.
In a test for equality of two proportions, the sample proportions were p1 = 12/50 and p2 =
18/50. The test statistic is approximately: -1.31. Use combined proportion pc = (x1 + x2)/(n1
+ n2) = (12 + 18)/(50 + 50) = .30 in zcalc.
In a test for equality of two proportions, the sample proportions were p1 = 12/50 and p2 =
18/50. The pooled proportion is: .30. Use combined proportion pc = (x1 + x2)/(n1 + n2) = (12
+ 18)/(50 + 50) = .30 in the calculation.

- If the sample sizes are equal, the Case 2 and Case 3 test statistics will always be
identical, but the degrees of freedom (and hence the critical values) may differ. If you
have no information about the population variances, then the best choice is Case 3.
- The fewer assumptions you make about your populations, the less likely you are to
make a mistake in your conclusions
- Unequal sample sizes are common, and the formulas still apply. However, there are
advantages to equal sample sizes.
John wants to compare two means. His sample statistics were and
. Assuming equal variances, the pooled variance is: 4.5. The pooled
variance is [(n1 - 1)s1² + (n2 - 1)s2²]/[(n1 - 1) + (n2 - 1)] = 4.5.
In a random sample of patient records in Cutter Memorial Hospital, six-month postoperative
exams were given in 90 out of 200 prostatectomy patients, while in Paymor Hospital such
exams were given in 110 out of 200 cases. In comparing these two proportions, normality of
the difference may be assumed because: nπ ≥ 10 and n(1 - π) ≥ 10 for each sample taken
separately. We have at least 10 successes (x1 = 90, x2 = 110) and 10 failures (n1 - x1 = 110,
n2 - x2 = 90).
Large Samples

10.3 CONFIDENCE INTERVAL FOR THE DIFFERENCE OF TWO MEANS, μ1 - μ2

10.4 COMPARING TWO MEANS: PAIRED SAMPLES

Management of Melodic Kortholt Company compared absenteeism rates in two plants on the
third Monday in November. Of Plant A's 800 employees, 120 were absent. Of Plant B's 1200
employees, 144 were absent. MegaStat's results for a two-tailed test are shown below.

At α = .05, the two-tailed test for a difference in proportions is: not quite significant. Because
the p-value is slightly greater than .05, we cannot reject H0.
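
The MegaStat output is not reproduced here, but the conclusion can be checked by hand. The following is a minimal sketch, assuming the usual pooled two-proportion z test; the function name `two_proportion_z` is hypothetical, and the normal CDF is built from `math.erf` rather than taken from any statistics package.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z test for H0: pi1 = pi2, returning (z, two-tailed p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # combined proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF evaluated at |z| via the error function
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)

# Plant A: 120 absent of 800; Plant B: 144 absent of 1200
z, p_value = two_proportion_z(120, 800, 144, 1200)
print(f"z = {z:.3f}, two-tailed p-value = {p_value:.4f}")
```

The p-value comes out just above .05, which agrees with the "not quite significant" answer.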
10.5 COMPARING TWO PROPORTIONS
Sample Proportions

Pooled Proportion
Test Statistic

10.6 CONFIDENCE INTERVAL FOR THE DIFFERENCE OF TWO PROPORTIONS, π1 - π2

10.7 COMPARING TWO VARIANCES


The F Test

- If the null hypothesis of equal variances is true, this ratio should be near 1

Two-Tailed F Test
Folded F Test

Examples:

To test the researcher's hypothesis, we should use the: independent samples t-test.
Although arranged side by side, these are unrelated data (independent samples).
In a test of a new surgical procedure, the five most respected surgeons in FlatBroke
Township were invited to Carver Hospital. Each surgeon was assigned two patients of the
same age, gender, and overall health. One patient was operated upon in the old way, and
the other in the new way. Both procedures are considered equally safe. The surgery times
are shown below:

      A    B    C    D    E
Old   36   55   28   40   62
New   31   45   28   35   57

The time (in minutes) to complete each procedure was carefully recorded. In a right-tailed
test for a difference of means, the test statistic is: 3.162. The test statistic is
tcalc = (5 - 0)/[(3.5355)/√5] = 3.162.
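
The paired calculation for the surgery times can be reproduced from the differences (old minus new). This is a small sketch using only the standard library; the function name `paired_t` is hypothetical.

```python
import math
import statistics

old = [36, 55, 28, 40, 62]   # surgeons A to E, old procedure
new = [31, 45, 28, 35, 57]   # surgeons A to E, new procedure

def paired_t(before, after):
    """Return (mean difference, sd of differences, t statistic, d.f.)."""
    diffs = [b - a for b, a in zip(before, after)]
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)            # sample standard deviation (n - 1)
    n = len(diffs)
    t_calc = (d_bar - 0) / (s_d / math.sqrt(n))
    return d_bar, s_d, t_calc, n - 1

d_bar, s_d, t_calc, df = paired_t(old, new)
print(f"mean d = {d_bar}, s_d = {s_d:.4f}, t = {t_calc:.3f}, d.f. = {df}")
```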

A corporate analyst is testing whether mean inventory turnover has increased. Inventory
turnover in six randomly chosen product distribution centers (PDCs) is shown.
PDC   This year   Last year
1     5.1         4.1
2     3.9         2.9
3     4.8         2.8
4     3.4         3.4
5     4.6         2.6
6     7.7         4.7
The degrees of freedom for the appropriate test would be: 5. These are paired samples, so
d.f. = 6 - 1 = 5.
The table below shows the mean number of daily errors by air traffic controller trainees
during the first two weeks on the job. We want to perform a paired t-test at α = .05 to see if
the mean daily errors decreased significantly.

The test statistic is: 1.25. Paired data test statistic is tcalc = (0.8286 - 0)/[(1.7547)/√7] =
1.249.

Trainee   1     2     3      4     5      6     7
Week 1    5.1   3     12.1   6.2   11.5   7.8   2.2
Week 2    3.2   2.2   8.7    7.7   9.4    7.8   3.1

Does the Speedo Fastskin II Male Hi-Neck Bodyskin competition racing swimsuit improve a
swimmer's 200-yard individual medley performance times? A test of 100 randomly chosen
male varsity swimmers at several different universities showed that 66 enjoyed improved
times, compared with only 54 of 100 female varsity swimmers. To test for equality in the
proportions of men versus women who experienced improvement, the test statistic is
approximately: 1.73. Combined proportion is pc = (66 + 54)/(100 + 100) = .60, so zcalc =
(.66 - .54)/√[.60(1 - .60)/100 + .60(1 - .60)/100] = 1.73.
Group 1 has a mean of 13.4 and group 2 has a mean of 15.2. Both populations are known to
have a variance of 9.0 and each sample consists of 18 items. What is the test statistic to test
for equality of population means: -1.800. With known variances, zcalc = (13.4 - 15.2)/√[9.0/18
+ 9.0/18] = -1.800.
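
As a quick check of the known-variance case, the sketch below computes the same z statistic. The function name `z_two_means_known_var` is a hypothetical helper, not something from the text.

```python
import math

def z_two_means_known_var(mean1, mean2, var1, var2, n1, n2):
    """z statistic for H0: mu1 = mu2 when both population variances are known."""
    se = math.sqrt(var1 / n1 + var2 / n2)   # standard error of the difference
    return (mean1 - mean2) / se

z_calc = z_two_means_known_var(13.4, 15.2, 9.0, 9.0, 18, 18)
print(f"z = {z_calc:.3f}")
```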
Which is not a type of comparison for which you would anticipate a two-sample test: Current
versus Target. The point of comparison is between two samples, not a benchmark or target.
The coach of an adult Master's Swim class selected eight swimmers within each of the two
age groups shown below. A 50-yard freestyle time is recorded for each swimmer. The
resulting times (seconds) are shown below.
Which statistical test would you choose to compare the two groups: t-test for independent
samples with unknown variances. Despite being arranged side-by-side, there is no link
between the columns. The similar standard deviations suggest that it would be reasonable to
"pool" the variances (pun intended) although this question was not posed.
Assuming unequal variances in a t-test for a zero difference of two means, we would: use a
complicated formula for the degrees of freedom. The formula for Welch's adjusted degrees
of freedom is not easy without a computer.
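
The "complicated formula" is usually written as the Welch-Satterthwaite approximation, d.f. = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 - 1) + (s2²/n2)²/(n2 - 1)]. The sketch below evaluates it; the function name `welch_df` and the sample values are hypothetical, chosen only for illustration.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite adjusted degrees of freedom for unequal variances."""
    v1 = s1 ** 2 / n1    # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2    # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Hypothetical sample standard deviations and sample sizes
df = welch_df(s1=3.0, n1=10, s2=6.0, n2=15)
print(f"Welch d.f. = {df:.2f}")
```

The result always lies between min(n1 - 1, n2 - 1) and n1 + n2 - 2, and is usually truncated to an integer before a t table is consulted.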

CHAPTER 11

Analysis of Variance

I. EXPLAINING VARIATIONS
One-factor ANOVA is a procedure intended to compare the variances of c samples. f

-ANOVA compares several means (although its test statistic is based on variances).

Analysis of variance is a procedure intended to compare the means of c samples. t

-Although its test statistic is based on variances, ANOVA compares several means.

If you have four factors (call them A, B, C, and D) in an ANOVA experiment with
replication, you could have a maximum of four different two-factor interactions. f
-There could be six two-way interactions: AB, AC, AD, BC, BD, CD.
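
The six two-way interactions are simply all pairs drawn from the four factors, which can be listed with a short sketch:

```python
from itertools import combinations

factors = ["A", "B", "C", "D"]

# Every unordered pair of factors is one possible two-way interaction
two_way = ["".join(pair) for pair in combinations(factors, 2)]
print(two_way)        # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
print(len(two_way))   # 6
```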

Hartley's test measures the equality of the means for several groups. f

-Hartley's test is designed to detect unequal population variances.


Hartley's test is to check for unequal variances for c groups. t

-Unequal population variances would violate an ANOVA assumption.

Comparison of c means in one-factor ANOVA can equivalently be done by using c
individual t-tests on c pairs of means at the same α. f


-Multiple two-sample t-tests from the same data set would inflate the overall α.
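
To see roughly how fast the overall α grows, the sketch below counts the c(c - 1)/2 possible pairwise tests and applies 1 - (1 - α)^k. This treats the k tests as independent, which pairwise tests on the same data are not, so the figures are only an approximation; the function name `familywise_error` is hypothetical.

```python
from math import comb

def familywise_error(c, alpha=0.05):
    """Return (number of pairwise tests, approximate chance of any Type I error)."""
    k = comb(c, 2)                    # c(c - 1)/2 pairs of means
    return k, 1 - (1 - alpha) ** k    # assumes the k tests are independent

for c in (3, 5, 7):
    k, overall = familywise_error(c)
    print(f"c = {c}: {k} pairwise tests, overall alpha roughly {overall:.3f}")
```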

ANOVA assumes equal variances within each treatment group. t


-ANOVA checks for unequal means, while assuming homogeneous variances.

Three-factor ANOVA is required if we have three treatment groups (i.e., three
data columns). f
-If there are only three columns of data, we only have one factor (with three
treatments). The hypothesis is whether the three treatment group means are the
same.

ANOVA assumes normal populations. t

-Populations are assumed to be normally distributed and to have equal variances.

Tukey's test compares pairs of treatment means in an ANOVA. t

-Tukey's test is a follow-up to ANOVA to detect which pairs of means differ (if any).

Tukey's test is similar to a two-sample t-test except that it pools the variances for all
c samples. t

-There is a strong analogy with the two-sample t-test, except that we pool all the
variances.

Tukey's test is not needed if we have the overall F statistic for the ANOVA. f

-Tukey's test is a follow-up to ANOVA to detect which pairs of means differ (if any).

ONE-FACTOR ANOVA (COMPLETELY RANDOMIZED MODEL)

1. Data format:

Interaction plots that show crossing lines indicate likely interactions. t

-Interaction plots provide an intuitive visual way of seeing possible interactions.

Interaction plots that show parallel lines would suggest interaction effects. f

-Interaction plots that show crossing lines indicate likely interactions.

In a two-factor ANOVA with three columns and four rows, there can be more than
two interaction effects. f

-There can only be one interaction (row × column).

Sample sizes must be equal in one-factor ANOVA. f


-Sample sizes often are equal by design, but it is not necessary.

In a 3×4 randomized block (two-factor unreplicated) ANOVA, we have 12 treatment
groups. t

-Each row/column combination is a treatment group.

2. Group means

One-factor ANOVA with two groups is equivalent to a two-tailed t-test. t

-The p-values will be the same in either test as long as the t-test is two-tailed.

One factor ANOVA stacked data for five groups will be arranged in five separate
columns. f

-One column will contain the data, while a second column names the group.

Hartley's test is the largest sample mean divided by the smallest sample mean. f

-Hartley's test statistic is the ratio of s²max to s²min.

Tukey's test for five groups would require 10 comparisons of means. t

-The number of possible comparisons is c(c - 1)/2 = 5(4)/2 = 10.

ANOVA is robust to violations of the equal-variance assumption as long as group
sizes are equal. t

-Studies suggest that equal group sizes strengthen the ANOVA test.

Levene's test for homogeneity of variance is attractive because it does not depend
on the assumption of normality. t

-While Hartley's test is sensitive to nonnormality, Levene's test statistic is not.

Tukey's test with seven groups would entail 21 comparisons of means. t

-The number of possible comparisons is c(c - 1)/2 = 7(6)/2 = 21.

Tukey's test pools all the sample variances. t

-In a Tukey test, all c sample variances are combined (weighted by their degrees of
freedom).

It is desirable, but not necessary, that sample sizes be equal in a one-factor ANOVA. t

-Studies suggest that equal group sizes strengthen the ANOVA test.

3. Partitioned sum of squares:

Which is the Excel function to find the critical value of F for α = .05, df1 = 3, df2 = 25?
+ =F.INV.RT(.05, 3, 25)

-The equivalent Excel 2007 function would be =FINV(.05, 3, 25).

Which Excel function gives the right-tail p-value for an ANOVA test with a test statistic
Fcalc = 4.52, n = 29 observations, and c = 4 groups?

+ =F.DIST.RT(4.52, 3, 25)

-The equivalent Excel 2007 function would be =FDIST(4.52, 3, 25).

Variation "within" the ANOVA treatments represents:

+random variation

-Variation within groups is also called error variance or unexplained variance.

Which is not an assumption of ANOVA?

+Equal population sizes for groups.

-It is desirable, but not necessary, that sample sizes be equal in a one-factor
ANOVA.

PROCEED WITH THE HYPOTHESIS TEST

In an ANOVA, when would the F-test statistic be zero?

+When the treatment means are the same.

-If each group mean equals the overall mean, then Fcalc could be zero (an unusual
situation).

ANOVA is used to compare:

+means of several groups.

-Although its test statistic is based on variances, ANOVA compares several means.

Analysis of variance is a technique used to test for:

+equality of two or more means.

-Although its test statistic is based on variances, ANOVA compares several means.

Which of the following is not a characteristic of the F distribution?

+It is negative when s1² is smaller than s2².

-The F distribution is the ratio of two mean squares, so it cannot be negative.

Step 1: State the hypotheses.

In an ANOVA, the SSE (error) sum of squares reflects:

+the variation that is not explained by the factors.


-The error variance or unexplained variance is variation within groups.

Step 2: State the decision rule:

To test the null hypothesis H0: μ1 = μ2 = μ3 using samples from normal populations
with unknown but equal variances, we:

+can safely employ ANOVA

-As long as the variances are equal, we can safely use ANOVA.

Which is not assumed in ANOVA?

+Population variances are known.

-Population variances are almost never known.

In a one-factor ANOVA, the computed value of F will be negative:

+under no circumstances.

-The F distribution is the ratio of two mean squares, so it cannot be negative.

Degrees of freedom for the between-group variation in a one-factor ANOVA with n1
= 5, n2 = 6, n3 = 7 would be: 2

-For between-group variation, we have dfnumerator = c - 1 = 3 - 1 = 2.

Degrees of freedom for the between-group variation in a one-factor ANOVA with n1
= 8, n2 = 5, n3 = 7, n4 = 9 would be: 3.

-For between group variation we have dfnumerator = c - 1 = 4 - 1 = 3.

Step 3: Perform the calculation (find the F statistic):

Using one-factor ANOVA with 30 observations we find at α = .05 that we cannot
reject the null hypothesis of equal means. We increase the sample size from 30
observations to 60 observations and obtain the same value for the sample F-test
statistic. Which is correct?

+We might now be able to reject the null hypothesis.

-With more degrees of freedom, the critical value F.05 will be smaller, so we might
reject.

One-factor analysis of variance: has less power when the number of observations
per group is not identical.

-Studies suggest that equal group sizes strengthen the power of the ANOVA test.

In a one-factor ANOVA, the total sum of squares is equal to:

+the sum of squares within groups plus the sum of squares between groups.

-The basic identity is SSbetween + SSwithin = SStotal.
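
The partition of the total sum of squares can be verified numerically. The sketch below uses small hypothetical data (three groups of unequal size, invented purely for illustration) and computes each sum of squares directly from its definition before forming the F statistic.

```python
from statistics import mean

# Hypothetical data: three treatment groups of unequal size
groups = {
    "A": [12.0, 15.0, 11.0, 14.0],
    "B": [18.0, 17.0, 20.0],
    "C": [10.0, 9.0, 13.0, 12.0, 11.0],
}

all_values = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_values)

# Between groups: group means around the grand mean, weighted by group size
ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())
# Within groups: observations around their own group mean
ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
# Total: observations around the grand mean
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

c, n = len(groups), len(all_values)
f_calc = (ss_between / (c - 1)) / (ss_within / (n - c))

print(f"SSbetween + SSwithin = {ss_between + ss_within:.4f}")
print(f"SStotal              = {ss_total:.4f}")
print(f"Fcalc = {f_calc:.3f} with df = ({c - 1}, {n - c})")
```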


MULTIPLE COMPARISONS:

The within-treatment variation reflects:

+variation among individuals of the same group.

-Variation within groups is also called error variance or unexplained variance.

Given the following ANOVA table (some information is missing), find the F statistic:

+MStreatment = 744/4 = 186, MSerror = (751.5)/15 = 50.1, so F = 186/50.1 = 3.71
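
The same arithmetic, written as a small sketch (the function name `f_statistic` is hypothetical): each mean square is its sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares.

```python
def f_statistic(ss_treatment, df_treatment, ss_error, df_error):
    """Return (MS treatment, MS error, F) from sums of squares and their d.f."""
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return ms_treatment, ms_error, ms_treatment / ms_error

ms_t, ms_e, f_calc = f_statistic(744, 4, 751.5, 15)
print(f"MStreatment = {ms_t:.1f}, MSerror = {ms_e:.1f}, F = {f_calc:.2f}")
```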

Given the following ANOVA table (some information is missing), find the critical value
of F.05.

+For df = (4, 15) we use Appendix F to get F.05 = 3.06.

Identify the degrees of freedom for the treatment and error in this one-factor ANOVA
(blanks indicate missing information).

+3, 20

-Since SS/df = MS, we know that df = SS/MS. Hence, 993/331 = 3 and 1002/50.1 =
20.

For this one-factor ANOVA (some information is missing), how many treatment
groups were there: 4

-Since SS/df = MS, we know that df = SS/MS and, hence, 654/218 = 3 = c - 1.

For this one-factor ANOVA (some information is missing), what is the F-test statistic:
1.703

-Fcalc = (MStreatment)/(MSerror) = 218/128 = 1.703.

Refer to the following partial ANOVA results from Excel (some information is
missing). The F-test statistic is: 2.84.

-Fcalc = (MSbetween)/(MSwithin) = (210.2778)/(74.15) = 2.836.

Refer to the following partial ANOVA results from Excel (some information is
missing). Degrees of freedom for between groups variation are: 3

-SSbetween = 2113.833 - 1483 = 630.833, so df = (630.833)/(210.2778) = 3.

Refer to the following partial ANOVA results from Excel (some information is
missing). SS for between groups variation will be: 630.83.

-SSbetween = 2113.833 - 1483 = 630.833.


Refer to the following partial ANOVA results from Excel (some information is
missing). The number of treatment groups is: 4

-SSbetween = 2113.833 - 1483 = 630.833, so df = (630.833)/(210.2778) = 3 = c - 1.

Refer to the following partial ANOVA results from Excel (some information is
missing). The sample size is: 24.

- (630.833)/(210.2778) = 3 and (1483)/(74.15) = 20, so 3 + 20 = 23 = n - 1.

Refer to the following partial ANOVA results from Excel (some information is
missing). Assuming equal group sizes, the number of observations in each group is:
6.

- (630.833)/(210.2778) = 3 and (1483)/(74.15) = 20, so 3 + 20 = 23 = n - 1 and n/c =
24/4 = 6.
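
All of the answers for this partial ANOVA table follow from the identity df = SS/MS. The sketch below recovers them from the four figures quoted in the explanations; the variable names are illustrative, and the per-group count assumes equal group sizes as the question states.

```python
ss_total = 2113.833
ss_within = 1483.0
ms_between = 210.2778
ms_within = 74.15

ss_between = ss_total - ss_within              # 630.833
df_between = round(ss_between / ms_between)    # c - 1
df_within = round(ss_within / ms_within)       # n - c
c = df_between + 1                             # number of treatment groups
n = df_between + df_within + 1                 # total d.f. is n - 1
per_group = n // c                             # assumes equal group sizes
f_calc = ms_between / ms_within

print(f"df = ({df_between}, {df_within}), c = {c}, n = {n}, "
      f"per group = {per_group}, F = {f_calc:.3f}")
```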

Refer to the following partial ANOVA results from Excel (some information is
missing). Degrees of freedom for the F-test are: 3, 20

Refer to the following partial ANOVA results from Excel (some information is
missing). The critical value of F at α = 0.05 is: 3.10

- (630.833)/(210.2778) = 3 and (1483)/(74.15) = 20, so F.05 = 3.10 for df = (3, 20)

Refer to the following partial ANOVA results from Excel (some information is
missing). At α = 0.05, the difference between group means is: not quite significant.

-The p-value is not less than .05 so we cannot reject the hypothesis of equal means.

The Internal Revenue Service wishes to study the time required to process tax
returns in three regional centers. A random sample of three tax returns is chosen
from each of three centers. The time (in days) required to process each return is
recorded as shown below. The test to use to compare the means for all three groups
would require:

+one-factor ANOVA.

-One factor (three group means to be compared).

The Internal Revenue Service wishes to study the time required to process tax
returns in three regional centers. A random sample of three tax returns is chosen
from each of three centers. The time (in days) required to process each return is
recorded as shown below. Subsequently, an ANOVA test was performed. Degrees of
freedom for the error sum of squares in the ANOVA would be: 6.

-Error df = n - c = 9 - 3 = 6.

The Internal Revenue Service wishes to study the time required to process tax
returns in three regional centers. A random sample of three tax returns is chosen
from each of three centers. The time (in days) required to process each return is
recorded as shown below. Degrees of freedom for the between-groups sum of
squares in the ANOVA would be: 2.

-Between groups df = c - 1= 3 - 1 = 2.
Prof. Gristmill sampled exam scores for five randomly chosen students from each of
his two sections of ACC 200. His sample results are shown. He could test the
population means for equality using:

+either a one-factor ANOVA or a two-tailed t-test.

-As there are only two groups, either ANOVA or a two-tailed t-test will give the same
p-value

Systolic blood pressure of randomly selected HMO patients was recorded on a
particular Wednesday, with the results shown here: The appropriate hypothesis test
is:

+one-factor ANOVA.

-One factor (four group means to be compared).

Systolic blood pressure of randomly selected HMO patients was recorded on a
particular Wednesday, with the results shown here. An ANOVA test was performed
using these data. Degrees of freedom for the between-treatments sum of squares
would be: 3

-Between-treatments df = c - 1 = 4 - 1 = 3

Systolic blood pressure of randomly selected HMO patients was recorded on a
particular Wednesday, with the results shown here. An ANOVA test was performed
using these data. What are the degrees of freedom for the error sum of squares: 16

-Error df = n - c = 20 - 4 = 16

Sound levels are measured at random moments under typical driving conditions for
various full-size truck models. The Excel ANOVA results are shown below. The test
statistic to compare the five means simultaneously is: 4.45.

-Fcalc = (154.1)/(34.6) = 4.45.

Sound levels are measured at random moments under typical driving conditions for
various full-size truck models. The ANOVA results are shown below. The test statistic
for Hartley's test for homogeneity of variance is: 5.04.

-Hartley's H = s²max/s²min = (8.944)²/(3.983)² = 5.04.
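
Hartley's statistic is easy to compute from the sample standard deviations. In this sketch the function name `hartley_h` is hypothetical, and only the largest and smallest standard deviations quoted in the explanation are passed in, since the full Excel output is not reproduced.

```python
def hartley_h(std_devs):
    """Hartley's H: largest sample variance divided by smallest sample variance."""
    variances = [s ** 2 for s in std_devs]
    return max(variances) / min(variances)

# Largest and smallest sample standard deviations from the truck sound-level example
h_calc = hartley_h([8.944, 3.983])
print(f"H = {h_calc:.2f}")
```

The computed H would then be compared with the critical value from a Hartley table for the appropriate degrees of freedom.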

Refer to the following partial ANOVA results from Excel (some information is
missing). ANOVA Table. The number of treatment groups is: 5

-59 - 55 = 4 = c - 1, so c = 5

Refer to the following partial ANOVA results from Excel (some information is
missing). ANOVA Table The F statistic is: 6.91.
-Fcalc = 11,189/1619 = 6.91.

Refer to the following partial ANOVA results from Excel (some information is
missing). ANOVA Table The number of observations in the original sample was: 60

- n - 1 = 59, so n = 60.

Refer to the following partial ANOVA results from Excel (some information is
missing). ANOVA Table Using Appendix F, the 5 percent critical value for the F-test is
approximately: 2.56

- Treatment df = 59 - 55 = 4, so F.05 = 2.56 using df = (4, 50) in Appendix F.

Refer to the following partial ANOVA results from Excel (some information is
missing). ANOVA Table The p-value for the F-test would be: much less than .05.

-Fcalc = 11,189/1619 = 6.91 while F.05 = 2.56 using df = (4, 50) in Appendix F.

Refer to the following partial ANOVA results from Excel (some information is
missing). The MS (mean square) for the treatments is: 239.13

- (717.4)/3 = 239.133

Refer to the following partial ANOVA results from Excel (some information is
missing). The F statistic is: 3.38

-Between-groups MS = (717.4)/3 = 239.133, so Fcalc = (239.133)/(70.675) = 3.383.

Refer to the following partial ANOVA results from Excel (some information is
missing). The number of observations in the entire sample is: 20

- n - 1 = 19, so n = 20

Refer to the following partial ANOVA results from Excel (some information is
missing). The 5 percent critical value for the F test is: 3.24

-Error df = 19 - 3 = 16, so F.05 = 3.24 using df = (3, 16) in Appendix F.

Refer to the following partial ANOVA results from Excel (some information is
missing). Our decision about the hypothesis of equal treatment means is that the null
hypothesis:

+can be rejected at α = .05.

-The p-value is less than .05, so we conclude unequal population means.

To compare the cost of three shipping methods, a random sample of four shipments
is taken for each of three firms. The cost per shipment is shown below. In a
one-factor ANOVA, degrees of freedom for the between-groups sum of squares will
be: 2.

-Between-groups df = c - 1 = 3 - 1 = 2.

To compare the cost of three shipping methods, a random sample of four shipments
is taken for each of three firms. The cost per shipment is shown below. In a
one-factor ANOVA, degrees of freedom for the within-groups sum of squares will be:
9.

-Within-groups df = n - c = 12 - 3 = 9.

To compare the cost of three shipping methods, a random sample of four shipments
is taken for each of three firms. The cost per shipment is shown below. Degrees of
freedom for the total sum of squares in a one-factor ANOVA would be: 11.

-Total df = n - 1 = 12 - 1 = 11.

Refer to the following MegaStat output (some information is missing). The sample
size was n = 65 in a one-factor ANOVA. (Note: This question requires a Tukey table)
At α = .05, which is the critical value of the test statistic for a two-tailed test for a
significant difference in means that are to be compared simultaneously: 2.81.

-T.05 = 2.81 for df = (c, n - c) with c = 5 and n = 65.

Refer to the following MegaStat output (some information is missing). The sample
size was n = 65 in a one-factor ANOVA. Which pairs of days differ significantly?
Note: This question requires access to a Tukey table.

+(Mon, Thu) and (Mon, Wed) and (Mon, Fri) and (Mon, Tue).

-Use T.05 = 2.81 for df = (c, n - c) with c = 5 and n = 65.

Refer to the following MegaStat output (some information is missing). The sample
size was n = 24 in a one-factor ANOVA. At α = .05, what is the critical value of the
Tukey test statistic for a two-tailed test for a significant difference in means that are
to be compared simultaneously? Note: This question requires access to a Tukey
table. +2.80

-T.05 = 2.80 for df = (c, n - c) with c = 4 and n = 24.

Refer to the following MegaStat output (some information is missing). The sample
size was n = 24 in a one-factor ANOVA. Which pairs of meds differ at α = .05? Note:
This question requires access to a Tukey table.

+Med 2, Med 4

-T.05 = 2.80 for df = (c, n - c) with c = 4 and n = 24.

What is the .05 critical value of Hartley's test statistic for a one-factor ANOVA with n1
= 5, n2 = 8, n3 = 7, n4 = 8, n5 = 6, n6 = 8? Note: This question requires access to a
Hartley table. +13.7

-H.05 = 13.7 for df = (c, (n/c) - 1) where c = 6 and n = 42, so we use df = (6, 6).

What is the .05 critical value of Tukey's test statistic for a one-factor ANOVA with n1
= 6, n2 = 6, n3 = 6? Note: This question requires access to a Tukey table. +2.60
-T.05 = 2.60 for df = (c, n - c) with c = 3 and n = 18.

What are the degrees of freedom for Hartley's test statistic for a one-factor ANOVA
with n1 = 5, n2 = 8, n3 = 7, n4 = 8, n5 = 6, n6 = 8: +6, 6

-Use df = (c, (n/c) - 1) where c = 6 and n = 42, or df = (6, 6).

What are the degrees of freedom for Tukey's test statistic for a one-factor ANOVA
with n1 = 6, n2 = 6, n3 = 6? +3, 15

-Use df = (c, n - c) with c = 3 and n = 18, or df = (3, 15).

After performing a one-factor ANOVA test, John noticed that the sample standard
deviations for his four groups were, respectively, 33, 24, 73, and 35. John should:

+use Hartley's test to check his assumptions.

-The unusually large standard deviation for group 3 suggests unequal variances.

Which statement is incorrect

+Hartley's test is needed to determine whether the means of the groups differ.

-Hartley's test compares variances (not means).

Which is not an assumption of unreplicated two-factor ANOVA (randomized block)

+There is factor interaction.

-The usual assumptions apply to a two-factor ANOVA (but no interaction estimate is
possible without replication).

Which is correct concerning a two-factor unreplicated (randomized block) ANOVA

+No interaction effect is estimated.

-We cannot estimate the interaction effect without replication in a two-factor ANOVA.

In a two-factor unreplicated (randomized block) ANOVA, what is the F statistic for the
treatment effect given that SSA (treatments) = 216, SSB (block) = 126, SSE (error) =
18:

+Can't tell without more information

-We cannot calculate the mean squares without knowing r, c, and n, so no F
statistics can be computed.

Three bottles of wine are tasted by three experts. Each rater assigns a rating (scale
is from 1 = terrible to 10 = superb). Which test would you use for the most obvious
hypothesis

+Two-factor ANOVA without replication

-Only one observation per row/column cell (two factors but no replication).
To compare the cost of three shipping methods, a firm ships material to each of four
different destinations over a six-month period. The average cost per shipment is
shown below. Which test would be appropriate

+Two-factor ANOVA without replication

-Only one observation per row/column cell (two factors but no replication).

To compare the cost of three shipping methods, a firm ships material to each of four
different destinations over a six-month period. The average cost per shipment is
shown below. For the appropriate type of ANOVA, total degrees of freedom would
be: 11

-df = n - 1 = 12 - 1 = 11.

Here is an Excel ANOVA table that summarizes the results of an experiment to
assess the effects of ambient noise level and plant location on worker productivity.
The test used α = 0.05. Is the effect of plant location significant at α = .05: +No

-The p-value is not less than .05, so plant location has no significant effect.

Here is an Excel ANOVA table that summarizes the results of an experiment to
assess the effects of ambient noise level and plant location on worker productivity.
The test used α = 0.05. Is the effect of noise level significant at α = .05: +Yes

-The p-value is much less than .05, so noise level has a significant effect.

Here is an Excel ANOVA table that summarizes the results of an experiment to
assess the effects of ambient noise level and plant location on worker productivity.
The test used α = 0.05. The experimental design and ANOVA appear to be:

+unreplicated two-factor

-The absence of an interaction suggests an unreplicated two-factor model.

Here is an Excel ANOVA table that summarizes the results of an experiment to
assess the effects of ambient noise level and plant location on worker productivity.
The test used α = 0.05. The sample size is: 16

-For unreplicated two-factor ANOVA, total df = 3 + 3 + 9 = 15 = n - 1, so n = 16.

At the Seymour Clinic, the number of patients seen by three doctors over three days
is as follows: This data set would call for: +two-factor ANOVA without replication.

-Only one observation per row/column cell (two factors but no replication).

At the Seymour Clinic, the number of patients seen by three doctors over three days
is as follows: Degrees of freedom for the error sum of squares would be: 8

-For unreplicated two-factor ANOVA, the error df = (r - 1)(c - 1) = (5 - 1)(3 - 1) = 8.

Here is an Excel ANOVA table for an experiment that analyzed factors that may
affect patients' blood pressure (some information is missing). The number of
medication types is: 2

-df = 1 = (number of medications - 1), so there were 2 medications.


Here is an Excel ANOVA table for an experiment that analyzed two factors that may
affect patients' blood pressure (some information is missing). The number of patient
age groups is: 4.

-For patient age group, df = (25.0938)/(8.3646) = 3 = c - 1 (so 4 age groups).

Here is an Excel ANOVA table for an experiment that analyzed two factors that may
affect patients' blood pressure (some information is missing). The number of patients
per replication is: 4.

-c - 1 = (25.0938)/(8.3646) = 3 (so 4 age groups), r - 1 = 1 (so 2 meds), total df = 1 +
3 + 3 + 24 = 31 = n - 1 (so n = 32), 8 treatments (2 × 4) and thus 32/8 = 4
replications per treatment.
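
The bookkeeping for a replicated two-factor design can be sketched as follows. The function name `replicated_design` is hypothetical; it takes the four degrees-of-freedom entries from the ANOVA table and recovers the number of levels, the sample size, and the replications per cell, assuming a balanced design.

```python
def replicated_design(df_rows, df_cols, df_interaction, df_error):
    """Recover (r, c, n, cells, replications per cell) from ANOVA d.f. entries."""
    r = df_rows + 1                                       # levels of the row factor
    c = df_cols + 1                                       # levels of the column factor
    n = df_rows + df_cols + df_interaction + df_error + 1  # total d.f. is n - 1
    cells = r * c                                         # row/column treatment combinations
    return r, c, n, cells, n // cells                     # assumes equal replication

# Blood pressure example: medication d.f. = 1, age group d.f. = 3,
# interaction d.f. = 3, error d.f. = 24
r, c, n, cells, reps = replicated_design(1, 3, 3, 24)
print(f"{r} meds x {c} age groups = {cells} cells, n = {n}, {reps} per cell")
```

The same function applied to the shift and supplier example, with d.f. of 2, 2, 4, and 36, gives 9 cells, n = 45, and 5 replications per cell.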

Here is an Excel ANOVA table for an experiment that analyzed two factors that may
affect patients' blood pressure (some information is missing). The overall sample
size is: 32.

- c - 1 = (25.0938)/(8.3646) = 3 (so 4 age groups), r - 1 = 1 (so 2 meds), total df
= 1 + 3 + 3 + 24 = 31 = n - 1 (so n = 32).

Here is an Excel ANOVA table for an experiment that analyzed two factors that may
affect patients' blood pressure (some information is missing). At α = .05 the effect of
medication type is: significant.

-The p-value is much less than .05, so medication type has a highly significant effect.

Here is an Excel ANOVA table for an experiment that analyzed two factors that may
affect patients' blood pressure (some information is missing). At α = .01 the effect of
patient age is: +not quite significant.

-The p-value of .011 is greater than .01, so age group does not have a significant
effect at α = .01 (however, it is a very close decision).

Here is an Excel ANOVA table for an experiment that analyzed two factors that may
affect patients' blood pressure (some information is missing). At α = .10 the
interaction is: +insignificant

-The p-value is greater than .10, so there is no significant interaction at α = .10.

Three randomly chosen pieces of four types of PVC pipe of equal wall thickness are
tested to determine the burst strength (in pounds per square inch) under three
temperature conditions, yielding the results shown below. Which test would be
appropriate

+Two-factor ANOVA with replication

-Within each treatment combination we have three replications.

Three randomly chosen pieces of four types of PVC pipe of equal wall thickness are
tested to determine the burst strength (in pounds per square inch) under three
temperature conditions, yielding the results shown below. Total degrees of freedom
for the ANOVA would be: 35

-Total df = n - 1 = 36 - 1 = 35.
A firm is studying the effect of work shift and parts supplier on its defect rate
(dependent variable is defects per 1000). The resulting ANOVA results are shown
below (some information is missing). How many suppliers were there: 3

- 44 - 36 - 4 - 2 = 2 = c - 1, so there were 3 suppliers.

A firm is studying the effect of work shift and parts supplier on its defect rate
(dependent variable is defects per 1000). The resulting ANOVA results are shown
below (some information is missing). How many replications per cell were there: 5.

- n - 1 = 44 (n = 45), 44 - 36 - 4 - 2 = 2 = c - 1 (3 suppliers), r - 1 = 2 (3 shifts),
so 3 × 3 = 9 row/column cells and hence 45/9 = 5 replications per treatment
combination.

A firm is studying the effect of work shift and parts supplier on its defect rate
(dependent variable is defects per 1000). The resulting ANOVA results are shown
below (some information is missing). At α = 0.01, the effect of supplier is: +clearly
insignificant.

-The p-value is much greater than .05, so supplier has no significant effect.

A firm is studying the effect of work shift and parts supplier on its defect rate
(dependent variable is defects per 1000). The resulting ANOVA results are shown
below (some information is missing). The number of observations was: 45.

- n - 1 = 44 (n = 45).

A firm is studying the effect of work shift and parts supplier on its defect rate
(dependent variable is defects per 1000). The resulting ANOVA results are shown
below (some information is missing). At α = 0.01, the interaction effect is: +not quite
significant.

-The p-value is much less than .05, so there is a significant interaction effect.

A firm is concerned with variability in hourly output at several factories and shifts.
Here are the results of an ANOVA using output per hour as the dependent variable
(some information is missing). The original data matrix has how many treatments
(rows × columns): 6.

-r - 1 = 1 (2 factories), c - 1 = 2 (3 shifts), so 2 × 3 = 6 row/column cells.

A firm is concerned with variability in hourly output at several factories and shifts.
Here are the results of an ANOVA using output per hour as the dependent variable
(some information is missing). The number of observations in each treatment cell
(row-column intersection) is: 3.

- n - 1 = 17 (n = 18), r - 1 = 1 (2 factories), c - 1 = 2 (3 shifts), so 2 × 3 = 6
row/column cells and hence 18/6 = 3 replications per treatment combination.

A firm is concerned with variability in hourly output at several factories and shifts.
Here are the results of an ANOVA using output per hour as the dependent variable
(some information is missing). At α = 0.01 the effect of factory is: +clearly significant

-The p-value is much less than .05, so factory has a significant effect.

A firm is concerned with variability in hourly output at several factories and shifts.
Here are the results of an ANOVA using output per hour as the dependent variable
(some information is missing). The p-value for the interaction effect is going to be:

+very small (near 0)

-For interaction, Fcalc = (40454.167)/(719.444) = 56.23, so very small p-value.

Sound engineers studied factors that might affect the output (in decibels) of a rock
concert speaker system. The results of their ANOVA tests are shown (some
information is missing). Which is the number of amplifiers and positions tested: 2, 4

- r - 1 = 1 (2 amplifiers), c - 1 = 3 (4 positions).

Sound engineers studied factors that might affect the output (in decibels) of a rock
concert speaker system. The results of their ANOVA tests are shown (some
information is missing). The number of observations per cell was: 3

- n - 1 = 23 (n = 24), r - 1 = 1 (2 amplifiers), c - 1 = 3 (4 positions), so 2 × 4 = 8
row/column cells and hence 24/8 = 3 replications per treatment combination.
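The replication counts in these answers all follow the same arithmetic: add 1 to each degrees-of-freedom figure, multiply rows by columns, and divide n by the number of cells. A minimal sketch, assuming a balanced design (equal replications in every cell); the function name `replications_per_cell` is an illustrative choice, not something from the original ANOVA output:

```python
def replications_per_cell(total_df, row_df, col_df):
    """Return (rows, cols, cells, replications) for a balanced two-factor design."""
    n = total_df + 1          # total df = n - 1
    rows = row_df + 1         # row df = r - 1
    cols = col_df + 1         # column df = c - 1
    cells = rows * cols       # number of treatment combinations
    if n % cells != 0:
        raise ValueError("n is not a multiple of the cell count (unbalanced design?)")
    return rows, cols, cells, n // cells


# Speaker system: total df = 23, amplifier df = 1, position df = 3
print(replications_per_cell(23, 1, 3))
# Factories and shifts: total df = 17, factory df = 1, shift df = 2
print(replications_per_cell(17, 1, 2))
```

The same call with total df = 44 and 2 df for each factor gives the 3 × 3 layout with 5 replications used in the work shift and supplier answers.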

Sound engineers studied factors that might affect the output (in decibels) of a rock
concert speaker system. The desired level of significance was α = .05. The results of
their ANOVA tests are shown (some information is missing). The most reasonable
conclusion at α = .05 about the three sources of variation (amplifier, position, and
interaction) would be that their effects are:

+very significant, almost significant, insignificant.

-The p-value is smaller than .05 for amplifier, but not quite for position and definitely
not for the interaction term.

Sound engineers studied factors that might affect the output, in decibels, of a rock
concert speaker system. The results of their ANOVA tests are shown (some
information is missing). The F statistic for amplifier was: 10.16

-Fcalc = (99.02344)/(9.742188) = 10.16.

A multinational firm manufactures several types of 1280 × 1024 LCD displays in
several locations. They designed a sampling experiment to analyze the number of
pixels per screen that have significant color degradation after 52,560 hours (six
years of continuous use) using accelerated life testing. The Excel ANOVA table for
their experiment is shown below. Some table entries have been obscured. The
response variable (Y) is the number of degraded pixels in a given display. Degrees
of freedom for display type will be: 4.

-For display type, df = (233.2333)/(58.30833) = 4.


A multinational firm manufactures several types of 1280 × 1024 LCD displays in
several locations. They designed a sampling experiment to analyze the number of
pixels per screen that have significant color degradation after 52,560 hours (six
years of continuous use) using accelerated life testing. The Excel ANOVA table for
their experiment is shown below. Some table entries have been obscured. The
response variable (Y) is the number of degraded pixels in a given display. How many
display types were there: 5

-For display type, df = (233.2333)/(58.30833) = 4 = c - 1 (so 5 display types).

A multinational firm manufactures several types of 1280 × 1024 LCD displays in
several locations. They designed a sampling experiment to analyze the number of
pixels per screen that have significant color degradation after 52,560 hours (six
years of continuous use) using accelerated life testing. The Excel ANOVA table for
their experiment is shown below. Some table entries have been obscured. The
response variable (Y) is the number of degraded pixels in a given display. How many
countries were studied: 3.

-For country, df = (202.9)/(101.45) = 2 = r - 1 (so 3 countries).

A multinational firm manufactures several types of 1280 × 1024 LCD displays in
several locations. They designed a sampling experiment to analyze the number of
pixels per screen that have significant color degradation after 52,560 hours (six
years of continuous use) using accelerated life testing. The Excel ANOVA table for
their experiment is shown below. Some table entries have been obscured. The
response variable (Y) is the number of degraded pixels in a given display. The F
statistic for display effect is: 2.39

-Fcalc = (58.30833)/(24.36667) = 2.393.

A multinational firm manufactures several types of 1280 × 1024 LCD displays in
several locations. They designed a sampling experiment to analyze the number of
pixels per screen that have significant color degradation after 52,560 hours (six
years of continuous use) using accelerated life testing. The Excel ANOVA table for
their experiment is shown below. Some table entries have been obscured. The
response variable (Y) is the number of degraded pixels in a given display. At α = .05,
the interaction effect is: +clearly insignificant.

-Fcalc = (18.47084)/(24.36667) = 0.76, which is far less than F.05 for df = (8, 45).

A multinational firm manufactures several types of 1280 × 1024 LCD displays in
several locations. They designed a sampling experiment to analyze the number of
pixels per screen that have significant color degradation after 52,560 hours (six
years of continuous use) using accelerated life testing. The Excel ANOVA table for
their experiment is shown below. Some table entries have been obscured. The
response variable (Y) is the number of degraded pixels in a given display. The
numerator degrees of freedom for the interaction test would be: 8.

- For interaction, df = (147.7667)/(18.47084) = 8.
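The LCD display answers recover missing table entries from two identities: df = SS/MS and Fcalc = MS/MS(error). A minimal sketch of that arithmetic, using only the SS and MS figures quoted in the answers above; the dictionary layout and the helper name `df_from_ss_ms` are illustrative assumptions, not part of the Excel output:

```python
# Sums of squares (SS) and mean squares (MS) quoted in the answers above.
anova = {
    "display type": {"ss": 233.2333, "ms": 58.30833},
    "country": {"ss": 202.9, "ms": 101.45},
    "interaction": {"ss": 147.7667, "ms": 18.47084},
}
# Error mean square, the denominator in each quoted Fcalc.
MS_ERROR = 24.36667


def df_from_ss_ms(ss, ms):
    """Recover degrees of freedom from the identity MS = SS / df."""
    return round(ss / ms)


for source, row in anova.items():
    df = df_from_ss_ms(row["ss"], row["ms"])
    f_calc = row["ms"] / MS_ERROR
    print(f"{source}: df = {df}, Fcalc = {f_calc:.3f}")
```

Rounding SS/MS to the nearest integer absorbs the small rounding error in the printed table values.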


A veterinarian notes the age (months) at which dogs are brought to the clinic to be
neutered, what kind of test would be used: Two-factor ANOVA with replication
- There are three replications and two factors.
A veterinarian notes the age (months) at which dogs are brought in to the clinic to be
neutered. Numerator degrees of freedom for the ANOVA interaction test would be: 2
- Two factor ANOVA with replication, interaction df = (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2
A veterinarian notes the age (months) at which dogs are brought in to the clinic to be
neutered. Total degrees of freedom for a two-factor replicated ANOVA would be: 17
- n - 1 = 18 - 1 = 17

Refer to the following partial ANOVA results from Excel (some information is
missing), how many nozzle settings were observed: 2
- For nozzle, df = 1 = r - 1 (so 2 nozzles)
Refer to the following partial ANOVA results from Excel (some information is
missing). Degrees of freedom for pressure level would be: 2.
- For pressure, df = (8.07444)/(4.03722) = 2 = c - 1 (so 3 pressures).
Refer to the following partial ANOVA results from Excel (some information is missing).
Error degrees of freedom would be: 12
- For error, df = (8.5400)/(0.711667) = 12.
Refer to the following partial ANOVA results from Excel (some information is
missing). The overall sample size was: 18
- Divide each SS by its MS to get 1 + 2 + 2 + 12 = 17 = n - 1 (so n = 18).
Refer to the following partial ANOVA results from Excel (some information is
missing), how many pressure levels were observed: 3
- For pressure, df = (8.07444)/(4.03722) = 2 = c - 1 (so 3 pressures)
Refer to the following partial ANOVA results from Excel (some information is
missing). At α = .05, the critical F value for nozzle setting is: 4.75.
- Using Appendix F with df = (1, 12), we get F.05 = 4.75.
Refer to the following partial ANOVA results from Excel (some information is missing).
The form of the original data matrix is: 2 × 3 table.
- Divide each SS by its MS to get (r - 1) = 1, (c - 1) = 2, so r × c = 2 × 3 = 6
treatments.
Refer to the following partial ANOVA results from Excel (some information is
missing). The number of replications per treatment was: 3.
- Divide each SS by its MS to get total df = 1 + 2 + 2 + 12 = 17 = n - 1, so n = 18 and
r × c = 2 × 3 = 6 treatments, giving three replications per treatment.
Refer to the following partial ANOVA results from Excel (some information is
missing). At α = 0.05, the effect of nozzle setting is: just barely significant.
- Its p-value is slightly less than .05, so the nozzle effect is barely significant.
As shown below, a hospital recorded the number of minutes spent in post-op
recovery by three randomly chosen knee-surgery patients in each category, based
on age and type of surgery, which is the most appropriate test: Two-factor ANOVA
with replication
- Three replications per cell with two factors
Refer to the following partial ANOVA results from Excel (some information is
missing). The response variable was Y = maximum amount of water pumped from
wells (gallons per minute). The degrees of freedom for age of well is: 2.
- For age of well, df = 26 - 18 - 4 - 2 = 2 (so 3 ages)
Refer to the following partial ANOVA results from Excel (some information is
missing). The response variable was Y = maximum amount of water pumped from
wells (gallons per minute). The F statistic for depth of well is: 25.78.
- Fcalc = (1225)/(47.519) = 25.779.
Refer to the following partial ANOVA results from Excel (some information is
missing). The response variable was Y = maximum amount of water pumped from
wells (gallons per minute). The MS for interaction is: 8.17
- For interaction, we have MS = (32.667)/4 = 8.167.
Refer to the following partial ANOVA results from Excel (some information is
missing). The response variable was Y = maximum amount of water pumped from
wells (gallons per minute). The MS for age of well is: 182.33
- By subtraction, for age of well df = 26 - 18 - 4 - 2 = 2, so MS = (364.667)/(2) =
182.33.
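The well-water answers fill in the missing entries by subtraction and division. A minimal sketch of those steps, using the degrees of freedom, sums of squares, and mean squares quoted in the answers; the variable names are illustrative:

```python
# Degrees of freedom quoted in the answers: total = 26, error = 18,
# interaction = 4, depth of well = 2. Age of well is found by subtraction.
df_total, df_error, df_interaction, df_depth = 26, 18, 4, 2
df_age = df_total - df_error - df_interaction - df_depth

ms_age = 364.667 / df_age                  # SS for age of well / its df
ms_interaction = 32.667 / df_interaction   # SS for interaction / its df
f_depth = 1225 / 47.519                    # MS for depth / MS for error

print(f"df(age) = {df_age}")
print(f"MS(age) = {ms_age:.3f}")
print(f"MS(interaction) = {ms_interaction:.3f}")
print(f"F(depth) = {f_depth:.3f}")
```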

CHAPTER 12

Regression
28/11 homework
A scatter plot is used to visualize the association (or lack of association) between
two quantitative variables. t

The correlation coefficient r measures the strength of the linear relationship
between two variables. t

Pearson's correlation coefficient (r) requires that both variables be interval or ratio data. t

If r = .55 and n = 16, then the correlation is significant at α = .05 in a two-tailed test. t

A sample correlation r = .40 indicates a stronger linear relationship than r = -.60. f


A common source of spurious correlation between X and Y is when a third unspecified
variable Z affects both X and Y. t

The correlation coefficient r always has the same sign as b1 in Y = b0 + b1X. t

The fitted intercept in a regression has little meaning if no data values near X = 0 have
been observed. t

The least squares regression line is obtained when the sum of the squared residuals is
minimized. t

In a simple regression, if the coefficient for X is positive and significantly different from zero,
then an increase in X is associated with an increase in the mean (i.e., the expected value)
of Y. t

In least-squares regression, the residuals e1, e2, . . . , en will always have a zero mean. t

When using the least squares method, the column of residuals always sums to zero. t

In the model Sales = 268 + 7.37 Ads, an additional $1 spent on ads will increase sales by
7.37 percent. f

If R2 = .36 in the model Sales = 268 + 7.37 Ads with n = 50, the two-tailed test for
correlation at α = .05 would say that there is a significant correlation between Sales and
Ads. t

If R2 = .36 in the model Sales = 268 + 7.37 Ads, then Ads explains 36 percent of the
variation in Sales. t

The ordinary least squares regression line always passes through the point (x̄, ȳ). t

The least squares regression line gives unbiased estimates of β0 and β1. t

If SSR is 1800 and SSE is 200, then R2 is .90. t

In a simple regression, the correlation coefficient r is the square root of R2 . t

The width of a prediction interval for an individual value of Y is less than standard error se. f

If SSE is near zero in a regression, the statistician will conclude that the proposed model
probably has too poor a fit to be useful. f

For a regression with 200 observations, we expect that about 10 residuals will exceed
two standard errors. t

Confidence intervals for predicted Y are less precise when the residuals are very small. f

Cause-and-effect direction between X and Y may be determined by running the regression
twice and seeing whether Y = β0 + β1X or X = β0 + β1Y has the larger R2. f

If you have a strong outlier in the residuals, it may represent a different causal system. t

Using the ordinary least squares method ensures that the residuals will be normally
distributed. f

The ordinary least squares method of estimation minimizes the estimated slope and
intercept. f

A negative correlation between two variables X and Y usually yields a negative p-value for
r. f

In linear regression between two variables, a significant relationship exists when the
p-value of the t test statistic for the slope is greater than α. f

The larger the absolute value of the t statistic of the slope in a simple linear regression, the
stronger the linear relationship exists between X and Y. t

In simple linear regression, the p-value of the slope will always equal the p-value of the F
statistic. t

In simple linear regression, the coefficient of determination (R2 ) is estimated from sums of
squares in the ANOVA table. t

An observation with high leverage will have a large residual (usually an outlier). f

A prediction interval for Y is narrower than the corresponding confidence interval for the
mean of Y. f

When X is farther from its mean, the prediction interval and confidence interval for Y
become wider. t

The total sum of squares (SST) will never exceed the regression sum of squares (SSR). f

"High leverage" would refer to a data point that is poorly predicted by the model (large
residual). f

The studentized residuals permit us to detect cases where the regression predicts poorly. t

A poor prediction (large residual) indicates an observation with high leverage. f

Ill-conditioned refers to a variable whose units are too large or too small (e.g.,
$2,434,567). t

A simple decimal transformation (e.g., from 18,291 to 18.291) often improves data
conditioning. t

Two-tailed t-tests are often used because any predictor that differs significantly from zero in
a two-tailed test will also be significantly greater than zero or less than zero in a one-tailed
test at the same α. t

A predictor that is significant in a one-tailed t-test will also be significant in a two-tailed test
at the same level of significance α. f

Omission of a relevant predictor is a common source of model misspecification. t


The regression line must pass through the origin. f

Outliers can be detected by examining the standardized residuals. t

In a simple regression, the F statistic is calculated by taking the ratio of MSR to the MSE. t

The coefficient of determination is the percentage of the total variation in the response
variable Y that is explained by the predictor X. t

A different confidence interval exists for the mean value of Y for each different value of X. t

In a simple regression, there are n - 2 degrees of freedom associated with the error sum of
squares (SSE). t

A prediction interval for Y is widest when X is near its mean. f

In a two-tailed test for correlation at α = .05, a sample correlation coefficient r = 0.42 with n
= 25 is significantly different than zero. t

In correlation analysis, neither X nor Y is designated as the independent variable. t

A negative value for the correlation coefficient (r) implies a negative value for the slope
(b1). t

High leverage for an observation indicates that X is far from its mean. t

Autocorrelated errors are not usually a concern for regression models using cross-sectional
data. t

There are usually several possible regression lines that will minimize the sum of squared
errors. f

When the errors in a regression model are not independent, the regression model is said to
have autocorrelation. t

In a simple bivariate regression, Fcalc = tcalc². t

Correlation analysis primarily measures the degree of the linear relationship between X and
Y. t

5/12 homework

The variable used to predict another variable is called the independent variable.

The standard error of the regression is based on squared deviations from the regression
line.

A local trucking company fitted a regression to relate the travel time (days) of its
shipments as a function of the distance traveled (miles). The fitted regression is
Time = -7.126 + 0.0214 Distance, based on a sample of 20 shipments. The
estimated standard error of the slope is 0.0053. Find the value of tcalc to test for
zero slope.
4.04
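The test statistic for zero slope is the estimated slope divided by its standard error, with n - 2 degrees of freedom. A minimal sketch using the trucking figures from the question; the function name `slope_t_stat` is illustrative:

```python
def slope_t_stat(b1, se_b1):
    """t statistic for H0: slope = 0 in a simple regression."""
    return b1 / se_b1


n = 20                      # shipments in the sample
df = n - 2                  # degrees of freedom for the slope test
t_calc = slope_t_stat(0.0214, 0.0053)
print(f"tcalc = {t_calc:.2f} with df = {df}")
```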

A local trucking company fitted a regression to relate the travel time (days) of its
shipments as a function of the distance traveled (miles). The fitted regression is
Time = -7.126 + .0214 Distance, based on a sample of 20 shipments. The
estimated standard error of the slope is 0.0053. Find the critical value for a
right-tailed test to see if the slope is positive, using α = .05.
1.734

If the attendance at a baseball game is to be predicted by the equation
Attendance = 16,500 - 75 Temperature, what would be the predicted attendance
if Temperature is 90 degrees?
9,750

A hypothesis test is conducted at the 5 percent level of significance to test
whether the population correlation is zero. If the sample consists of 25
observations and the correlation coefficient is 0.60, then the computed test
statistic would be:
3.597
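The test statistic for zero correlation is t = r·sqrt((n - 2)/(1 - r²)), with n - 2 degrees of freedom. A minimal sketch that reproduces this answer and can be reused for the other correlation items in these notes; the function name `correlation_t_stat` is illustrative:

```python
import math


def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# r = 0.60 with n = 25 observations, as in the question above
print(f"{correlation_t_stat(0.60, 25):.3f}")
```

The sign of t follows the sign of r, so a negative correlation gives a negative test statistic.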

Which of the following is not a characteristic of the F-test in a simple regression?
The F-test gives a different p-value than the t-test.

A researcher's Excel results are shown below using Femlab (labor force
participation rate among females) to try to predict Cancer (death rate per 100,000
population due to cancer) in the 50 U.S. states.
The standard error is too high for this model to be of any predictive
use.

A researcher's results are shown below using Femlab (labor force participation
rate among females) to try to predict Cancer (death rate per 100,000 population
due to cancer) in the 50 U.S. states.
This model explains about 10 percent of the variation in state cancer
rates.

A researcher's results are shown below using Femlab (labor force participation
rate among females) to try to predict Cancer (death rate per 100,000 population
due to cancer) in the 50 U.S. states.
.0982

A news network stated that a study had found a positive correlation between the
number of children a worker has and his or her earnings last year. You may
conclude that:
causation is in serious doubt.

William used a sample of 68 large U.S. cities to estimate the relationship
between Crime (annual property crimes per 100,000 persons) and Income
(median annual income per capita, in dollars). His estimated regression equation
was Crime = 428 + 0.050 Income. We can conclude that:
the intercept is irrelevant since zero median income is impossible in a large
city.

Mary used a sample of 68 large U.S. cities to estimate the relationship between
Crime (annual property crimes per 100,000 persons) and Income (median annual
income per capita, in dollars). Her estimated regression equation was Crime =
428 + 0.050 Income. If Income decreases by 1000, we would expect that Crime
will:
decrease by 50.

Amelia used a random sample of 100 accounts receivable to estimate the
relationship between Days (number of days from billing to receipt of payment)
and Size (size of balance due in dollars). Her estimated regression equation was
Days = 22 + 0.0047 Size with a correlation coefficient of .300. From this
information we can conclude that:
9 percent of the variation in Days is explained by Size.

Prediction intervals for Y are narrowest when:
the value of X is near the mean of X.

If n = 15 and r = .4296, the corresponding t-statistic to test for zero correlation is:
1.71

Using a two-tailed test at α = .05 for n = 30, we would reject the hypothesis of
zero correlation if the absolute value of r exceeds:
.3609
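The critical correlation comes from solving the t formula for r: r_crit = t / sqrt(t² + n - 2). A minimal sketch, assuming the critical t is read from a Student's t table (2.048 for df = 28, two-tailed α = .05); the function name `critical_r` is illustrative:

```python
import math


def critical_r(t_crit, n):
    """Critical value of r for a given critical t with n - 2 degrees of freedom."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)


# Table value (assumed): t.025 = 2.048 for df = 30 - 2 = 28
print(f"{critical_r(2.048, 30):.4f}")
```

The same function applies to other sample sizes once the matching table value of t is supplied, for example 2.101 for n = 20 or 2.032 for n = 36.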

The ordinary least squares (OLS) method of estimation will minimize:
neither the slope nor the intercept.

A standardized residual ei = -2.205 indicates:
a rather poor prediction.

In a simple regression, which would suggest a significant relationship between X
and Y?
Large t statistic for the slope

Which is indicative of an inverse relationship between X and Y?
A negative correlation coefficient

Which is not correct regarding the estimated slope of the OLS regression line?
It may be regarded as zero if its p-value is less than α.

Simple regression analysis means that:
We have only one explanatory variable.

The sample coefficient of correlation does not have which property?
It assumes that Y is the dependent variable.

When comparing the 90 percent prediction and confidence intervals for a given
regression analysis:
the prediction interval is wider than the confidence interval.

Which is not true of the coefficient of determination?
It is negative when there is an inverse relationship between X and Y.

If the fitted regression is Y = 3.5 + 2.1X (R2 = .25, n = 25), it is incorrect to
conclude that:
Y increases 2.1 percent for a 1 percent increase in X.

In a simple regression Y = b0 + b1X where Y = number of robberies in a city (thousands of
robberies), X = size of the police force in a city (thousands of police), and n = 45 randomly
chosen large U.S. cities in 2008, we would be least likely to see which problem?
Autocorrelated residuals

When homoscedasticity exists, we expect that a plot of the residuals versus the fitted Y:
will show no pattern at all.

Which statement is not correct? Heteroscedastic residuals will have roughly the same
variance for any value of X.

In a simple bivariate regression with 25 observations, which statement is most nearly
correct? A leverage statistic of 0.16 or more would indicate high leverage.

A regression was estimated using these variables: Y = annual value of reported bank
robbery losses in all U.S. banks ($millions), X = annual value of currency held by all U.S.
banks ($millions), n = 100 years (1912 through 2011). We would not anticipate:
autocorrelated residuals due to time-series data.

A fitted regression for an exam in Prof. Hardtack's class showed Score = 20 + 7 Study,
where Score is the student's exam score and Study is the student's study hours. The
regression yielded R2 = 0.50 and SE = 8. Bob studied 9 hours. The quick 95 percent
prediction interval for Bob's grade is approximately 67 to 99.
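The "quick" 95 percent prediction interval is the fitted value plus or minus two standard errors. A minimal sketch with Bob's numbers; the function name `quick_prediction_interval` is illustrative, and the rule ignores the extra widening that the exact interval adds when X is far from its mean:

```python
def quick_prediction_interval(b0, b1, x, se):
    """Approximate 95% prediction interval: y_hat +/- 2 standard errors."""
    y_hat = b0 + b1 * x
    return y_hat - 2 * se, y_hat + 2 * se


# Score = 20 + 7 Study, SE = 8, Bob studied 9 hours
low, high = quick_prediction_interval(20, 7, 9, 8)
print(low, high)  # 67 99
```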

Which is not an assumption of least squares regression?
Normal X values

In a simple bivariate regression with 60 observations there will be ___ residuals.
60

Which is correct to find the value of the coefficient of determination (R2)?
SSR/SST
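Because SST = SSR + SSE, the coefficient of determination can be computed from any two of the three sums of squares. A minimal sketch that checks the two SSR/SSE pairs used elsewhere in these notes; the function name `r_squared` is illustrative:

```python
def r_squared(ssr, sse):
    """Coefficient of determination from regression and error sums of squares."""
    sst = ssr + sse           # total variation = explained + unexplained
    return ssr / sst


print(r_squared(1800, 200))   # SSR = 1800, SSE = 200
print(r_squared(2592, 608))   # SSR = 2592, SSE = 608
```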

The critical value for a two-tailed test of H0: β1 = 0 at α = .05 in a simple
regression with 22 observations is:
±2.086

In a sample of size n = 23, a sample correlation of r = .400 provides sufficient
evidence to conclude that the population correlation coefficient exceeds zero in a
right-tailed test at:
α = .05 but not α = .01.

In a sample of n = 23, the Student's t test statistic for a correlation of r = .500
would be:
2.646

In a sample of n = 23, the critical value of the correlation coefficient for a
two-tailed test at α = .05 is:
±.412

In a sample of n = 23, the critical value of Student's t for a two-tailed test of
significance for a simple bivariate regression at α = .05 is:
±2.080

In a sample of n = 40, a sample correlation of r = .400 provides sufficient
evidence to conclude that the population correlation coefficient exceeds zero in a
right-tailed test at:
both α = .025 and α = .05.

In a sample of n = 20, the Student's t test statistic for a correlation of r = .400
would be:
1.852

In a sample of n = 20, the critical value of the correlation coefficient for a
two-tailed test at α = .05 is:
±.444

In a sample of n = 27, the critical value of Student's t for a two-tailed test of
significance for a simple bivariate regression at α = .05 is:
±2.060

In a sample of size n = 36, a sample correlation of r = -.450 provides sufficient
evidence to conclude that the population correlation coefficient differs
significantly from zero in a two-tailed test at:
both α = .01 and α = .05.

In a sample of n = 36, the Student's t test statistic for a correlation of r = -.450
would be:
-2.938

In a sample of n = 36, the critical value of the correlation coefficient for a
two-tailed test at α = .05 is:
±.329

In a sample of n = 36, the critical value of Student's t for a two-tailed test of
significance of the slope for a simple regression at α = .05 is:
2.032

A local trucking company fitted a regression to relate the travel time (days) of its
shipments as a function of the distance traveled (miles). The fitted regression is
Time = -7.126 + 0.0214 Distance. If Distance increases by 50 miles, the
expected Time would increase by:
1.07 days

A local trucking company fitted a regression to relate the cost of its shipments as
a function of the distance traveled. The Excel fitted regression is shown.
$143

If SSR is 2592 and SSE is 608, then:
The coefficient of determination is .81.

Find the sample correlation coefficient for the following data.

x     y
3     8
7     12
5     13
9     10
11    17
13    23
15    35
17    34

.9124

Use Excel =CORREL(XData, YData) to verify your calculation using the formula for r.
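As an alternative to Excel, the formula r = SSxy / sqrt(SSxx · SSyy) can be coded directly. A minimal sketch using the eight (x, y) pairs above; the function name `pearson_r` is illustrative. The result agrees with the quoted .9124 to about three decimal places:

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient from sums of squared deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


x = [3, 7, 5, 9, 11, 13, 15, 17]
y = [8, 12, 13, 10, 17, 23, 35, 34]
print(f"r = {pearson_r(x, y):.4f}")
```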

Find the slope of the simple regression ŷ = b0 + b1x.

x     y
3     8
7     12
5     13
9     10
11    17
13    23
19    39
21    38

1.833
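The least squares slope is b1 = SSxy / SSxx, and the intercept follows from the fact that the line passes through the point of means: b0 = ȳ - b1·x̄. A minimal sketch using the eight (x, y) pairs above; the function name `ols_fit` is illustrative:

```python
def ols_fit(xs, ys):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ss_xy / ss_xx            # slope
    b0 = y_bar - b1 * x_bar       # intercept from the point of means
    return b0, b1


x = [3, 7, 5, 9, 11, 13, 19, 21]
y = [8, 12, 13, 10, 17, 23, 39, 38]
b0, b1 = ols_fit(x, y)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```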

Find the sample correlation coefficient for the following data.

x     y
3     9
5     13
9     10
13    23
15    35

.8736

Find the slope of the simple regression ŷ = b0 + b1x.

x     y
3     9
5     13
9     10
13    23
15    35

1.884
A researcher's results are shown below using n = 25 observations.

variable     coefficient    standard error
intercept    343.619889     61.0823514
slope        -2.2833659     0.99855319

[-4.349, -0.217].

A researcher's regression results are shown below using n = 8 observations.

variable     coefficient    standard error
intercept    -0.1667        2.8912
slope        1.8333         0.2307

[1.268, 2.398].
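The question stems for the last two items are missing, but both quoted intervals are consistent with a 95 percent confidence interval for the slope, b1 ± t·se(b1), with n - 2 degrees of freedom. A minimal sketch under that assumption; the critical values 2.069 (df = 23) and 2.447 (df = 6) are taken from a Student's t table, and the function name `slope_confidence_interval` is illustrative:

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Confidence interval for the slope: b1 +/- t_crit * se(b1)."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin


# n = 25, so df = 23 and the table value t.025 = 2.069 (assumed 95% level)
ci_25 = slope_confidence_interval(-2.2833659, 0.99855319, 2.069)
# n = 8, so df = 6 and the table value t.025 = 2.447 (assumed 95% level)
ci_8 = slope_confidence_interval(1.8333, 0.2307, 2.447)

print(f"[{ci_25[0]:.3f}, {ci_25[1]:.3f}]")
print(f"[{ci_8[0]:.3f}, {ci_8[1]:.3f}]")
```

Neither interval contains zero, so under this reading both slopes differ significantly from zero at α = .05.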
