Professional Documents
Culture Documents
06 Regression and Goodness of Fit
06 Regression and Goodness of Fit
DSC
Independent variable – can be controlled or
manipulated
Dependent variable - cannot be controlled or
manipulated; it depends on the independent
variable
Example:
Number of hours of study vs. exam grade
DSC
Scatter plot is a graph of the ordered pairs (x,
y) of numbers consisting of the independent
variable, x, and dependent variable, y.
It is used to determine the nature of
relationship between variables.
No relationship
DSC
It is computed from a sample data which
measures the strength and direction of a
linear relationship between two variables.
The symbol for the sample correlation
coefficient is r.
The symbol for the population correlation
coefficient is ρ (rho).
DSC
DSC
It is computed from a sample data which
measures the strength and direction of a
linear relationship between two variables.
The symbol for the sample correlation
coefficient is r.
The symbol for the population correlation
coefficient is ρ (rho).
Round off to three decimal places
DSC
Compute for the correlation coefficient of the
following data.
DSC
Step 1: Make a table.
DSC
Step 2: Complete the table as shown.
DSC
Step 3: Compute for the value of r.
DSC
Compute for the correlation coefficient of the
following data.
DSC
When correlation coefficient is significant,
determine the equation of the regression line,
which is the data’s line of best fit.
Its purpose is to enable the researcher to see
the trend and make predictions on the basis
of the data.
Determining the regression line when r is not
significant and then making predictions using
the regression line are meaningless.
DSC
Best fit means that the sum of the squares of
the vertical distances from each point to the
line is at a minimum.
DSC
Using the standard form of an equation of a
line,
DSC
Going back to our previous example,
DSC
Compute for the values of a and b.
DSC
Compute for the values of a and b in previous
seatwork.
DSC
When an outlier affects the equation of the
regression line, it is called an influential point
or influential observation.
To check, graph the regression line without
the outlier. When it significantly affects the
equation, the outlier is an influential point.
It is either you exclude the outlier from your
data set or make more data points near the
outlier to compensate its position.
DSC
DSC
Chi-Square Goodness of Fit Test
When you are testing to see whether a
frequency distribution fits a specific pattern,
you can use the chi-square goodness-of-fit
test.
Example:
• A market analyst wished to see if
consumers prefer a specific flavor in sodas.
• A traffic engineer wants to see if accidents
are more often in some days than others.
DSC
1. State the null and alternative hypothesis to
identify the claim.
Ho: The random variable x follows a
“certain” distribution.
H1: The random variable x does not follow
a “certain” distribution.
DSC
3. Compute for the test value:
DSC
The Russel Reynold Association surveyed retired senior
executives who had returned to work. They found that
after returning to work, 38% were employed by another
organization, 32% were self-employed, 23% were either
freelancing or consulting, and 7% had formed their own
companies. To see if these percentages are consistent
with those of Allegheny County residents, a local
researcher surveyed 300 retired executives who had
returned to work and found that 122 were working for
another company, 85 were self-employed, 76 were
either freelancing or consulting, and 17 had formed
their own companies. At α=0.10, test the claim that
the percentages are the same for those people in
Allegheny County.
DSC
A researcher read that firearm-related deaths
for people aged 1 to 18 were distributed as
follows: 74% were accidental, 16% were
homicides, and 10% were suicides. In her
district, there were 68 accidental deaths, 27
homicides, and 5 suicides during the past year.
At α=0.10, test the claim that the percentages
are equal.
DSC
When data can be tabulated in table form in
terms of frequencies, several types of hypotheses
can be tested by using the chi-square test.
Test for independence used to determine
whether two variables are independent of or
related to each other when a single sample is
selected.
Homogeneity of proportions test is used to
determine whether the proportions for a variable
are equal when several samples are selected from
different populations.
DSC
1. State the null and alternative hypotheses to
identify the claim.
DSC
3. Compute for the test value. To compute the test
value, first find the expected values. For each cell
of the contingency table, use the formula below to
get the expected value.
DSC
Suppose a new post-operative procedure is
administered to a number of patients in a large
hospital. The researcher can ask the question,
“Do the doctors feel differently about this
procedure from the nurses, or do they feel
basically the same way?” To answer this question,
a researcher selects a sample of nurses and
doctors. As the survey indicates, 100 nurses
prefer the new procedure, 80 prefer the old
procedure, and 20 have no preference; 50
doctors prefer the new procedure, 120 like the
old procedure, and 30 have no preference.
DSC
Ho: The opinion about the procedure is
independent of the profession.
DSC
DSC
DSC
DSC
Reject Ho. Therefore, preference is dependent
with profession.
DSC
A researcher selected 100 passengers from each of
3 airlines and asked them if the airline had lost
their luggage on their last flight. The data are
shown in the table. At α=0.05, test the claim that
the proportion of passengers from each airline who
lost luggage on the flight is the same for each
airline.
DSC
Ho: p1=p2=p3
Α=0.05
DSC
DSC
Accept Ho. Therefore, there is not enough evidence
to reject the claim that the proportions are equal.
Hence it seems that there is no difference in the
proportions of the luggage lost by each airline.
DSC