
DSC

• Correlation is a statistical method used to determine whether a relationship between variables exists.
• Regression is a statistical method used to describe the nature of the relationship between variables, that is, positive or negative, linear or nonlinear.

“Are two or more variables related?”
“If so, what is the strength of the relationship?”
“What type of relationship exists?”
“What kind of predictions can be made from the relationship?”

• Independent variable: can be controlled or manipulated
• Dependent variable: cannot be controlled or manipulated; it depends on the independent variable

Example:
Number of hours of study (independent variable) vs. exam grade (dependent variable)

• A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable, x, and the dependent variable, y.
• It is used to determine the nature of the relationship between the variables.

[Figure: three scatter plots showing a positive relationship, a negative relationship, and no relationship]

• The correlation coefficient is computed from sample data and measures the strength and direction of a linear relationship between two variables.
• The symbol for the sample correlation coefficient is r.
• The symbol for the population correlation coefficient is ρ (rho).

• Round the value of r to three decimal places.
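The sample correlation coefficient can be written in a computational form built from column totals. The form below is an assumption chosen because it matches a working table with columns x, y, xy, x², and y²; other algebraically equivalent forms exist. Here n is the number of data pairs.

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```

The value of r always lies between -1 and +1; values near either end indicate a strong linear relationship, and values near 0 indicate a weak one or none.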

• Compute the correlation coefficient of the following data.

• Step 1: Make a table.
• Step 2: Complete the table as shown.
• Step 3: Compute the value of r.

• The correlation coefficient suggests a strong relationship between the number of cars and a company’s annual income.
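The three steps can be sketched in Python. This is a minimal sketch, assuming the usual computational formula for r built from the column totals Σx, Σy, Σxy, Σx², and Σy². The function name `pearson_r` and the data are hypothetical; the numbers are NOT the cars and income table from the example.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient computed from the working-table totals."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    # Steps 1 and 2: the column totals of the table (x, y, xy, x^2, y^2).
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    # Step 3: substitute the totals into the formula for r.
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return numerator / denominator


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = round(pearson_r(x, y), 3)  # rounded to three decimal places
print(r)  # 0.775
```

For these made-up values the totals are Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, and Σy² = 86, so r = 30 / √1500 ≈ 0.775.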

• Compute the correlation coefficient of the following data.

• When the correlation coefficient is significant, the next step is to determine the equation of the regression line, which is the data’s line of best fit.
• Its purpose is to enable the researcher to see the trend and make predictions on the basis of the data.
• When r is not significant, determining the regression line and making predictions from it is meaningless.

• Best fit means that the sum of the squares of the vertical distances from each point to the line is at a minimum.
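Written as a formula, the best-fit criterion looks like the following. The symbols a (intercept) and b (slope) are assumed names for the two numbers that define the line; n is the number of data points.

```latex
\min_{a,\,b}\ \sum_{i=1}^{n} \bigl(y_i - (a + b\,x_i)\bigr)^{2}
```

Each term is the squared vertical distance between an observed point and the line at the same x.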

• Using the standard form of an equation of a line, the regression line is written in terms of an intercept a and a slope b.
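One common way to write the regression line and its coefficients is shown below. This particular form is an assumption; it uses the same column totals as the correlation table (Σx, Σy, Σxy, Σx²), with y′ denoting the predicted value and n the number of data pairs.

```latex
y' = a + bx

a = \frac{\left(\sum y\right)\left(\sum x^{2}\right) - \left(\sum x\right)\left(\sum xy\right)}
         {n\sum x^{2} - \left(\sum x\right)^{2}}
\qquad
b = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {n\sum x^{2} - \left(\sum x\right)^{2}}
```

Both coefficients share the same denominator, so it only needs to be computed once.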

• Going back to our previous example, compute the values of a and b.
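A short Python sketch of this computation follows. It assumes the line is written as y′ = a + bx, with a and b obtained from the totals Σx, Σy, Σxy, and Σx². The function name `regression_coefficients` and the data are hypothetical, not the values from the previous example.

```python
def regression_coefficients(xs, ys):
    """Return (a, b) for the least-squares line y' = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Shared denominator of both coefficients.
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("the line is undefined when all x values are equal")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator  # intercept
    b = (n * sum_xy - sum_x * sum_y) / denominator       # slope
    return a, b


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = regression_coefficients(x, y)
print(f"y' = {a:.3f} + {b:.3f}x")  # y' = 2.200 + 0.600x

# Prediction for an x value inside the range of the data.
predicted = a + b * 4
```

Predictions are safest for x values within the range of the observed data.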

• Compute the values of a and b for the previous seatwork.
• Identify the equation of its regression line.

• When an outlier affects the equation of the regression line, it is called an influential point or influential observation.
• To check, graph the regression line with and without the outlier. If removing the outlier significantly changes the equation, the outlier is an influential point.
• Either exclude the outlier from the data set or collect more data points near the outlier to compensate for its position.
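The check can also be done numerically by fitting the line twice and comparing the coefficients. The sketch below assumes the line y′ = a + bx; the helper `fit_line`, the data, and the suspected outlier (10, 1) are all hypothetical.

```python
def fit_line(points):
    """Return (a, b) for the least-squares line y' = a + b*x."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


# Hypothetical data; the last point is the suspected outlier.
points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5), (10, 1)]
suspect = (10, 1)

a_all, b_all = fit_line(points)
a_trim, b_trim = fit_line([p for p in points if p != suspect])

print(f"with outlier:    y' = {a_all:.3f} + ({b_all:.3f})x")
# with outlier:    y' = 4.443 + (-0.226)x
print(f"without outlier: y' = {a_trim:.3f} + ({b_trim:.3f})x")
# without outlier: y' = 2.200 + (0.600)x
```

In this made-up data set the slope changes sign when the point is removed, so the point would count as influential.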

Chi-Square Goodness of Fit Test
• When you are testing to see whether a frequency distribution fits a specific pattern, you can use the chi-square goodness-of-fit test.
• Examples:
  • A market analyst wishes to see whether consumers prefer a specific soda flavor.
  • A traffic engineer wants to see whether accidents occur more often on some days than on others.

1. State the null and alternative hypotheses and identify the claim.
   H0: The random variable x follows a “certain” distribution.
   H1: The random variable x does not follow a “certain” distribution.

2. Find the critical region. The test is always right-tailed.

3. Compute the test value:

   $\chi^2 = \sum \frac{(O - E)^2}{E}$

   where: O = observed frequency (from the sample)
          E = expected frequency
          d.f. (ν) = number of categories minus 1

Assumptions:
• The data are obtained from a random sample.
• The expected frequency for each category must be 5 or more.

4. Make a decision and summarize the results.

The Russel Reynold Association surveyed retired senior executives who had returned to work. They found that after returning to work, 38% were employed by another organization, 32% were self-employed, 23% were either freelancing or consulting, and 7% had formed their own companies. To see if these percentages are consistent with those of Allegheny County residents, a local researcher surveyed 300 retired executives who had returned to work and found that 122 were working for another company, 85 were self-employed, 76 were either freelancing or consulting, and 17 had formed their own companies. At α = 0.10, test the claim that the percentages are the same for those people in Allegheny County.
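The test value for this problem can be checked with a short Python sketch. The function name `chi_square_gof` is hypothetical. The critical value 6.251 is read from a chi-square table for α = 0.10 with 3 degrees of freedom (4 categories minus 1).

```python
def chi_square_gof(observed, expected_proportions):
    """Chi-square goodness-of-fit test value and expected frequencies."""
    n = sum(observed)
    expected = [n * p for p in expected_proportions]
    # Assumption of the test: every expected frequency is 5 or more.
    if any(e < 5 for e in expected):
        raise ValueError("every expected frequency should be 5 or more")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Observed counts from the 300 surveyed executives.
observed = [122, 85, 76, 17]
# Percentages reported by the original survey.
proportions = [0.38, 0.32, 0.23, 0.07]

chi2, expected = chi_square_gof(observed, proportions)
df = len(observed) - 1
critical_value = 6.251  # chi-square table: alpha = 0.10, d.f. = 3
reject_h0 = chi2 > critical_value  # right-tailed test

print(round(chi2, 3), df, reject_h0)  # 3.294 3 False
```

The expected frequencies are 114, 96, 69, and 21. Since 3.294 is less than 6.251, the sketch does not reject H0: there is not enough evidence to reject the claim that the percentages are the same in Allegheny County.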

A researcher read that firearm-related deaths for people aged 1 to 18 were distributed as follows: 74% were accidental, 16% were homicides, and 10% were suicides. In her district, there were 68 accidental deaths, 27 homicides, and 5 suicides during the past year. At α = 0.10, test the claim that the percentages in her district are the same as those stated.
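A worked sketch of the arithmetic, assuming the goodness-of-fit test value χ² = Σ (O − E)² / E: with n = 68 + 27 + 5 = 100 deaths, the expected frequencies are 74, 16, and 10. The critical value 4.605 is read from a chi-square table for α = 0.10 with 2 degrees of freedom.

```latex
\chi^{2} = \frac{(68 - 74)^{2}}{74} + \frac{(27 - 16)^{2}}{16} + \frac{(5 - 10)^{2}}{10}
         \approx 0.486 + 7.563 + 2.500 = 10.549

\text{d.f.} = 3 - 1 = 2, \qquad \chi^{2}_{0.10,\,2} = 4.605
```

Since 10.549 > 4.605, the test value falls in the critical region, so H0 is rejected: there is enough evidence to reject the claim that the district's percentages match the stated distribution.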

• When data can be tabulated as frequencies, several types of hypotheses can be tested with the chi-square test.
• The test for independence is used to determine whether two variables are independent of or related to each other when a single sample is selected.
• The homogeneity of proportions test is used to determine whether the proportions for a variable are equal when several samples are selected from different populations.

1. State the null and alternative hypotheses and identify the claim.

2. Find the critical region in the right tail (using the chi-square table).

   degrees of freedom = (number of rows - 1) × (number of columns - 1)

3. Compute the test value. First find the expected values. For each cell of the contingency table, the expected value is

   $E = \frac{(\text{row sum})(\text{column sum})}{\text{grand total}}$

   The test value is then

   $\chi^2 = \sum \frac{(O - E)^2}{E}$

   where O is the observed frequency in each cell.

4. Make the decision and summarize the results.

• Suppose a new post-operative procedure is administered to a number of patients in a large hospital. The researcher can ask the question, “Do the doctors feel differently about this procedure from the nurses, or do they feel basically the same way?” To answer this question, a researcher selects a sample of nurses and doctors. As the survey indicates, 100 nurses prefer the new procedure, 80 prefer the old procedure, and 20 have no preference; 50 doctors prefer the new procedure, 120 like the old procedure, and 30 have no preference.

• H0: The opinion about the procedure is independent of the profession.
• H1: The opinion about the procedure is dependent on the profession.
• No level of significance is given; assume α = 0.05.
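The test value for this survey can be checked with a short Python sketch. It assumes each expected value is (row sum)(column sum) / (grand total) and the test value is Σ (O − E)² / E. The function name `chi_square_contingency` is hypothetical. The critical value 5.991 is read from a chi-square table for α = 0.05 with 2 degrees of freedom.

```python
def chi_square_contingency(table):
    """Chi-square test value, degrees of freedom, and expected values."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected value of each cell: (row sum)(column sum) / (grand total).
    expected = [
        [r * c / grand_total for c in col_totals] for r in row_totals
    ]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Rows: nurses, doctors. Columns: prefer new, prefer old, no preference.
table = [
    [100, 80, 20],
    [50, 120, 30],
]

chi2, df, expected = chi_square_contingency(table)
critical_value = 5.991  # chi-square table: alpha = 0.05, d.f. = 2
reject_h0 = chi2 > critical_value  # right-tailed test

print(round(chi2, 3), df, reject_h0)  # 26.667 2 True
```

Both groups have 200 respondents, so the expected values are 75, 100, and 25 in each row. The test value 26.667 is well above 5.991.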

• Reject H0. There is enough evidence to conclude that the opinion about the procedure is dependent on the profession.

• A researcher selected 100 passengers from each of 3 airlines and asked them if the airline had lost their luggage on their last flight. The data are shown in the table. At α = 0.05, test the claim that the proportion of passengers from each airline who lost luggage on the flight is the same for each airline.

• H0: p1 = p2 = p3
• H1: At least one proportion differs from the others.
• α = 0.05
• Degrees of freedom = (2-1)(3-1) = 2

• Do not reject H0. There is not enough evidence to reject the claim that the proportions are equal. Hence, there appears to be no difference in the proportions of luggage lost by each airline.
