
CHEE2010 Tutorial 5: Analysis of Variance (ANOVA)

Submission Instructions

1. Ensure that all questions are answered in the boxes provided in this document
2. Ensure that all graphs and code are copied into this document where requested
3. Save this document as a PDF file with the name CHEE2010_tut5_<your surname>.pdf
4. Upload your PDF document to Turnitin by 3pm Wednesday 9th September 2020

Grading Criteria

Tutorials will be graded according to the following criteria:

0 marks for a very limited attempt with significant errors
0.5 marks for approx. half of questions attempted with minor errors, or all questions attempted with some errors
1 mark for all questions attempted with only minor errors

In the weeks following each tutorial, we will provide solutions for each question and additional feedback based on
common difficulties encountered. Note that while each tutorial can contribute up to 1% of your final grade, only
your top 7 tutorials will be counted, giving a maximum tutorial contribution of 7%.

Tutorial Introduction

This tutorial explores analysis of variance (ANOVA). At the end of this tutorial, you should be able to:

(a) Conduct a one-way ANOVA in Excel
(b) Conduct a multi-way ANOVA in MATLAB
(c) Evaluate if a blocking strategy is effective in a specific experimental design

You may need to download the Analysis ToolPak and Solver Add-in in Excel. To do this, open Excel and navigate to
File → Options → Add-Ins → Go… → Analysis ToolPak / Solver Add-in → OK.

Section 1: One-way ANOVA

Looking at the effluent solids data from last week, it is important to know if performance varies between months.
To determine this, a one-way ANOVA can be performed on the data grouped by month (first 3 months only). A one-
way ANOVA test is used to examine the variability between groups within a data set. It is called “one-way” as there
is only one independent variable being analysed.

The data is contained within the first worksheet of Data5.xlsx. Copy and paste the data for each month into separate
columns, then navigate to Data → Data Analysis → ANOVA: Single Factor. Select the three separate columns as the
input range. Note that Excel’s ANOVA tool can handle unbalanced analysis, so it’s okay if the amount of data in each
month is different. Group the data by columns, indicate whether labels are included in the input range, and keep
the level of significance at α = 0.05.

Copy and paste the ANOVA output summary table below.

This is for single factor ANOVA.

Table 1 shows the summary of output

Before the results of an ANOVA can be interpreted, both the null hypothesis and alternative hypothesis must be
defined. Define the null and alternative hypotheses for a one-way ANOVA on the effluent data grouped by month.

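H0: 𝜇1 = 𝜇2 = 𝜇3 (the mean effluent solids are the same in all three months)
H1: at least one monthly mean differs from the others (not all 𝜇i are equal)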

What is your interpretation of this summary table? What is the significance of the P, F and Fcrit values?

The F statistic is the ratio of the variation between the group means to the variation within the groups.
If the null hypothesis is true, F is expected to be close to 1; the larger the variation among the group
means, the larger F becomes. The null hypothesis of equal group means is rejected when Fstat > Fcrit, or
equivalently when P < α = 0.05. P is the probability of obtaining an F value at least as large as the one
observed if the null hypothesis were true. Here P is greater than 5.0%, so Fstat does not exceed Fcrit and
the null hypothesis is retained: there is no evidence of a difference between the monthly mean effluent solids.
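
For reference, the quantities in the summary table are related as sketched below. These are the standard one-way ANOVA definitions; the symbols (k groups, n_i observations in group i, N observations in total) are notation chosen here and do not appear in the Excel output.

\[
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\sum_{i=1}^{k} n_i \,(\bar{y}_i - \bar{y})^2 \,/\, (k - 1)}
         {\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 \,/\, (N - k)},
\qquad
F_{\text{crit}} = F_{1-\alpha;\; k-1,\; N-k}
\]

With three months, k = 3, so the numerator has 2 degrees of freedom. Comparing F with Fcrit and comparing P with α always lead to the same decision.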

Calculate the standard deviation and 95% confidence interval values for each month. Plot the mean values for each
month on a column chart and add error bars based on the 95% confidence intervals.
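
The tutorial does not state which interval formula is expected. A common choice, assumed here, is a t-based interval computed separately for each month i from its mean, sample standard deviation s_i and sample size n_i:

\[
\bar{x}_i \;\pm\; t_{0.975,\; n_i - 1} \, \frac{s_i}{\sqrt{n_i}}
\]

The half-width (the second term) is the value to use for the error bars. In Excel it should be obtainable with CONFIDENCE.T(0.05, s_i, n_i), assuming a version of Excel that provides this function.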

Copy and paste the resulting plot below. Ensure you have labelled both axes and given the plot a title.

The standard deviation and 95% confidence interval (CI) are given in Table 1. The plot of the mean values for each
month is given in Figure 1.

Figure 1 shows the plot of mean values for each month

Section 2: Multi-way ANOVA

The Fenton reaction is used to mineralize organic contaminants within water and is employed as an alternative
green technology to remediate the organic load in the wastewater of the company that you work for. In the Fenton
reaction OH radicals are produced via the reaction of Fe(II) and H2O2 at low pH.

Organic contaminants (X) are oxidised to carbon dioxide via the action of these OH radicals.

Fe2+ + H2O2 → Fe3+ + OH· + OH−

OH· + X → CO2 + H2O2

The percentage of mineralization (i.e. the outcome) can be measured via the chemical oxygen demand (mg/L). You
are required to minimize the consumption of Fenton reagents (Fe(II) and H2O2) and optimize the pH. To achieve
this, you have selected a multivariate experimental design employing three factors at two levels. The factors and
levels are described in the table below.

Factor  Name   Units  Low Level (-)  High Level (+)
A       pH     -      2              4
B       H2O2   mg/L   100            300
C       Fe2+   mg/L   10             20

The data is contained in the second worksheet of Data7.xlsx and displayed in the table below.

A B C Y
2 100 10 16
4 100 10 18
2 300 10 60
4 300 10 72
2 100 20 20
4 100 20 23
2 300 20 80
4 300 20 90
In Excel, copy and paste the data for each factor into new columns and replace all low- and high-level values with
-1 and +1, respectively. Calculate the effect of factors B and C (and their interaction) using the following formula:

\[
\text{Effect} = \frac{\sum Y_{+1}}{n_{+1}} - \frac{\sum Y_{-1}}{n_{-1}}
\]

List the values obtained below. Is the interaction significant compared to the individual factors?

Effect of B = 56.25
Effect of C = 11.75
Effect of BC = 7.25
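
The effect formula above can also be checked outside Excel. The following is a minimal MATLAB sketch using the coded data from the table; the helper name `effect` is introduced here for illustration only and is not part of the tutorial.

```matlab
% Coded design (-1 = low, +1 = high) and response, taken from the data table
a = [-1  1 -1  1 -1  1 -1  1]';   % factor A: pH
b = [-1 -1  1  1 -1 -1  1  1]';   % factor B: H2O2
c = [-1 -1 -1 -1  1  1  1  1]';   % factor C: Fe2+
Y = [16 18 60 72 20 23 80 90]';   % measured response

% Effect = mean of Y at the +1 level minus mean of Y at the -1 level
effect = @(x) mean(Y(x == 1)) - mean(Y(x == -1));

effectB  = effect(b);        % 56.25
effectC  = effect(c);        % 11.75
effectBC = effect(b .* c);   % 7.25 (interaction column is the product b.*c)

fprintf('B = %.2f, C = %.2f, BC = %.2f\n', effectB, effectC, effectBC);
```

On these numbers the BC interaction (7.25) is much smaller than the effect of B (56.25) but of similar size to the effect of C (11.75).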

Export the original data to text (tab delimited) format. Create a script file in MATLAB (New → Script), name it
anova.m, and begin populating it by calling the data and creating individual vectors for each factor from it. Use the
anovan command to conduct a multi-way ANOVA on the data. This will assess the numerical impact of the
different factors. Note: Conducting a multi-way ANOVA in Excel is possible, but complicated.

Copy and paste the ANOVA output summary table below.

Define the null and alternative hypotheses below.

H0: Factors A, B and C have no effect on the response (the mean response is the same at the low and high level of each factor)
H1: At least one of the factors A, B or C has an effect on the response

Based on this analysis, which factors are important? (Hint: Look at the p-values)

Factor A does not have a significant effect, as P > 0.05, so H0 is retained for A. Factors B and C have significant
effects, as P < 0.05, so H0 is rejected for B and C.
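
To see where the F and p-values of the main-effects model come from, the sums of squares can be reproduced by hand. The sketch below is not the `anovan` implementation; it uses only base MATLAB and assumes the balanced two-level design from the data table, for which each factor's sum of squares equals N × effect² / 4. The variable names are illustrative.

```matlab
% Coded factors and response from the tutorial table
a = [-1  1 -1  1 -1  1 -1  1]';
b = [-1 -1  1  1 -1 -1  1  1]';
c = [-1 -1 -1 -1  1  1  1  1]';
Y = [16 18 60 72 20 23 80 90]';
N = numel(Y);

% Sum of squares of a two-level factor in a balanced design: N * effect^2 / 4
effect   = @(x) mean(Y(x == 1)) - mean(Y(x == -1));
ssFactor = N * [effect(a), effect(b), effect(c)].^2 / 4;

% Main-effects model: everything not explained by A, B and C is error
ssTotal = sum((Y - mean(Y)).^2);
ssError = ssTotal - sum(ssFactor);
dfError = (N - 1) - 3;                  % 7 total df minus 1 df per factor
F = ssFactor / (ssError / dfError);     % each factor has 1 df

% Upper-tail p-value of F(1, dfError) via the regularised incomplete beta function
p = betainc(dfError ./ (dfError + F), dfError / 2, 1 / 2);

names = 'ABC';
for k = 1:3
    fprintf('%s: F = %7.2f, p = %.4f\n', names(k), F(k), p(k));
end
```

Under these assumptions the p-value is well above 0.05 for A, far below 0.05 for B, and only just below 0.05 for C.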

Now conduct a multi-way ANOVA that additionally considers the impact of interactions between the factors.

Copy and paste the ANOVA output summary table below.

Do the results change? Are there any significant interactions?

The p-values change because fitting the interaction terms reduces the error degrees of freedom. The p-value for
factor A remains above 5%, while those for factors B and C remain below 5%. No significant interaction is observed.
The significant error is observed in pH value and consumption of H2O2.

Create a main effects plot for the data considering all three factors (Hint: help maineffectsplot). Copy and
paste the resulting plot below.

What does this plot reveal? Does it agree with the results of the ANOVA analyses?

Factors A and C have much smaller effects than factor B.

Is a full interaction model (i.e. multi-factor interactions) possible for this data set?

No. MATLAB returns NaN because there are insufficient degrees of freedom: the full model uses all of them, leaving none to estimate the error.
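
The degrees-of-freedom argument can be made explicit with a short count. This is a sketch assuming a single replicate of the 2^3 design (8 runs) in which every term has 1 degree of freedom; the variable names are illustrative.

```matlab
N       = 8;                 % runs in the 2^3 design (one replicate)
dfTotal = N - 1;             % 7

nMain  = 3;                  % A, B, C
nTwo   = nchoosek(3, 2);     % AB, AC, BC
nThree = 1;                  % ABC

dfErrMain        = dfTotal - nMain;                   % main-effects model
dfErrInteraction = dfTotal - nMain - nTwo;            % two-factor interaction model
dfErrFull        = dfTotal - nMain - nTwo - nThree;   % full model

fprintf('Error df: main = %d, interaction = %d, full = %d\n', ...
        dfErrMain, dfErrInteraction, dfErrFull);
```

With zero error degrees of freedom there is no mean square error to divide by, so no F or p-value can be formed unless the experiment is replicated.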

Section 3: Blocking

Blocking is an experimental design method used to reduce confounding. Confounding is the “mixing of effects”
wherein the effects of a given factor on a given outcome are mixed in with the effects of an additional unexamined
factor (or set of factors) resulting in a distortion of the true relationship (Skelly et al. 2012).
Incorporating blocking factors into the ANOVA analysis is an effective method of separating out the potential effects
of individual factors. Typically, a blocking factor is a source of variability that is not of primary interest to the
experimenter.

The third worksheet in Data.xlsx contains data obtained from three suppliers over three days. Using one-way
ANOVA in MATLAB, analyse if there is a difference between the three suppliers, not considering the different days.

Define the null and alternative hypotheses below.

H0: The mean response is the same for all three suppliers
H1: The mean response of at least one supplier differs from the others

Copy and paste the ANOVA output summary table below.

What is your interpretation of the results?

The ANOVA gives a p-value greater than 5%, so H0 is retained. Hence, the supplier does not have a
significant effect in this model.

Three days were needed to perform the experiments and two experiments were run each day. However, the plant
operator changed every day. To exclude any distortion due to different operators collecting the data you decide to
consider it as a separate blocking factor in your ANOVA test. Analyse the blocked experiment via a two-way ANOVA
in MATLAB.

Copy and paste the ANOVA output summary table below.

Can you conclude that blocking was effective? Why/why not?

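The one-way ANOVA did not identify a significant supplier effect. Adding the blocking factor removes its variation
from the error term, so the supplier effect is tested against a smaller residual; on that basis the blocking was effective.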
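
One way to judge whether blocking helped is to compare the error sum of squares with and without the blocking factor. The sketch below does this by hand in base MATLAB on the Section 3 data as entered in the script. It follows the script in treating column 2 (`b`) as the factor of the one-way test and column 1 (`a`) as the additional factor; the source does not state which column is the supplier and which is the day, so that mapping is an assumption. The helper `ssBetween` is illustrative and is not a toolbox function.

```matlab
% Data from the Section 3 script: columns are [a, b, Y]
A = [1 1 36; 1 1 36; 2 1 41; 2 1 42; 3 1 44; 3 1 46;
     1 2 34; 1 2 35; 2 2 41; 2 2 42; 3 2 43; 3 2 44;
     1 3 33; 1 3 33; 2 3 41; 2 3 42; 3 3 44; 3 3 43];
a = A(:,1);  b = A(:,2);  Y = A(:,3);
N = numel(Y);

% Between-group sum of squares for a grouping vector g
ssBetween = @(g) sum(accumarray(g, 1) .* ...
                     (accumarray(g, Y) ./ accumarray(g, 1) - mean(Y)).^2);

ssTotal = sum((Y - mean(Y)).^2);
ssA     = ssBetween(a);
ssB     = ssBetween(b);

% One-way ANOVA on b: variation due to a stays in the error term
sseOne = ssTotal - ssB;         dfOne = N - 3;
Fone   = (ssB / 2) / (sseOne / dfOne);

% Two-way main-effects ANOVA on a and b: variation due to a is removed
sseTwo = ssTotal - ssA - ssB;   dfTwo = N - 1 - 2 - 2;
Ftwo   = (ssB / 2) / (sseTwo / dfTwo);

% Upper-tail p-values of F(2, df) via the regularised incomplete beta function
pOne = betainc(dfOne / (dfOne + 2 * Fone), dfOne / 2, 1);
pTwo = betainc(dfTwo / (dfTwo + 2 * Ftwo), dfTwo / 2, 1);

fprintf('One-way: SSE = %g, F = %.2f, p = %.3f\n', sseOne, Fone, pOne);
fprintf('Two-way: SSE = %g, F = %.2f, p = %.3f\n', sseTwo, Ftwo, pTwo);
```

Under these assumptions the error sum of squares falls from 301 to 10 once `a` is included, and the p-value for `b` drops from well above 0.05 to below it.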

Ensure all MATLAB coding from Sections 2 and 3 is saved in a script with the name anova.m. Copy and paste the
contents of this script file below.

clear
close all
clc

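%% Section 2: multi-way ANOVA
% Columns of A: run number, factor a, factor b, factor c (coded -1/+1), response Y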

A = [1 -1 -1 -1 16;
2 1 -1 -1 18;
3 -1 1 -1 60;
4 1 1 -1 72;
5 -1 -1 1 20;
6 1 -1 1 23;
7 -1 1 1 80;
8 1 1 1 90];

a = A(:,2);
b = A(:,3);
c = A(:,4);
Y = A(:,5);

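% Main-effects model (default)
anovan(Y,{a,b,c});
% Main effects plus all two-factor interactions
anovan(Y,{a,b,c},'model','interaction');
% Full model including the three-factor interaction
anovan(Y,{a,b,c},'model','full');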

maineffectsplot(Y,{a,b,c},'varnames',{'Factor a','Factor b','Factor c'})

%% Section 3: blocking

clear
clc

A = [1 1 36;
1 1 36;
2 1 41;
2 1 42;
3 1 44;
3 1 46;
1 2 34;
1 2 35;
2 2 41;
2 2 42;
3 2 43;
3 2 44;
1 3 33;
1 3 33;
2 3 41;
2 3 42;
3 3 44;
3 3 43];

a = A(:,1);
b = A(:,2);
Y = A(:,3);

% One-way ANOVA on factor b alone
anovan(Y,{b});
% Two-way ANOVA on factors a and b (main effects only)
anovan(Y,{a,b});
