
DOE-Exercise PILOT PLANT (Frac Fac 2^(4-1))

Organic synthesis of semi-carbazone from glyoxylic acid in a pilot plant

Background
The organic synthesis of semi-carbazone from glyoxylic acid is a key step in the synthesis of azuracil (a
cytostatic anti-cancer drug). The objective of this study was to investigate the best operating
conditions for a pilot plant synthesizing semi-carbazone. A fractional factorial design in four factors
was constructed and three responses were measured (we use two here). The aims of this experimental
protocol were to obtain a high yield of semi-carbazone and high purity. Two center points have been
added to the original design.

Objective
This exercise demonstrates what is possible with a fractional factorial design (resolution IV) and shows
findings that may need a follow-up. In this exercise you will:
• Investigate how to detect and solve problems with significant but confounded interactions
and square effects using the Analysis Wizard and its tools.
• Interpret and communicate a possible result.
• Understand the difference between the presentation tools Contour, Sweet Spot and Design
Space plots.

Data

Copyright Sartorius Stedim Data Analytics AB, 18-03-09 Page 1 (8)


Tasks
Task 1
Set up the investigation in MODDE and choose a fractional factorial design of resolution IV. This means
that the two-factor interactions will be confounded. Produce a list showing which interactions are
confounded with each other. The factor precision (left at default values) will not be used in this
investigation.

Task 2
Use the Analysis wizard to work through the responses.
Use the “Interaction test” and “Square test” functions to see if the model(s) need interaction and/or
square terms.

Task 3
Show graphically the part of the experimental space that should be chosen for a series of verifying
experiments in the pilot plant (specify levels for the variables). Goal: High Yield and High Purity.
Consider Addition Time as a factor that contributes to higher cost in the production.
Hint: Use and compare the Contour, Sweet Spot and Design Space plots on the Home tab.

Task 4
Which method is commonly used to separate confounding effects between two-factor interactions?

Solutions to Pilot Plant
Task 1
On the Design tab, click Confoundings to show the list of interactions that are confounded.
Below, we can see the confounding pattern. The problem is that, when we get a significant coefficient,
we cannot be sure which of the confounded interaction terms is the important one. (Note: a model
including all confounded terms cannot be fitted with MLR, since the confounded terms are perfectly
(1:1) correlated in the current design.)
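
The confounding pattern can also be reproduced outside MODDE. The following is a minimal sketch, not MODDE's own procedure: it assumes the factor order AddT, Temp, Stirr, Water and the generator Water = AddT*Temp*Stirr (neither is stated in the exercise), builds the eight two-level runs in coded units and groups the two-factor interactions whose columns are identical.

```python
from itertools import combinations, product

# Assumed factor order and generator; the exercise does not state them.
factors = ["AddT", "Temp", "Stirr", "Water"]

# 2^(4-1) design: full factorial in the first three factors,
# fourth factor set to the product of the other three (coded -1/+1).
runs = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]


def interaction_column(i, j):
    """Column of the two-factor interaction between factors i and j."""
    return tuple(run[i] * run[j] for run in runs)


# Interactions with identical columns cannot be told apart by the design.
groups = {}
for i, j in combinations(range(len(factors)), 2):
    name = f"{factors[i]}*{factors[j]}"
    groups.setdefault(interaction_column(i, j), []).append(name)

confounded = sorted(groups.values())
for group in confounded:
    print(" = ".join(group))
```

Under these assumptions the interaction AddT*Temp shares its column with Stirr*Water, which is why a model containing both terms has no unique least-squares solution.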

Task 2
Response 1 (Yield)

The summary plot indicates that this is a poor model. Why?
Note that Model validity seems OK despite a missing interaction. With only two replicates, the
model validity test is quite unreliable.

In this case we have a linear model to start with and it is often true that interaction terms have to be
added to produce better models.
Use the Interaction test in the wizard:

The test shows that there is an interaction between the factors “Addition Time” and “Temperature” (low
probability that the term is equal to zero). This interaction is confounded with the interaction between
“Stirring” and “Water”. From a statistical perspective it does not matter which interaction we select
(the other will automatically become unavailable in the dialog), but the selection should be motivated
either by process knowledge or by other reasons (for example, large main effects). The choice of term
will have a profound influence on the model interpretation.

Although they have low contribution to the model, the linear terms of Stirring and Water are kept in
the model, but can also be removed. It is a good procedure to document all linear contributions and
later use a fully tuned model for predictions.

Response 2 (Purity)

The histogram indicates two main groups of data. This is typically due to the influence of one
dominant factor (the two groups correspond to the two levels of that factor).
The square test is highlighted, indicating that one or several factors have a non-linear influence on
the response.

The square test gives a list of possible square terms to add to the model.

Here we have chosen to add a square term in temperature to the model. That is an educated guess: in
chemistry, temperature often has a non-linear effect on results. However, from a theoretical point of
view, any of the factors (one or several) could be causing the non-linearity. To sort this out we have to
augment the design with new experiments so that there are three levels for the factors (RSM design).
We have chosen to add one square term. This improves the model. The two linear terms AddT and Stirr
are kept in the model but can also be excluded.
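
Why the square term can only be an educated guess is easy to check numerically. The sketch below is an illustration under assumptions, not MODDE output: it uses the same assumed generator (Water = AddT*Temp*Stirr) plus the two center points, in coded units, and compares the squared columns of the four factors.

```python
from itertools import product

# Assumed layout: 8 two-level runs plus 2 center points, coded -1 / 0 / +1.
factors = ["AddT", "Temp", "Stirr", "Water"]
runs = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]
runs += [(0, 0, 0, 0)] * 2

# Squared column for each factor: 1 at every corner run, 0 at the center.
square_columns = {
    name: tuple(run[k] ** 2 for run in runs) for k, name in enumerate(factors)
}

# If all squared columns coincide, the design can detect curvature
# but cannot attribute it to a particular factor.
all_identical = len(set(square_columns.values())) == 1
print("All square terms confounded:", all_identical)
```

Because every squared column is the same, the fit is identical whichever factor is chosen for the square term; only a design with three distinct levels per factor can separate them.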

Task 3
We can produce contour plots with addition time and temperature factors on the axes. Amount of
water added is set to its center level (because it has a negative effect on Yield and a positive one on
Purity). Stirring is also set to the center level because it has a small effect on both responses.

A sweet spot plot shows where the criteria for the two responses are fulfilled.

The design space plot shows how to set factor levels to achieve safe results (the plot shows the
Probability of failure). The default is set to 1%.
Note: When creating the Design Space plot, the Confidence uncertainty interval was used; this is
reasonable given the small size (10 experiments) of the experimental design.

Comparing the Sweet Spot plot and the Design Space plot, it is obvious that the allowable factor
ranges are smaller in the Design Space plot.
In order to identify a region for verifying experiments we compare the Contour plot and the Design
Space plot. The Contour plot shows the dynamics and average levels of the predictions. The Design
Space plot gives the risk of failing to comply with the specifications. For temperature the range
between 35 and 55 °C seems to be the most interesting, and for addition time the range above 1.9 h.
Note that the Yield model has an unresolved interaction confounding and the Purity model an
unresolved square confounding.
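
The difference between a sweet spot criterion and a probability of failure can be illustrated with a minimal sketch. It is not MODDE's computation: it assumes a single response with a lower specification limit, a normally distributed prediction uncertainty, and purely hypothetical numbers (limit 80, standard error 2) that are not taken from the pilot plant data.

```python
from statistics import NormalDist


def in_sweet_spot(predicted, lower_limit):
    """Sweet spot criterion: only the average prediction is checked."""
    return predicted >= lower_limit


def probability_of_failure(predicted, std_err, lower_limit):
    """Chance that the response falls below the limit, assuming a
    normally distributed prediction uncertainty (an assumption)."""
    return NormalDist(mu=predicted, sigma=std_err).cdf(lower_limit)


# Hypothetical values for illustration only.
lower_limit = 80.0
std_err = 2.0

for predicted in (81.0, 86.0):
    ok = in_sweet_spot(predicted, lower_limit)
    risk = probability_of_failure(predicted, std_err, lower_limit)
    print(f"prediction {predicted}: sweet spot={ok}, P(failure)={risk:.3f}")
```

Both hypothetical points pass the sweet spot check, but only the one well inside the specification has a failure probability below 1%. This is the reason the acceptable region shrinks when moving from the Sweet Spot plot to the Design Space plot.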

Task 4
One common method used to unconfound two-factor interactions is called FOLD-OVER. It is also
possible to use D-Optimal functionality to augment designs in a more targeted way (i.e. add a few
experiments to resolve a specific confounding).
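
How a fold-over separates the interactions can be sketched as follows. The code assumes the generator Water = AddT*Temp*Stirr (not stated in the exercise) and is not a description of how MODDE builds its fold-over. For this particular generator, reversing the signs of all four factors reproduces the same eight runs, so the sketch folds over on a single factor instead, which yields the complementary half fraction.

```python
from itertools import product

# Assumed original 2^(4-1) design, factor order AddT, Temp, Stirr, Water.
original = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

# Full fold-over: reverse the sign of every factor.
full_fold = [tuple(-x for x in run) for run in original]
# Single-factor fold-over: reverse the sign of the first factor only.
single_fold = [(-run[0],) + run[1:] for run in original]

# With a four-factor generator the full fold-over gives no new runs.
same_fraction = set(full_fold) == set(original)

# Original plus single-factor fold-over covers all 16 corner points.
combined = sorted(set(original) | set(single_fold))


def interaction_column(runs, i, j):
    return tuple(run[i] * run[j] for run in runs)


addt_temp = interaction_column(combined, 0, 1)
stirr_water = interaction_column(combined, 2, 3)
# A zero dot product means the two columns are orthogonal (unconfounded).
dot = sum(x * y for x, y in zip(addt_temp, stirr_water))

print("Full fold-over repeats the original runs:", same_fraction)
print("Runs in combined design:", len(combined))
print("AddT*Temp vs Stirr*Water dot product:", dot)
```

In the combined 16-run design the two interaction columns are orthogonal, so both terms can be estimated independently. The price is a doubling of the number of experiments, which is why a targeted D-Optimal augmentation can be attractive when only one confounding needs to be resolved.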

Conclusions
In order to achieve high yield and high purity, a factor combination of addition time 1.9 h, water
137.5 ml/mol and temperature 45 °C looks appropriate. This setpoint should be verified with additional
experiments or, if possible, with model-resolving experiments. The last factor, stirring time, may be set
at a convenient level.
Unfortunately, it is in the most expensive region (high temperature and long addition time) that we
predict a result within specifications with high confidence. If the conclusions had been based on the
Sweet Spot plot, the decision could well have been quite different, motivated by the wish for a fast and
economical process. A factor combination of addition time 1.3 h, water 137.5 ml/mol and temperature
30 °C will probably fail in > 30% of the attempts.
Taking probability analysis into account may give remarkably different conclusions.
Original literature reference: J-C Vallejos, Diss. IPSOI, Marseille 1978.
