JCTB 2391

Research Article
Received: 6 January 2010 Revised: 9 February 2010 Accepted: 22 February 2010 Published online in Wiley Interscience: 7 April 2010
(www.interscience.wiley.com) DOI 10.1002/jctb.2391
Enzymatic hydrolysis of sugarcane bagasse for bioethanol production: determining optimal enzyme loading using neural networks
Elmer Ccopa Rivera,∗ Sarita Cândida Rabelo, Daniella dos Reis Garcia,
Rubens Maciel Filho and Aline Carvalho da Costa∗
Abstract
BACKGROUND: The efficient production of a fermentable hydrolyzate is an immensely important requirement in the utilization
of lignocellulosic biomass as a feedstock in bioethanol production processes. The identification of the optimal enzyme loading
is of particular importance to maximize the amount of glucose produced from lignocellulosic materials while maintaining low
costs. This requirement can only be achieved by incorporating reliable methodologies to properly address the optimization
problem.
RESULTS: In this work, a data-driven technique based on artificial neural networks was integrated with design of
experiments in order to identify the optimal enzyme combination. The enzymatic hydrolysis of sugarcane bagasse was used as a
case study. This technique was used to build a model of the combined effects of cellulase (FPU L−1) and β-glucosidase (CBU L−1)
loads on glucose yield (%) after enzymatic hydrolysis. The optimal glucose yield, above 99%, was achieved with cellulase and
β-glucosidase concentrations in the ranges of 460.0 to 580.0 FPU L−1 (15.3–19.3 FPU g−1 bagasse) and 750.0 to 1140.0 CBU L−1
(25–38 CBU g−1 bagasse), respectively.
CONCLUSIONS: The dynamic model developed can be used not only to predict glucose concentration profiles for
different enzyme loadings, but also to determine the optimum enzyme loading that leads to high glucose yield. It can support
both successful hydrolysis process control and a more effective use of enzymes.

© 2010 Society of Chemical Industry
Keywords: enzymatic hydrolysis; sugarcane bagasse; enzyme loading; optimization; modeling; artificial intelligence
NOTATION
bk       bias of the kth neuron in the output layer
d        desired output vector
ECell    cellulase activity (FPU L−1)
Eβ−glu   β-glucosidase activity (CBU L−1)
f        activation function of the jth neuron in the hidden layer
F        activation function of the kth neuron in the output layer
g        vector of the values predicted by the ANN model
G        glucose concentration (g L−1)
N        agitation (rpm)
t        time (h)
T        temperature (°C)
wji      weight connecting the ith neuron in the input layer and the jth neuron in the hidden layer
Wkj      weight connecting the jth neuron in the hidden layer and the kth neuron in the output layer
x1       coded value of cellulase concentration
x2       coded value of β-glucosidase concentration
Yield    glucose yield (%)

Greek letters
α        momentum coefficient
η        learning rate
θj       bias of the jth neuron in the hidden layer

∗ Correspondence to: Elmer Ccopa Rivera and Aline Carvalho da Costa, Laboratory of Optimization, Design and Advanced Control, School of Chemical Engineering, State University of Campinas, P.O. Box 6066, 13083-970, Campinas, SP, Brazil. E-mail: elmer@feq.unicamp.br, accosta@feq.unicamp.br

Laboratory of Optimization, Design and Advanced Control, School of Chemical Engineering, State University of Campinas, P.O. Box 6066, 13083-970, Campinas, SP, Brazil

J Chem Technol Biotechnol 2010; 85: 983–992 www.soci.org

INTRODUCTION
Ethanol from lignocellulosic materials has been investigated with great attention during the past few years, but its production in large-scale plants has not yet become viable. Studies taking into account process integration, increase of fermentation yields and integration of unit operations are still needed in order to make hydrolysis a competitive technology.1,2 Bagasse, the by-product of bioethanol manufacture from sugarcane fermentation, is a very promising raw material for bioethanol production in the large-scale process perspective. It is already available on the ethanol plant site, since it is produced in the mills where sugar is extracted
from sugarcane, and better technologies of cogeneration are expected to result in an increased surplus of bagasse at the plant site. Bioethanol from sugarcane bagasse may share the infrastructure where conventional bioethanol is produced, such as fermentation and distillation units, which diminishes equipment costs. The product obtained after hydrolysis may be diluted in the sugarcane juice, thus decreasing the impacts of potential fermentation inhibitors, such as furfural and its derivatives formed during cellulose hydrolysis.

Of the two possible hydrolysis methods, acid and enzymatic hydrolysis, acid hydrolysis is relatively inexpensive, but it forms compounds that might seriously inhibit the subsequent fermentation. On the other hand, enzymatic hydrolysis takes place under milder process conditions than dilute acid hydrolysis, thus leading to a decreased formation of by-products. Technical feasibility, however, is not sufficient: economic potential is the driving force. Thus, there is a great interest in the modeling and optimization of all second-generation bioethanol production steps.

The present market offers many cellulase preparations (including those obtained from Trichoderma reesei) containing low levels of β-glucosidase, which leads to an increased accumulation of cellobiose in the enzymatic hydrolyzates of cellulose. The inability of industrial glucose-fermenting yeasts to ferment cellobiose results in incomplete conversion of the hydrolyzate to ethanol, significantly diminishing final yield. These drawbacks may be overcome by supplementation of the cellulase complex with β-glucosidase from other sources. Thus, the task of identifying an optimal enzyme combination is of extreme importance in obtaining a high glucose yield for the overall economy of the process.3–6 However, successful optimization of the enzymatic hydrolysis step can only be achieved by incorporating methodologies for rapid development of reliable mathematical models. These models can be used to facilitate the implementation of suitable operating strategies to achieve high operational performance.

Design of experiments (DOE) has successfully been used to investigate the influence of physical parameters (temperature, pH, substrate concentration, agitation, among others) in enzymatic hydrolysis.7–10 This approach has been widely used in various applications due to its well-established methodology.11 Considering the difficulties involved in biotechnological processes, the main difficulty in model-based techniques for definition of operational strategies and optimization in the enzymatic hydrolysis of sugarcane bagasse is the problem of obtaining an accurate model to aid in the decision making process. Although DOE provides understanding about the process and a reliable measurement of its parameters,12 practical experience has shown that the behavior of enzymatic reactions with varying enzyme loading is extremely complex and hence difficult to handle statistically.13,14

Artificial intelligence techniques, such as artificial neural networks (ANN), have been used successfully for solving complex biotechnological problems related to the field of modeling and optimization in order to achieve high operational performance. This technique is required to efficiently combine all available knowledge and to direct the development towards an improved process operation strategy. ANNs can be used to offer adaptive solutions, since the reestimation of their parameters is a straightforward procedure.15 These characteristics are suitable for analyzing data from more complex processes such as the influence of enzyme loading on the enzymatic hydrolysis of sugarcane bagasse. At this point, it is worthwhile mentioning that in many studies the combination of modeling and optimization techniques enhances the overall prediction accuracy.14,16

The application presented in this study illustrates the efficiency and usefulness of a model-based approach for the identification of the optimum ratio of cellulase and β-glucosidase on the enzymatic hydrolysis of sugarcane bagasse with regard to cellulose conversion to glucose. DOE was used to obtain statistically well distributed data of cellulase and β-glucosidase concentrations in the input domain for the training and validation of the ANN model, and glucose concentration (g L−1) was used as the model output. Covering the input domain in this way is a very important issue, as neural networks have very limited extrapolation properties. As a consequence, an accurate nonlinear model was obtained, which provided an optimal region of glucose yield (%) during the enzymatic hydrolysis of sugarcane bagasse.

MATERIAL AND METHODS
Substrate
Sugarcane (Saccharum officinarum) was grown and mechanically harvested in 2006, and bagasse was obtained from the sugar plant Usina da Pedra, located in Serrana, São Paulo, Brazil. This material was dried outdoors for 72 h and, to obtain more uniform particles, was milled in a knife mill (Wiley Mill, Philadelphia, PA, USA, model 3) and in a hammer mill (General Electric, Plainville, CT, USA), for 10 min in each mill, and then screened using a Tyler 35 sieve (Paulinia, SP, Brazil). The dry matter (DM) content was approximately 95% (w/w).

Pretreatment
Bagasse was pretreated with alkaline hydrogen peroxide. The pretreatment was performed under the conditions determined as optimal for this process, with a bagasse concentration of 8% (w/w), 11% (v/v) of hydrogen peroxide and pH adjusted to 11.5 with sodium hydroxide. The pretreatment solution was incubated in an orbital shaker (Marconi, Piracicaba, SP, Brazil, MA-832), agitated at 150 rpm, at 25 °C for 1 h.17,18 The pretreated biomass was washed until pH 7.0 and dried at 50 °C for 24 h.

Chemical analysis of bagasse samples
Samples of raw bagasse and bagasse pretreated with alkaline hydrogen peroxide were analyzed to determine chemical composition. Samples of approximately 4 g were extracted with 95% ethanol for 12 h in a Soxhlet apparatus.19 For determination of ash content, samples of 1 g were burned in a muffle furnace at 575 °C for 4 h.20

Extracted bagasse samples were hydrolyzed with 72% sulfuric acid at 30 °C for 1 h (300 mg of sample and 3 mL of sulfuric acid). The acid was diluted to a final concentration of 4% (addition of 84 mL of water) and the mixture heated at 125 °C, 1 atm for 1 h.21 The residual material was cooled and filtered through previously dried filter paper.

The acid-insoluble lignin was measured as the weight of insoluble residue remaining at 105 °C. The acid-soluble lignin was measured by UV–Vis spectroscopy (Mini-1240 Shimadzu) at 205 nm.22 A solution of 4% H2SO4 was utilized as blank. The total lignin content was the sum of acid-insoluble and acid-soluble lignin.

Carbohydrates in biomass were determined using the soluble fraction. Glucose, xylose and arabinose were determined by high performance liquid chromatography (HPLC; Waters Corporation,
Massachusetts, USA) equipped with a refractive index detector. The separation was performed in a Sugar-Pak I column (Waters Corporation) at 70 °C with a flow rate of 0.5 mL min−1, using filtered deionized water as the mobile phase. The sample was centrifuged and then filtered through a 0.2 µm filter (Acrodisc), and a volume of 10 µL was injected.

Acetyl content was determined using a Biorad HPX87H column at 45 °C, eluted at 0.55 mL min−1 with 0.01 mol L−1 sulfuric acid. Acetyl groups were detected in a 65 °C temperature-controlled RI detector (Knauer, Berlin, Germany, HPLC pump and detector).

Glucose, xylose, arabinose, and acetic acid were used as external calibration standards. Sugar loss by acid degradation was considered using the Sugar Recovery Standards as suggested by the NREL method.21 The factors used to convert sugar monomers to anhydromonomers were 0.90 for glucose and 0.88 for xylose and arabinose. Acetyl content was calculated as the acetic acid content multiplied by 0.7. These factors were calculated based on water addition to polysaccharides during acid hydrolysis. Table 1 shows the composition of the bagasse untreated and pretreated with alkaline hydrogen peroxide.

Table 1. Chemical composition of untreated bagasse and bagasse pretreated with alkaline hydrogen peroxide

Components      Untreated bagasse (%)    Bagasse pretreated with H2O2 (%)
Ash             1.79 ± 0.02              –
Extractives     3.25 ± 0.2               –
Total lignin    25.10 ± 0.5              9.87 ± 0.4
Glucan          37.35 ± 0.5              60.09 ± 0.9
Xylan           23.66 ± 0.9              16.58 ± 0.6

Enzymatic hydrolysis performed with variation on enzymes loading
The enzymatic hydrolysis of pretreated bagasse was performed in 250 mL Erlenmeyer flasks, containing a 100 mL mixture of citrate buffer and solid substrate with pH adjusted to 4.8. The substrate concentration in the hydrolysis assays was 3% (w/v).

The effect of enzyme loading was evaluated using cellulase from Trichoderma reesei (Sigma-Aldrich, Steinheim, Germany, ATCC 26921) and β-glucosidase from Aspergillus niger (Novozym 188). The values of the enzyme concentrations were simultaneously varied based on a 2² plus star configuration central composite design. The flasks were incubated in an orbital shaker (Marconi MA-832) agitated at 100 rpm at 50 °C. The reaction time was set at 72 h and, periodically, aliquots were taken, boiled to deactivate the enzymes and evaluated for sugar content.

The concentrations of cellulase and β-glucosidase, given in FPU L−1 and CBU L−1, respectively, were varied according to the design matrix described in Table 2.

Enzymatic activities
Cellulase activity was determined as filter paper units per milliliter (FPU mL−1), as recommended by the International Union of Pure and Applied Chemistry.23,24 β-glucosidase activity was determined through a solution of cellobiose 15 mmol L−1 and expressed in units per milliliter (CBU mL−1).25 Enzyme activity was 64.11 FPU mL−1 for cellulase and 308.37 CBU mL−1 for β-glucosidase.

Sugar analysis by HPLC
Samples were filtered through a 0.45 µm filter and the content of monosaccharides (D-glucose, D-xylose and L-arabinose) was quantified using an HPLC system (Waters Corporation) equipped with a refractive index detector. The separation was performed in a Sugar-Pak I column (Waters Corporation) at 70 °C with deionized water as eluent at a flow rate of 0.5 mL min−1.

The maximum yield of glucose was calculated using

$$\text{Glucose yield (\% theoretical maximum)} = \frac{\text{g of glucose by HPLC}}{\text{g of glucan}} \times 0.90 \times 100 \qquad (1)$$

where 0.90 is the factor used to convert sugar monomers to anhydromonomers.

ANN MODEL DEVELOPMENT
This section presents the considerations required to develop a modeling technique based on ANN. A multilayer perceptron neural network (MLP) was used in this work, mainly for its easily understandable architecture and simple mathematical form, which results in a simple tool for modeling and optimization. Important elements of ANNs are model structure (architecture) and ANN training, as will be discussed below.

ANN structure selection
An MLP with one hidden layer of sigmoidal neurons and a layer of linear output neurons was employed. This structure has nonlinear processing capabilities and the universal approximation property,26 and has already been used successfully to describe the dynamic behavior of biotechnological processes.15,27

An MLP consists of three types of layers: an input layer, one or more hidden layers and an output layer, whose numbers of neurons are N, M and K, respectively. Each layer may have a different number of neurons, which are interconnected by adjustable parameters (weights and biases) associated with them. The relationship is given mathematically as:

$$g_k = F\left[\sum_{j=1}^{M} W_{kj}\, f\left(\sum_{i=1}^{N} w_{ji} x_i + \theta_j\right) + b_k\right] \quad (j = 1, \ldots, M);\ (k = 1, \ldots, K) \qquad (2)$$

where wji is the weight connecting the ith neuron in the input layer and the jth neuron in the hidden layer; θj is the bias of the jth neuron in the hidden layer; Wkj is the weight connecting the jth neuron in the hidden layer and the kth neuron in the output layer; bk is the bias of the kth neuron in the output layer; f(·) and F(·) are the activation functions of the jth neuron in the hidden layer and of the kth neuron in the output layer, respectively.

Selection of the input domain using DOE and training
The available process information that can be used for building ANN models consists of the measurable process output and the process inputs. A problem which arises is to select, from the amount of information available, the appropriate information for the ANN inputs. When ANNs are used to build inferential models, a study of their input domain is crucial in order to provide a reduction in the dimension of the input space, which can remarkably reduce the time needed for training.28
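As a hedged illustration of Equation (2), the following minimal Python sketch evaluates a one-hidden-layer MLP with a linear output layer. The logistic form of the hidden activation is an assumption (the text only says "sigmoidal"), and the function names and the toy parameter values are hypothetical. The helper `count_parameters` simply counts weights and biases; for three inputs, 11 hidden neurons and one output it gives 56, the number of scalar parameters reported for the network selected in this work.

```python
import math


def logistic(z):
    """Sigmoidal hidden activation f(.); the logistic form is an assumption."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w, theta, W, b):
    """Evaluate Equation (2) with a linear output activation F.

    x     : list of N inputs
    w     : M x N hidden-layer weights, w[j][i]
    theta : M hidden-layer biases
    W     : K x M output-layer weights, W[k][j]
    b     : K output-layer biases
    """
    hidden = [
        logistic(sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) + theta_j)
        for w_j, theta_j in zip(w, theta)
    ]
    return [
        sum(W_kj * h_j for W_kj, h_j in zip(W_k, hidden)) + b_k
        for W_k, b_k in zip(W, b)
    ]


def count_parameters(n_inputs, n_hidden, n_outputs):
    """Number of weights and biases in a one-hidden-layer MLP."""
    return n_hidden * (n_inputs + 1) + n_outputs * (n_hidden + 1)


# Toy network with made-up parameters (N = 2, M = 2, K = 1).
toy_output = mlp_forward(
    x=[0.5, 0.5],
    w=[[1.0, -1.0], [0.0, 0.0]],
    theta=[0.0, 0.0],
    W=[[2.0, 4.0]],
    b=[1.0],
)
print(toy_output)
print(count_parameters(3, 11, 1))
```

Reproducing the published model would additionally require the scaling of inputs and outputs used by the authors, which this sketch does not attempt.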
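The enzyme levels of the 2² plus star central composite design listed in Table 2 can be approximately reproduced from the axial extremes of each factor. The sketch below is only an illustration: the function name is hypothetical, the axial distance of √2 is inferred from the coded value 1.4142 in Table 2, and the mapping from coded to real units is assumed to be linear. Under these assumptions the factorial levels come out within about 0.5 units of the tabulated values (for example 174.5 versus 174.0 FPU L−1), which is consistent with rounding.

```python
import math


def ccd_two_factor(axial_low, axial_high, n_center=3):
    """Coded and real levels of a two-factor central composite design.

    axial_low, axial_high : real values of each factor at the star points
    n_center              : number of center-point replicates
    """
    alpha = math.sqrt(2.0)  # axial distance, inferred from the coded value 1.4142
    center = [(lo + hi) / 2.0 for lo, hi in zip(axial_low, axial_high)]
    step = [(hi - lo) / (2.0 * alpha) for lo, hi in zip(axial_low, axial_high)]
    coded = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0),
             (-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
    coded += [(0.0, 0.0)] * n_center
    real = [tuple(c + s * x for c, s, x in zip(center, step, point))
            for point in coded]
    return coded, real


# Axial extremes taken from Table 2: cellulase 50-900 FPU/L, beta-glucosidase 0-1500 CBU/L.
coded, real = ccd_two_factor(axial_low=(50.0, 0.0), axial_high=(900.0, 1500.0))
for trial, (x, r) in enumerate(zip(coded, real), start=1):
    print(trial, [round(v, 4) for v in x], [round(v, 1) for v in r])
```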
Table 2. Experimental design (coded levels in parentheses) and results of the 2² plus star configuration central composite design, including three replicates at the center point

Trial     Cellulase ECell (FPU L−1)    β-glucosidase Eβ−glu (CBU L−1)    Glucose G (g L−1)    Glucose yield Yield (%)    Production rate (g L−1 h−1)
1         174.0 (−1)                   220.0 (−1)                        16.59                82.81                      0.230
2         174.0 (−1)                   1280.0 (+1)                       15.73                78.52                      0.218
3         775.0 (+1)                   220.0 (−1)                        16.26                81.19                      0.226
4         775.0 (+1)                   1280.0 (+1)                       17.10                85.42                      0.238
5         50.0 (−1.4142)               750.0 (0)                         8.57                 42.76                      0.119
6         900.0 (+1.4142)              750.0 (0)                         19.66                98.15                      0.273
7         475.0 (0)                    0.0 (−1.4142)                     12.72                63.51                      0.177
8         475.0 (0)                    1500.0 (+1.4142)                  18.65                93.08                      0.259
9 (C)     475.0 (0)                    750.0 (0)                         19.67                98.21                      0.273
10 (C)    475.0 (0)                    750.0 (0)                         19.66                99.39                      0.273
11 (C)    475.0 (0)                    750.0 (0)                         19.91                99.43                      0.277
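The yield and production-rate columns of Table 2 can be checked against Equation (1) with a short calculation. This is a sketch under stated assumptions, not the authors' procedure: the glucan basis is taken as the 3% (w/v) substrate load (30 g L−1) multiplied by the 60.09% glucan content of pretreated bagasse from Table 1, and the production rate is assumed to be the final glucose concentration divided by the 72 h reaction time. The function names are hypothetical.

```python
def glucose_yield(glucose_g_per_l, solids_g_per_l=30.0, glucan_fraction=0.6009):
    """Equation (1): glucose yield as a percentage of the theoretical maximum.

    The glucan basis (solids load x glucan fraction) is an assumption drawn
    from the 3% (w/v) substrate load and the Table 1 composition.
    """
    glucan_g_per_l = solids_g_per_l * glucan_fraction
    return glucose_g_per_l / glucan_g_per_l * 0.90 * 100.0


def production_rate(glucose_g_per_l, hours=72.0):
    """Average glucose production rate (g/L/h) over the reaction time (assumed definition)."""
    return glucose_g_per_l / hours


# Glucose concentrations of trials 1, 5 and 6 in Table 2.
for trial, glucose in [(1, 16.59), (5, 8.57), (6, 19.66)]:
    print(trial, round(glucose_yield(glucose), 2), round(production_rate(glucose), 3))
```

With these assumptions the computed yields agree with the tabulated ones to within a few hundredths of a percentage point.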
Design of experiments (DOE) allows one to optimize the input domain by reducing the number of experiments and ensuring that the dataset is statistically well distributed. Thus, this dataset should be sufficient to build effective ANN models.29 This approach has been used recently in areas of biotechnological process optimization,30 where statistical techniques such as multivariate regression models have traditionally been used.

In this work, in order to obtain data for the training of the neural network, the inputs are distributed according to a DOE within the enzyme concentration intervals. These intervals were defined based on prior knowledge of enzymatic hydrolysis of sugarcane bagasse.17,18 The aim was to include as learning patterns those data that contain most of the important information about enzymatic hydrolysis.

The DOE was carried out using a central composite design (CCD), consisting of two factors (concentrations of cellulase (ECell) and β-glucosidase (Eβ−glu)) at three levels with three replicates at the center point. The DOE matrix is shown in Table 2. The response variable (glucose concentration) at the reaction time after which no significant changes in this variable were detected (72 h) is also shown in Table 2, together with the corresponding glucose yield, Yield (%), and production rate (g L−1 h−1). Nevertheless, to include the kinetic behavior in the data-driven identification of the process, ANN training was carried out using glucose concentration profiles, G (g L−1), for each trial in the DOE matrix (Fig. 1). A PCHIP routine (piecewise cubic Hermite interpolating polynomial) implemented in Matlab (MathWorks, Natick, MA, USA) was used to fit the experimental data for G with the purpose of increasing the dataset by interpolation. Thus, each trial used for training provides 60 interpolated data points. Tests have shown that this procedure led to good results when dynamic processes were identified using ANN.15

Figure 1. Experimental glucose concentration values, Glucose (g/L) versus Time (h), for each trial in the DOE matrix (Table 2). Trial 9 is the average value of the three replicates at the center point.

Table 3 details the input (ECell, Eβ−glu and t) and the output (G) vectors used to carry out the ANN training. In this table, the dataset from trial 9 is an average value of the three replicates at the center point of the DOE matrix (Table 2).

A representative dataset containing 480 input/output patterns was presented in a randomized sequence to the neural network for the estimation of its parameters (weights and biases). The predictive capability of the neural network (validation) was assessed on a different set of experimental observations, comprising the experimental data from trial 2, which was randomly chosen from the DOE matrix.

In this study, both input and output data to the ANN were scaled in the interval [0.1, 0.9]. Small random values were used for the initialization of weights and biases. Subsequently, the standard backpropagation learning algorithm,31 based on a gradient descent method implemented in FORTRAN, was
employed to train the ANN. This algorithm makes use of two adjustable terms, the learning rate, η, and the momentum coefficient, α, in order to stabilize convergence and accelerate the optimization of the ANN parameters.32

Table 3. Input and output vectors used for the ANN training and validation

          Input vectors                                                                Output vector
Trial     ECell (FPU L−1)           Eβ−glu (CBU L−1)           t (h)                  G (g L−1)
          {ECell i, i = 1...60}     {Eβ−glu i, i = 1...60}     [60 discrete data]     [60 discrete data]
1         ECell i = 174.0           Eβ−glu i = 220.0           [0.2 ... 72.0]         [0.299 ... 16.587]
2∗        ECell i = 174.0∗          Eβ−glu i = 1280.0∗         [0.2 ... 72.0]∗        [0.358 ... 15.728]∗
3         ECell i = 775.0           Eβ−glu i = 220.0           [0.2 ... 72.0]         [1.134 ... 16.260]
4         ECell i = 775.0           Eβ−glu i = 1280.0          [0.2 ... 72.0]         [0.879 ... 17.104]
5         ECell i = 50.0            Eβ−glu i = 750.0           [0.2 ... 72.0]         [0.040 ... 8.570]
6         ECell i = 900.0           Eβ−glu i = 750.0           [0.2 ... 72.0]         [0.808 ... 19.659]
7         ECell i = 475.0           Eβ−glu i = 0.0             [0.2 ... 72.0]         [0.151 ... 12.722]
8         ECell i = 475.0           Eβ−glu i = 1500.0          [0.2 ... 72.0]         [0.680 ... 18.647]
9 (C)     ECell i = 475.0           Eβ−glu i = 750.0           [0.2 ... 72.0]         [0.763 ... 19.747]
∗ Trial 2 is used to validate the ANN model. Here the vectors ECell, Eβ−glu, t and G contain 14 experimental points (i = 1 ... 14).

The appropriate number of neurons in the hidden layer was found by the cross-validation technique in order to avoid model over-fitting and to achieve good generalization from the training dataset. This technique splits the data sample into a training dataset and a validation dataset. Then neural networks with different numbers of hidden nodes are trained with the training dataset, and their performances are evaluated on the ability to make correct predictions of the validation dataset in terms of mean square error (MSE):

$$\mathrm{MSE} = \sum_{k=1}^{K} (d_k - g_k)^2 \qquad (3)$$

In Equation (3), gk is the prediction of the neural network and dk is the desired output, which in this study is glucose concentration G (g L−1).

Figure 2 illustrates the general framework of the model-based approach proposed in this work. The concentrations of cellulase, ECell (FPU L−1), β-glucosidase, Eβ−glu (CBU L−1), and the hydrolysis time, t (h), were the inputs of the MLP neural network model, and glucose concentration, G (g L−1), was the model output.

Figure 2. General framework of the model-based approach used to optimize the enzymatic hydrolysis of sugarcane bagasse. (Diagram: the enzyme concentrations ECell and Eβ-glu, generalizable to {E1 … En}, and the time t are the ANN inputs; G (g/L) is the output; the enzymatic hydrolysis is carried out under optimal pretreatment conditions and optimal T, pH and N.)

The purpose of this approach was to obtain an optimal region of glucose yield, Yield (%), through the determination of the optimum ratio of ECell (FPU L−1) and Eβ−glu (CBU L−1). In the current optimization problem of determining the optimum enzyme concentrations, the effectiveness of the pretreatment and the optimal physical parameters of hydrolysis (temperature, T, pH and agitation, N) are known. Furthermore, it is worthwhile mentioning that the systematic approach developed can also be used to build models with more than two inputs (in the case of this work, enzyme concentrations {E1 ... En}). In order to accomplish this, the data-driven identification procedure must be carried out adequately with a suitable set of data representative of the process to be studied. This requirement was fulfilled in the present case.

RESULTS AND DISCUSSION
Results of the DOE
Development of the data-driven technique based on ANN and DOE started with selection of the appropriate information to build the inferential model.

Initially, the results of experiments obtained utilizing a central composite design (CCD) with three replicates at the center point were analyzed by considering glucose yield after hydrolysis of pretreated bagasse as response variable. The CCD matrix is seen in Table 2. The ranges of the input variables (ECell and Eβ−glu) were selected based on the results of preliminary studies described elsewhere.17,18

It can be seen from Table 2 that, in the operational conditions used in this work, maximum glucose release was obtained with ECell = 475.0 FPU L−1 and Eβ−glu = 750.0 CBU L−1, corresponding to the center point. Trial 6 (ECell = 900.0 FPU L−1 and Eβ−glu = 750.0 CBU L−1) also led to high values of glucose yield.

Statistical analysis of the data obtained in the CCD was performed using the software Statistica 7.0 (Statsoft, Inc., Tulsa, OK). A second-order statistical model was obtained:

$$\text{Glucose yield} = 99.01 + 10.45\,x_1 - 12.38\,x_1^2 + 5.22\,x_2 - 8.46\,x_2^2 + 2.13\,x_1 x_2 \qquad (4)$$

where x1 and x2 are the coded values of cellulase and β-glucosidase concentrations.
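To make Equation (4) concrete, the sketch below evaluates the second-order model at a few coded design points and compares the predictions with the measured yields of Table 2. The function name and the selection of points are illustrative choices, not part of the original analysis; the coefficients and observed yields are taken from Equation (4) and Table 2.

```python
import math


def yield_model(x1, x2):
    """Second-order model of Equation (4) in coded variables x1, x2."""
    return (99.01 + 10.45 * x1 - 12.38 * x1 ** 2
            + 5.22 * x2 - 8.46 * x2 ** 2 + 2.13 * x1 * x2)


a = math.sqrt(2.0)  # coded axial distance (1.4142 in Table 2)

# (trial, coded x1, coded x2, observed yield in % from Table 2)
points = [
    (4, 1.0, 1.0, 85.42),
    (5, -a, 0.0, 42.76),
    (6, a, 0.0, 98.15),
]

residuals = {}
for trial, x1, x2, observed in points:
    predicted = yield_model(x1, x2)
    residuals[trial] = observed - predicted
    print(trial, round(predicted, 2), observed, round(residuals[trial], 2))
```

The residuals at the axial points are large compared with the spread of the center-point replicates, which illustrates the kind of misfit a lack-of-fit test is designed to detect.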

The Pareto chart (Fig. 3) shows the magnitude of the effects on glucose yield. In this chart, the effect estimates divided by their standard errors are sorted from the largest absolute value to the smallest absolute value. The magnitude of each effect is represented by a column, and a line going across the columns indicates how large an effect must be to be considered statistically significant. In this work the vertical line corresponds to a p-value of 0.05, that is, a 95% confidence level. It can be seen from Fig. 3 that all the effects are statistically significant and the largest effects on glucose yield are the linear and quadratic effects of cellulase concentration.

Figure 3. Pareto chart of standardized effects for the glucose yield (%). Standardized effect estimates: (1) ECell (L), 42.65096; ECell (Q), −42.4282; Eβ-glu (Q), −28.9883; (2) Eβ-glu (L), 21.30082; 1L by 2L, 6.14622. The reference line is drawn at p = 0.05.

The ANOVA is shown in Table 4. From this table it can be concluded that the model does not fit the experimental data well. The calculated F value (lack of fit/pure error) is higher than the critical F value at 95% confidence (i.e. at this confidence level, there is evidence of lack of fit for the model, Equation (4)). Furthermore, the results show that the model accounted for a low percentage of the explained variance (R2 = 68.0%), and the calculated regression F value (regression/residual) is lower than the critical F value at 95% confidence, indicating that the regression is statistically not significant.

Table 4. ANOVA for the model describing glucose yield

Source of variation    Sum of squares    Degrees of freedom    Mean square    F-ratio
Regression             2119.1            5                     423.0          2.11∗
Residual               1003.1            5                     200.6
Lack of fit            1002.1            3                     334.1          695.4∗∗
Pure error             0.961             2                     0.480
Total                  3121.3            10
F listed values (95% of confidence): ∗ F5,5 = 5.5; ∗∗ F3,2 = 19.16
Percentage of explained variance (R2) = 68.0; percentage of explicable variance = 99.9.
∗ F test for statistical significance of the regression = regression/residual.
∗∗ F test for lack of fit = lack of fit/pure error.

The statistical model obtained for the data considered in this work was not significant due to the high non-linearity of the enzymatic hydrolysis process. Moreover, as the glucose yield has been determined over a wide range of enzyme concentrations, a second-degree polynomial was not capable of describing the glucose yield in the range considered. As an alternative to the statistical techniques, an ANN model was implemented.

Results of ANN training and validation
The potential of different ANN structures with a single hidden layer for estimation of glucose concentration, G (g L−1), was assessed. The validation dataset was assembled to evaluate the performance of the trained ANN. The ECell (FPU L−1), Eβ−glu (CBU L−1) and t (h) vectors corresponding to Trial 2 in Table 3 were used in the validation set. During the validation process, the number of hidden neurons was varied from 5 to 20, and the optimal number was chosen by the cross-validation criterion with the number of epochs fixed at 2000 for all the structures studied. The neural network with 11 hidden neurons for G (g L−1) was found to give the lowest MSE for the validation sample. The glucose concentration was then modeled using an ANN with 56 scalar parameters (weights and biases). The learning rate, η, and the momentum coefficient, α, used in this work were optimized to 0.95 in the backpropagation learning.

The ANN model selected for this work is illustrated in Fig. 4 (using the notation from Equation (2)). Table 5 shows the final optimized parameters (weights and biases) of the model. This table contains the additional information required to reproduce the behavior of glucose concentration using the ANN model for a broad range of enzyme loadings (see Table 3).

The performance of the optimal ANN model in describing the experimental glucose concentration (g L−1) for the training and validation data sets is shown in Fig. 5. As shown in Fig. 5(a) and (b) (dataset corresponding to Trials 5 and 8 in Table 3), the model effectively tracks the desired trajectory of experimental observations. For the validation data set (Trial 2), it can be seen from Fig. 5(c) that the model also described the experimental observations accurately.

The quest for a more rigorous evaluation of a good modeling tool leads to the use of additional performance measures (criteria). RSD (residual standard deviation), written as a percentage of the average of the experimental values, dk, defined by Equation (5), was used to evaluate the prediction quality of the ANN model.

$$\mathrm{RSD}(\%) = \frac{\left[\dfrac{1}{n}\sum_{k=1}^{n} (d_k - g_k)^2\right]^{0.5}}{\bar{d}_k} \times 100 \qquad (5)$$

where dk is the experimental observation, gk is the value predicted by the ANN model and n is the number of points.

In this work, the RSD(%), the correlation coefficient (R2) and the mean square error (MSE) were used to evaluate the performance of the ANN model, as can be seen in Table 6. From these criteria,
θ1
w1,1
w1,2 + f1(•)
w1,3
x1 = ECell θ2
w2,1 θ1
W1,1
x2 = Eβ-gluc
w2,2 + f2(•) W1,2 + F1(•) g1 = G
•
•
x3 = t w2,3 •
W1,11
•
θ11 •
•
w11,1
w11,2 + f11(•)
w11,3
Input layer Hidden layer Output layer
Figure 4. Optimal ANN structure used for prediction of glucose concentration.
Table 5. Optimized parameters (weights and bias) of the ANN model
Parameters connecting the input and hidden neurons Parameters connecting the hidden and output neurons
wj1 wj2 wj3 θj W1j b1 = 2.076
j=1 −1.827 −0.7160 −1.072 −0.3798 8.284 × 10−2

j=2 2.613 −4.558 −6.729 0.9368 −2.254
j=3 −2.752 −0.2179 4.301 × 10−2 −0.2609 0.1154
j=4 −3.072 −4.270 16.92 −6.356 −2.222
j=5 −1.849 −1.653 −0.9896 0.2454 −0.5380
j=6 −1.929 −0.9366 −0.6441 −1.608 0.8346
j=7 −0.2829 −22.82 1.053 0.5961 −10.46
j=8 −2.072 −4.927 2.043 3.595 −1.395
j=9 −1.784 −1.903 −1.898 0.5293 −0.8095
j = 10 −4.009 3.465 −4.913 6.030 −2.993
j = 11 −5.663 −4.807 × 10−2 5.377 7.509 2.676
it was concluded that for the training and validation (marked in bold) datasets, the model showed similar performance, as judged by the MSE and RSD(%). Furthermore, in all cases R² was close to unity, indicating a good fit of the model to the experimental data. Finally, it is worth mentioning that these results have shown that the dynamic model developed in this work can infer directly the glucose release during enzymatic hydrolysis from a knowledge of enzymatic loadings.

Table 6. Statistical criteria used to characterize the prediction quality of the ANN model

Trial        MSE     RSD(%)   R²
1            26.8    17.0     0.98
2 (Valid.)   16.8    14.4     0.97
3            13.9    10.4     0.97
4            13.4    10.0     0.98
5            8.0     18.7     0.99
6            49.9    17.4     0.98
7            3.2     8.8      0.99
8            36.4    15.8     0.98
9 (C)        19.6    11.6     0.99

Prediction of optimal enzyme loading using the ANN model
Prediction of the optimal region of glucose yield is one of the main objectives in this study. Thus, after ANN model validation is completed, it can be used to find processing inputs, i.e. the optimum ratio of Ecell and Eβ−glu that gives the optimal glucose yield. The conversion of glucose concentration, G (g L−1) to glucose yield, Yield (%) is straightforward using Equation (1).
A three-dimensional response surface of glucose yield was obtained by keeping the time variable, t (h) constant at 72 h (at which time no significant changes in yield were detected). The response surface is shown in Fig. 6(a). It can be seen that the surface is strongly nonlinear and reveals a single maximum,

Figure 5. Experimental data (filled triangles) and performance of the ANN model for the glucose concentration using: (a) and (b) training dataset (corresponding to Trials 5 and 8 in Table 3) and (c) validation dataset (corresponding to Trial 2 in Table 3). [Plots: glucose (g/L) versus time (h), 0–80 h, panels (a)–(c).]

Figure 6. (a) Response surface and (b) contour plot generated by ANN model showing the effect of cellulase and β-glucosidase activities on glucose yield at 72 h.
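A response surface or contour plot such as Figure 6 can be tabulated by evaluating a fitted model on a grid of the two enzyme loadings at a fixed time and keeping the grid points whose predicted yield exceeds a chosen threshold. The following is a minimal sketch of such a scan. The function `toy_yield` is a made-up concave surface in arbitrary units, used only so that the example runs; it is not the ANN of this work, and the grid limits and the names `region_above`, `cell_grid` and `bgl_grid` are likewise hypothetical.

```python
def toy_yield(e_cell, e_bgl):
    """Placeholder surface (NOT the ANN of this work), arbitrary units.

    A concave paraboloid peaking at 100 at (50, 50), for illustration only.
    """
    return 100.0 - ((e_cell - 50.0) / 10.0) ** 2 - ((e_bgl - 50.0) / 20.0) ** 2


def region_above(model, cell_grid, bgl_grid, threshold):
    """Bounding ranges of the grid points where model(...) > threshold.

    Returns ((cell_min, cell_max), (bgl_min, bgl_max)), or None if no
    grid point exceeds the threshold.
    """
    hits = [(c, b) for c in cell_grid for b in bgl_grid
            if model(c, b) > threshold]
    if not hits:
        return None
    cells = [c for c, _ in hits]
    bgls = [b for _, b in hits]
    return (min(cells), max(cells)), (min(bgls), max(bgls))


# Hypothetical grids: 0 to 100 in steps of 5 (arbitrary units).
cell_grid = [5.0 * i for i in range(21)]
bgl_grid = [5.0 * i for i in range(21)]

region = region_above(toy_yield, cell_grid, bgl_grid, threshold=99.0)
print(region)
```

In practice the placeholder would be replaced by the validated model evaluated at the fixed hydrolysis time, and the grid resolution sets how finely the boundaries of the high-yield region are resolved.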
which represents the optimal ratio of Ecell to Eβ−glu for maximum glucose yield. This behavior, confirmed by the validation results of the ANN-based predictive model, has shown that the addition of Eβ−glu to all concentrations of Ecell tested does not necessarily ensure a gradual increase in Yield.
Figure 6(b) shows the optimal processing conditions to maximize glucose yield. This figure shows a contour plot of Yield as a function of Ecell and Eβ−glu. It can be seen that the maximum yield is not obtained for the maximum enzyme loading. Analysis of the contours indicated that glucose yield above 99.0% could be obtained in the ECell range 460.0–580.0 FPU L−1 (15.3–19.3 FPU g−1 bagasse) with supplementation of Eβ−glu in the range 750.0–1140.0 CBU L−1 (25–38 CBU g−1 bagasse). Concentrations of enzyme below and above these ranges led to reduced yields.
The amount of enzymes required for high yields in enzymatic hydrolysis is directly related to the efficiency and type of pretreatment. Thus, different pretreatments lead to different changes in the factors that cause resistance of lignocellulosic materials to enzymatic hydrolysis, such as the content of lignin, the presence of acetyl groups, cellulose crystallinity, polymerization degree, surface area, volume and particle size.33
Comparison of the results obtained in this work with those obtained by other authors shows that combining an adequate pretreatment agent with optimization tools is a successful method to achieve high yields with minimum enzyme loading.
Krishna34 pre-treated sugarcane bagasse with 1% alkaline hydrogen peroxide, resulting in glucose yields of 70%, using an enzyme loading of 40 FPU g−1 biomass.
Studies carried out by Martin35 have shown that bagasse pretreated with steam at 250 °C for 10 min, after impregnation with sulfuric acid 1% (w/w), can provide a glucose yield of 35.9% using an enzyme loading of 41.4 FPU g−1 dry pre-treated biomass of cellulase and 39 IU g−1 dry pre-treated biomass of β-glucosidase.
According to Zhao,36 pretreating bagasse with 10% NaOH at 90 °C for 1.5 h and further delignifying with 10% peracetic acid at 75 °C for 2.5 h led to a yield of reducing sugars of 92.04% and a glucose yield of 56.23% after enzymatic hydrolysis for 120 h with cellulase loading of 15 FPU g−1 solid.
Mesa37 studied the organosolv pretreatment of sugarcane bagasse. The best result in terms of glucose yield was 20.9 g glucose per 100 g sugarcane bagasse, obtained after enzymatic hydrolysis of biomass pretreated with 1.25% sulfuric acid as a catalyst for 60 min. The enzymatic loading was 15.0 FPU g−1 dry pre-treated biomass of cellulase and 15.0 IU g−1 dry pre-treated biomass of β-glucosidase.
It is worthwhile mentioning that the optimal processing condition obtained in this work is certainly specific for the raw material pretreatment method and conditions used. However, it should be emphasized that application of the model-based approach steps presented in this work to other enzymatic hydrolysis processes is straightforward.

CONCLUDING REMARKS
This work presents results from the development and testing of a modeling approach for the optimization of enzymes loading
in enzymatic hydrolysis of sugarcane bagasse. In this process, besides considering the optimal pretreatment conditions prior to enzymatic hydrolysis, it is also important to define an optimum ratio of enzymes that makes it possible to reduce enzyme loading, which is often an important requirement to provide cost-efficient second generation ethanol processes. Thus, for reliable performance prediction and optimization through modeling, a systematic model-based approach should be implemented. In this work, a methodology to fully optimize enzyme loading has been developed using artificial neural networks and design of experiments techniques that have been widely used for prediction and optimization purposes in engineering applications. Several recent works point out the potential of combining the features of both techniques to enhance prediction and optimization performance.
The complex relationship between the enzymatic activities and the glucose released in a hydrolysis reaction is not easy to identify. However, the data-driven identification methodology developed in this work can convert the correlation of these variables into a predictive mathematical model, while maximizing the amount of experimental information that can be obtained about the enzymatic process using an experimental design. Based on this methodology, an optimal neural network structure was obtained and its performance in describing the dynamic behavior of glucose release during hydrolysis was assessed. Prediction by the model using the validation dataset gave acceptable performance measures (MSE, RSD and R²), equivalent to those obtained for the training dataset.
The dynamic model developed for the enzymatic hydrolysis of sugarcane bagasse can be used not only for the prediction of glucose concentration profiles for different enzymatic loadings, but also to obtain the optimum enzymes loading that leads to optimal glucose yield. It can promote both successful hydrolysis process control and a more effective employment of enzymes. It can be seen from the graphical plots that the effectiveness of enzymatic hydrolysis, as evaluated from the glucose released, was strongly dependent on the cellulase and β-glucosidase loadings. The recommended enzyme activities from the study were: cellulase concentration in the range 460.0–580.0 FPU L−1 (which corresponds to the range 15.3–19.3 FPU g−1 bagasse) and β-glucosidase in the range 750.0–1140.0 CBU L−1 (which corresponds to 25–38 CBU g−1 bagasse). In these conditions the glucose yield was above 99%. Calculation of this optimum region may have practical consequences for improving the hydrolytic efficiency of the enzymes that may be independently controlled to accomplish a desired enzyme constraint regarding glucose yield.
Although the optimal enzymatic load is highly dependent on the pretreatment method chosen and its operational conditions, as well as on the quality of the raw biomass used, the methodology developed in this work is straightforward and can be easily applied to any combination of pretreatment/biomass employed in enzymatic hydrolysis.

ACKNOWLEDGEMENTS
The authors thank Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for financial support.

REFERENCES
1 Zaldivar J, Nielsen J and Olsson L, Fuel ethanol production from lignocellulose: a challenge for metabolic engineering and process integration. Appl Microbiol Biotechnol 56:17–34 (2001).
2 Cardona CA and Sánchez OJ, Fuel ethanol production: Process design trends and integration opportunities. Bioresource Technol 98:2415–2457 (2007).
3 Himmel ME, Ding SY, Johnson DK, Adney WS, Nimlos MR, Brady JW et al, Biomass recalcitrance: engineering plants and enzymes for biofuels production. Science 315:804–807 (2007).
4 Meyer AS, Rosgaard L and Sorensen HR, The minimal enzyme cocktail concept for biomass processing. J Cereal Sci 50:337–344 (2009).
5 Zhou J, Wang YH, Chu J, Luo LC, Zhuang YP and Zhang SL, Optimization of cellulase mixture for efficient hydrolysis of steam-exploded corn stover by statistically designed experiments. Bioresource Technol 100:819–825 (2009).
6 Kumar R and Wyman CE, Effect of enzyme supplementation at moderate cellulase loadings on initial glucose and xylose release from corn stover solids pretreated by leading technologies. Biotechnol Bioeng 102:457–467 (2009).
7 Kunamneni A and Singh S, Response surface optimization of enzymatic hydrolysis of maize starch for higher glucose production. Biochem Eng J 27:179–190 (2005).
8 Sorensen HR, Pedersen S and Meyer AS, Optimization of reaction conditions for enzymatic viscosity reduction and hydrolysis of wheat arabinoxylan in an industrial ethanol fermentation residue. Biotechnol Prog 22:505–513 (2006).
9 Kaur S, Sarkar BC and Sharma HK, Optimization of enzymatic hydrolysis pretreatment conditions for enhanced juice recovery from guava fruit using response surface methodology. Food Bioprocess Technol 2:96–100 (2009).
10 Ferreira S, Duarte AP, Ribeiro MHL, Queiroz JA and Domingues F, Response surface optimization of enzymatic hydrolysis of Cistus ladanifer and Cytisus striatus for bioethanol production. Biochem Eng J 45:192–200 (2009).
11 Montgomery DC, Design and Analysis of Experiments, 4th edn. John Wiley and Sons, New York (1997).
12 Liao XP, Xie HM, Zhou YJ and Xia W, Adaptive adjustment of plastic injection processes based on neural network. J Mater Process Technol 187–188:676–679 (2007).
13 Garcia DR, Determination of kinetics data of the pretreatment of sugarcane bagasse with alkaline hydrogen peroxide and subsequent enzymatic hydrolysis. MSc thesis, State University of Campinas, SP, Brazil (2009).
14 Paliwal M and Kumar UA, Neural networks and statistical techniques: a review of applications. Expert Syst Applic 36:2–17 (2009).
15 Rivera EC, Costa AC, Andrade RR, Atala DIP, Maugeri Filho F and Maciel Filho R, Development of adaptive modeling techniques to describe the temperature dependent kinetics of biotechnological processes. Biochem Eng J 36:157–166 (2007).
16 Rivera EC, Costa AC, Wolf Maciel MR and Maciel Filho R, Ethyl alcohol production optimization by coupling genetic algorithm and multilayer perceptron neural network. Appl Biochem Biotechnol 129–132:969–984 (2006).
17 Rabelo SC, Maciel Filho R and Costa AC, A comparison between lime and alkaline hydrogen peroxide pretreatments of sugarcane bagasse for ethanol production. Appl Biochem Biotechnol 144:87–100 (2008).
18 Rabelo SC, Maciel Filho R and Costa AC, Lime pretreatment of sugarcane bagasse for bioethanol production. Appl Biochem Biotechnol 153:139–150 (2009).
19 Sluiter A, Ruiz R, Scarlata C, Sluiter J and Templeton D, National Renewable Energy Laboratory, Golden, CO, USA, (2007). http://www.nrel.gov/biomass/analytical procedures.html#lap-010 [accessed 1 August 2008].
20 Sluiter A, Hames B, Ruiz R, Scarlata C, Sluiter J and Templeton D, National Renewable Energy Laboratory, Golden, CO, USA, (2005). http://www.nrel.gov/biomass/analytical procedures.html#lap-005 [accessed 27 August 2008].
21 Sluiter A, Hames B, Ruiz R, Scarlata C, Sluiter J and Templeton D, National Renewable Energy Laboratory, Golden, CO, USA, (2005). http://www.nrel.gov/biomass/analytical procedures.html#lap-013 [accessed 4 September 2007].
22 Hyman D, Sluiter A, Crocker D, Johnson D, Sluiter J, Black S et al, National Renewable Energy Laboratory, Golden, CO, USA, (2007). http://www.nrel.gov/biomass/analytical procedures.html#acid soluble [accessed 6 October 2008].
23 Ghose TK, Measurement of cellulase activities. Pure Appl Chem 59:257–268 (1987).

24 Adney B and Baker J, National Renewable Energy Laboratory, Golden, CO, USA, (1996). http://www.nrel.gov/biomass/analytical procedures.html#lap-006 [accessed 6 October 2006].
25 Wood TM and Bhat KM, Methods for measuring cellulase activities, in: Methods in Enzymology, ed by Wood WA and Kellog ST, Academic Press, San Diego, CA, vol 160, pp 81–112 (1988).
26 Cybenko G, Approximation by superpositions of a sigmoidal function. Math Control Signal 2:303–314 (1989).
27 Rivera EC, Farias Junior F, Atala DIP, Andrade RR, Costa AC and Maciel Filho R, A LabVIEW-based intelligent system for monitoring of bioprocesses, in Computer-Aided Chemical Engineering, ed by Jeżowski J and Thullie J. Elsevier B.V., The Netherlands, vol 26, pp 309–314 (2009).
28 Gonzaga JCB, Meleiro LAC, Kiang C and Maciel Filho R, ANN-based soft-sensor for real-time process monitoring and control of an industrial polymerization process. Comput Chem Eng 33:43–49 (2009).
29 Desai KM, Survase SA, Saudagar PS, Lele SS and Singhal RS, Comparison of artificial neural network (ANN) and response surface methodology (RSM) in fermentation media optimization: case study of fermentative production of scleroglucan. Biochem Eng J 41:266–273 (2008).
30 Bas D and Boyac IH, Modeling and optimization II: comparison of estimation capabilities of response surface methodology with artificial neural networks in a biochemical reaction. J Food Eng 78:846–854 (2007).
31 Bishop CM, Neural Networks for Pattern Recognition. Oxford University Press, Oxford (1995).
32 Balestrassi PP, Popova E, Paiva AP and Lima JWM, Design of experiments on neural network’s training for nonlinear time series forecasting. Neurocomputing 72:1160–1178 (2009).
33 Hendriks ATWM and Zeeman G, Pretreatments to enhance the digestibility of lignocellulosic biomass. Bioresource Technol 100:10–18 (2009).
34 Krishna SH, Prasanthi K, Chowdary GV and Ayyanna C, Simultaneous saccharification and fermentation of pretreated sugar cane leaves to ethanol. Process Biochem 33:825–830 (1998).
35 Martín C, Galbe M, Nilvebrant NO and Jönsson LJ, Comparison of the fermentability of enzymatic hydrolyzates of sugarcane bagasse pretreated by steam explosion using different impregnating agents. Appl Biochem Biotechnol 98–100:669–716 (2002).
36 Zhao X, Peng F, Cheng K and Liu D, Enhancement of the enzymatic digestibility of sugarcane bagasse by alkali–peracetic acid pretreatment. Enzym Microb Technol 44:17–23 (2009).
37 Mesa L, González E, Ruiz E, Romero I, Cara C, Felissia F et al, Preliminary evaluation of organosolv pre-treatment of sugar cane bagasse for glucose production: application of 2³ experimental design. Appl Energy 87:109–114 (2010).
J Chem Technol Biotechnol 2010; 85: 983–992 www.soci.org
