
Journal Pre-proofs

Modeling and optimization of microbial lipid fermentation from cellulosic ethanol wastewater by Rhodotorula glutinis based on the support vector machine

Lihe Zhang, Bin Chao, Xu Zhang

PII: S0960-8524(20)30050-X
DOI: https://doi.org/10.1016/j.biortech.2020.122781
Reference: BITE 122781

To appear in: Bioresource Technology

Received Date: 29 November 2019


Revised Date: 6 January 2020
Accepted Date: 7 January 2020

Please cite this article as: Zhang, L., Chao, B., Zhang, X., Modeling and optimization of microbial lipid fermentation
from cellulosic ethanol wastewater by Rhodotorula glutinis based on the support vector machine, Bioresource
Technology (2020), doi: https://doi.org/10.1016/j.biortech.2020.122781

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover
page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will
undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing
this version to give early visibility of the article. Please note that, during the production process, errors may be
discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Elsevier Ltd. All rights reserved.


Modeling and optimization of microbial lipid fermentation from cellulosic ethanol wastewater by Rhodotorula glutinis based on the support vector machine

Lihe Zhang, Bin Chao, Xu Zhang*

Beijing Key Lab of Bioprocess, National Energy R&D Center for Biorefinery, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China

Highlights

• The variation of organic matter during lipid fermentation was analyzed.
• BP-ANN and SVM models of lipid fermentation from ethanol wastewater were established.
• SVM outperformed BP-ANN in prediction and optimization with a small sample.
• The fermentation parameters were optimized by a genetic algorithm based on the SVM model.

Keywords

Cellulosic ethanol wastewater; Microbial lipid; Support vector machine; Artificial neural network

Abstract

To model microbial lipid production from cellulosic ethanol wastewater by R. glutinis, the biomass, lipid yield, and COD removal rate were investigated under different conditions. A genetic algorithm based on an SVM model was then used to optimize the parameters for maximum biomass. The results showed that the initial COD and glucose content had a significant effect on lipid synthesis, and that most of the organic matter in the wastewater was consumed during lipid production. Compared with BP-ANN, SVM showed better fitting and generalization ability on a small amount of experimental data. With genetic algorithm optimization based on SVM, the maximum biomass and lipid yield reached 11.87 g/L and 2.18 g/L, respectively. The results suggest that the SVM model can be used as an effective tool to optimize fermentation conditions.

1. Introduction

Biofuels, as a crucial renewable resource, have attracted worldwide attention, although opportunities and challenges coexist in their development (Baeyens et al., 2015; Wang et al., 2019; Moreno et al., 2017). In the past few decades, numerous studies have been conducted on producing biofuels from lignocellulosic biomass (e.g., bioethanol, especially cellulosic ethanol) to substitute for fossil fuels (Humbird et al., 2011; Chandel et al., 2018; Hanna, 2018; Ibarra-Gonzalez et al., 2019). Meanwhile, the increasing demand for cellulosic ethanol has driven technical progress in biofuel production. However, the production of cellulosic ethanol is often accompanied by large volumes of high-strength wastewater (Humbird et al., 2011). Depending on the production process, 6 t to 80 t of wastewater is generated per ton of ethanol produced (Wang et al., 2017). The wastewater has a high chemical oxygen demand (COD) and contains mainly sugars, organic acids, glycerin, inhibitors, and inorganic salts. Cellulosic ethanol wastewater is a typical biorefinery wastewater, characterized by high concentration, high chroma, complex composition, and low pH (Zhang et al., 2018). Without purification, it would seriously pollute the water environment and adversely affect daily life. To achieve large-scale commercial production of fuel ethanol from lignocellulose, a practical wastewater treatment solution must be developed (Sarkar et al., 2012).

However, the resource utilization of cellulosic ethanol wastewater has not been reported extensively. Previous studies have focused mainly on removing COD or on reducing costs and energy consumption (Steinwinder, 2011; Zhao and Yu, 2013). Many treatment methods for industrial wastewater have been studied (e.g., evaporation, membrane separation, single cell protein production, and electrochemical processes), but they are difficult and energy-consuming (Shan et al., 2015; Shan et al., 2015; Hu et al., 2017; Lynd et al., 2017). For cellulosic ethanol wastewater, which is rich in organic matter, the most economical and feasible method is to convert the residual sugars and organic compounds into lipid with oleaginous yeasts (Gude, 2016). As a crucial feedstock, microbial lipid can be used to produce biodiesel, biolubricant, and jet fuel by different methods (Chuck et al., 2014). However, high cost has made it difficult to commercialize biodiesel production from microbial lipid. Raw material is the main cost of microbial cultivation, accounting for more than 80% (Xue et al., 2010). According to previous research, producing microbial lipid from wastewater can effectively lower this cost. Numerous studies have reported microbial lipid production from various types of organic wastewater by oleaginous yeasts, especially R. glutinis (Xue et al., 2006; Hall et al., 2011; Ling et al., 2013; Zhou et al., 2013; Peng et al., 2013; Chen et al., 2009). In our previous study, the feasibility of using cellulosic ethanol wastewater with R. glutinis was explored, and a novel strategy for lipid production was then formulated by coupling oleaginous yeasts with activated sludge biological methods. The results suggested that using cellulosic ethanol wastewater to produce microbial lipid with R. glutinis not only reduces the cost of lipid production but also removes COD from the wastewater.

During fermentation, many process parameters, such as fermentation time, temperature, pH, and solid-liquid ratio, have an important impact on the results, and the yield can be increased by optimizing them. Existing methods for finding optimal process parameters include orthogonal experimental design (OED) (Zhu et al., 2013), response surface methodology (RSM) (Mohammadi et al., 2016), uniform design (UD) (Fang et al., 2000), artificial neural networks (ANN) (Singh et al., 2017; Boukelia et al., 2016), and support vector machines (SVM) (Pablo et al., 2013). However, OED, RSM, and UD are not very accurate when experimental data are scarce, which limits their range of application. More and more researchers have therefore turned to ANN and SVM to build models for optimizing fermentation conditions. A fermentation model provides a reliable data reference for the control and optimization of process parameters.

ANN simulates the biological nervous system. It generally consists of an input layer, a hidden layer, and an output layer; each layer contains one or more nodes, and the layers are connected by weights. An ANN needs only the input and output data of the fermentation and does not require knowledge of the reaction mechanism. Analyzing the biological nervous system from different angles has produced a variety of ANN models, among which BP-ANN and RBF-ANN are commonly used. Modeling with ANN is simple, but its training algorithm converges slowly and easily falls into local optima, so it is not well suited to small samples (Sebayang et al., 2017; Grahovac et al., 2016).

SVM is a pattern classification and nonlinear regression method based on statistical learning theory. It follows the structural risk minimization principle, which minimizes the empirical risk on the sample points while also controlling model complexity, thereby enhancing the generalization ability of the model. It has developed rapidly and has been applied successfully in many fields, such as bioinformatics, medicine, and text and handwriting recognition (Irawan et al., 2015; Guerbai et al., 2018). Compared with conventional ANN, SVM can obtain globally optimal solutions from small samples. With the rapid advancement of computer technology, SVM is now widely used across scientific disciplines.

In this study, cellulosic ethanol wastewater was used as the raw material to culture R. glutinis in order to evaluate the effect of the initial glucose concentration and COD on lipid fermentation and to investigate the changes in the major organics in the wastewater. Biomass, lipid synthesis, and the concentrations of the major organics were therefore monitored over time at different initial concentrations. Based on the fermentation data, SVM and BP-ANN models of microbial lipid fermentation were established. The better model was then combined with a genetic algorithm to find the process parameters that give the maximum biomass concentration.

2. Material and methods

2.1 Microorganism, culture conditions, and wastewater

The yeast strain R. glutinis CGMCC No. 2258 was obtained from the China National Research Institute of Food and Fermentation Industries. It was stored at 4 °C on agar slant medium containing yeast extract (4 g/L), urea (2 g/L), and glucose (200 g/L).

The seed and basic medium contained (g/L): glucose 40, (NH4)2SO4 2, KH2PO4 7, Na2SO4 2, MgSO4 1.5, and yeast extract 1.5.

The cellulosic ethanol wastewater used in this study was purchased from COFCO Corporation, China.

The wastewater was diluted to different proportions before the medium was prepared, and only glucose was then added. All media were adjusted to an initial pH of 5.5 and sterilized at 121 °C for 20 min. The inocula were cultured at 30 °C in a shaker at 180 rpm for 24 h and then transferred into 500 mL flasks containing 100 mL of medium at an inoculum size of 10% (v/v).

2.2 Analytical methods

Biomass was measured by the dry cell weight method (Zhang et al., 2014). The glucose concentration was measured with a glucose biosensor (SBA 40C, Biological Institute of Shandong Academy of Sciences). Lipid was extracted by the method of Xue et al. (2008), and the lipid components were analyzed as described previously (Zhang et al., 2014).

HPLC (Thermo Scientific, Waltham, MA, USA) was used to measure the concentrations of sugars, organic acids, and other organics, following the procedure reported by Patel et al. (2015).

2.3 Experimental design and prediction model

In this paper, MATLAB was used to establish the SVM and BP-ANN models. The biomass, fermentation time, and glucose concentration recorded in the experiments provided the data for modeling. The models were trained on these fermentation data to obtain a prediction model of lipid fermentation from cellulosic ethanol wastewater by R. glutinis.

2.3.1 Establishment and functions of SVM model

In the present study, an SVM regression model was built from the fermentation data of R. glutinis. The model is equivalent to a function map with an input and an output, as shown in Eq. (1).

$$y = f(x) \tag{1}$$

where x denotes the independent variable and y the dependent variable. In this study, the variables are the fermentation time and the concentrations of the various substances measured at different times.

(1) Data preprocessing

The selected data were normalized according to Eq. (2). In MATLAB, this normalization can be performed with the ‘mapminmax’ function, called as in Eq. (3). The mapping applied by ‘mapminmax’ is given in Eq. (4).

$$f: x \to y = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{2}$$

[y, ps] = mapminmax(x, ymin, ymax) (3)

$$y = \frac{(y_{max} - y_{min})(x - x_{min})}{x_{max} - x_{min}} + y_{min} \tag{4}$$

where x denotes the data before normalization, which in this paper are mainly the data obtained from fermentation; ymin and ymax are the range parameters of the mapping, with default values of -1 and 1, respectively; y is the normalized data; and ps is the structure that stores the normalization mapping. ymin, ymax, and ps are parameters related to the software settings.
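The mapping in Eq. (4) is simple enough to sketch outside MATLAB. The following Python version is an illustration only, not the authors' code: the function names `mapminmax_fit`, `mapminmax_apply`, and `mapminmax_reverse` are hypothetical, and the dictionary `ps` merely plays the role of the settings structure described above.

```python
def mapminmax_fit(values, ymin=-1.0, ymax=1.0):
    """Collect the settings needed to map `values` onto [ymin, ymax]."""
    xmin, xmax = min(values), max(values)
    if xmax == xmin:
        # Design choice of this sketch: refuse to scale a constant series.
        raise ValueError("cannot scale a constant series")
    return {"xmin": xmin, "xmax": xmax, "ymin": ymin, "ymax": ymax}


def mapminmax_apply(values, ps):
    """Apply Eq. (4): y = (ymax - ymin) * (x - xmin) / (xmax - xmin) + ymin."""
    scale = (ps["ymax"] - ps["ymin"]) / (ps["xmax"] - ps["xmin"])
    return [scale * (x - ps["xmin"]) + ps["ymin"] for x in values]


def mapminmax_reverse(scaled, ps):
    """Invert Eq. (4) to recover values in the original units."""
    scale = (ps["xmax"] - ps["xmin"]) / (ps["ymax"] - ps["ymin"])
    return [scale * (y - ps["ymin"]) + ps["xmin"] for y in scaled]


# Initial glucose levels (g/L) tested in Section 3.2.
glucose = [20.0, 30.0, 40.0, 50.0]
ps = mapminmax_fit(glucose)
scaled = mapminmax_apply(glucose, ps)
restored = mapminmax_reverse(scaled, ps)
```

Keeping the fitted settings separate from the mapping means the test data can be scaled with the ranges learned from the training data, and model predictions can be mapped back to g/L.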

(2) Optimization selection of model parameters

In this paper, the best penalty factor c and kernel parameter g were obtained by the cross-validation (CV) method. Function (5) from the LIBSVM-FarutoUltimate toolbox was used to carry out the CV method.

[mse, bestc, bestg] = SVMcgForRegress(train_y, train_x, cmin, cmax, gmin, gmax, v, cstep, gstep, msestep) (5)

Input: train_y is the dependent variable to be regressed, of size n by 1, where n is the number of samples; train_x is the independent variable, of size n by m, where m is the number of independent variables; cmin and cmax are the minimum and maximum of the penalty coefficient c expressed as base-2 exponents (c ranges from 2^cmin to 2^cmax), with default values of -5 and 5, respectively; gmin and gmax are the minimum and maximum of the parameter g expressed in the same way, with default values of -5 and 5, respectively; v is the CV parameter, with a default of 5; cstep and gstep are the step sizes of c and g, both with a default of 1; msestep is the step size of the MSE graph, with a default of 5.

Output: mse is the lowest mean squared error found in the CV process; bestc and bestg are the optimal parameters c and g, respectively.
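The search described above can be sketched as an exhaustive scan over a power-of-2 grid. The Python below is a minimal illustration under stated assumptions, not the toolbox implementation: `power_of_two_grid`, `kfold_indices`, `grid_search`, and `toy_cv_mse` are hypothetical names, and `toy_cv_mse` is an invented error surface standing in for the v-fold cross-validated MSE of a real SVM.

```python
import itertools
import math


def power_of_two_grid(emin=-5, emax=5, step=1):
    """Candidate values 2**emin, 2**(emin + step), ..., 2**emax."""
    count = int(round((emax - emin) / step)) + 1
    return [2.0 ** (emin + i * step) for i in range(count)]


def kfold_indices(n_samples, v=5):
    """Split range(n_samples) into v nearly equal contiguous folds."""
    base, extra = divmod(n_samples, v)
    folds, start = [], 0
    for k in range(v):
        size = base + (1 if k < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search(cv_mse, c_grid, g_grid):
    """Return (lowest_mse, best_c, best_g) over every (c, g) pair.

    `cv_mse(c, g)` is supplied by the caller and should return the
    cross-validated MSE of a model trained with those parameters.
    """
    best = None
    for c, g in itertools.product(c_grid, g_grid):
        score = cv_mse(c, g)
        if best is None or score < best[0]:
            best = (score, c, g)
    return best


def toy_cv_mse(c, g):
    """Hypothetical error surface with its minimum at c = 2, g = 0.5."""
    return (math.log2(c) - 1.0) ** 2 + (math.log2(g) + 1.0) ** 2


c_grid = power_of_two_grid(-5, 5, 1)
g_grid = power_of_two_grid(-5, 5, 1)
lowest_mse, bestc, bestg = grid_search(toy_cv_mse, c_grid, g_grid)
folds = kfold_indices(70, v=5)
```

In a real run, `cv_mse` would train the SVM on v - 1 folds and score it on the held-out fold for each pair, which is why the grid is kept coarse first and refined afterwards.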

(3) Training and regression prediction

The SVM model was trained with the best parameters c and g obtained by the CV method, and the experimental data were then predicted by regression. The SVM model in this paper was implemented with the LIBSVM toolbox, whose main functions are the training function ‘svmtrain’ and the prediction function ‘svmpredict’.

Training function ‘svmtrain’:

model = svmtrain(train_y, train_x, options) (6)

Input: train_y is the dependent variable of the training set, of size n by 1, where n is the number of samples; train_x is the independent variable of the training set, of size n by m, where m is the number of independent variables; options is the parameter option string.

Output: model is the model obtained by training.

The prediction function ‘svmpredict’:

[predict_y, mse, dec_value] = svmpredict(test_y, test_x, model) (7)

Input: test_y is the dependent variable of the test set, of size n by 1, where n is the number of samples; test_x is the independent variable of the test set, of size n by m, where m is the number of independent variables; model is the SVM model trained by the svmtrain function.

Output: predict_y is the predicted result for the test set; mse is a 3×1 column vector that, in LIBSVM, holds the classification accuracy, the mean squared error, and the squared correlation coefficient; dec_value is the decision value.
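For reference, the function that a trained support vector regression model evaluates at prediction time has the standard form below. The paper does not state the SVM type or kernel, so ε-SVR with a radial basis function kernel is an assumption here, suggested by the single kernel parameter g; the symbols α_i, α_i*, and b are the usual dual coefficients and bias, not quantities reported in the text.

```latex
f(x) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^{*}\right) K(x_i, x) + b,
\qquad
K(x_i, x) = \exp\!\left(-g \,\lVert x_i - x \rVert^{2}\right)
```

Under this form, c bounds the size of the coefficients, while g sets how quickly the influence of a training sample decays with distance, which is why the two are tuned together.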

2.3.2 Establishment and functions of BP-ANN model

The BP artificial neural network model is established in three steps: construction, training, and prediction. The MATLAB neural network toolbox includes BP-ANN, which involves three functions: ‘newff’, ‘train’, and ‘sim’. Before BP-ANN modeling, the data were also preprocessed with the function in Eq. (3).

(1) Parameter setting function ‘newff’

net = newff(P, T, S) (8)

Input: P is the input variable matrix; T is the output variable matrix; S is the number of nodes in the hidden layer. The sizes of the variable matrices were determined by the experimental data.

Output: net is the initialized BP artificial neural network.

(2) The training function ‘train’

net = train(NET, X, T, INi, OUTi) (9)

Input: NET is the network to be trained; X is the input variable matrix; T is the output variable matrix; INi is the input layer condition; OUTi is the output layer condition. In general, only the first three parameters need to be set, and the last two take their default values. The last two parameters are set only when the neural network needs to be optimized.

Output: net is the artificial neural network obtained after training.

(3) The prediction function ‘sim’

y = sim(net, x) (10)

Input: net is the trained network; x is the input data.

Output: y is the data predicted by the network.
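To make the prediction step concrete, the sketch below computes one forward pass through a network with a single hidden layer, which is what a call such as ‘sim’ evaluates for each sample. It is an illustration under assumptions: the tanh hidden activation and linear output node are assumed rather than taken from the paper, and the weights, biases, and the name `forward` are hypothetical.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tanh hidden layer followed by a linear output node."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical weights for 3 inputs -> 2 hidden nodes -> 1 output.
w_hidden = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b_hidden = [0.0, 0.1]
w_out = [1.0, -1.0]
b_out = 0.2

# Inputs are assumed to be normalized already (see Eq. (4)).
x = [0.0, 0.0, 0.0]
y = forward(x, w_hidden, b_hidden, w_out, b_out)
```

Training adjusts `w_hidden`, `b_hidden`, `w_out`, and `b_out` by back-propagating the prediction error; the forward pass itself stays the same.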

3. Results and discussion

3.1 Effects of initial COD on the fermentation of lipid

R. glutinis, an important oleaginous yeast, can accumulate lipids from various wastewaters as raw materials. However, the cellulosic ethanol wastewater used in this study contains high concentrations of inhibitors (e.g., furfural, 5-hydroxymethylfurfural, and furfuryl alcohol), which suppress the growth and lipid production of R. glutinis. To reduce this inhibition, the wastewater was diluted before fermentation, and the effects of the initial COD on lipid fermentation were explored. Before fermentation, glucose was added to the wastewater at a concentration of 40 g/L. The results are shown in Fig. 1. The growth and lipid production of R. glutinis differed with the wastewater content of the medium. Biomass and lipid production decreased markedly when the proportion of wastewater exceeded 40%. In particular, when the wastewater content reached 50%, the lipid yield was nearly zero and the glucose concentration of the medium remained essentially unchanged. At this level, the concentration of inhibitors exceeded the tolerance limit of R. glutinis, and cell growth nearly stagnated. In contrast, the curves of cell growth and lipid yield were almost identical at wastewater proportions of 25% and 33%. The cells grew rapidly before 144 h, and the biomass peaked at 192 h at 10.6 g/L and 11.12 g/L, respectively. The biomass then decreased gradually, mainly because the nutrients were exhausted and the cells began to lyse. After the fermentation, the COD removal rate exceeded 80%.

The results suggested that the inhibitor concentration was a significant factor affecting the biomass and lipid synthesis of R. glutinis. More importantly, the lipid yield obtained with wastewater as the raw material, with only glucose added, did not differ significantly from that obtained with synthetic medium (Gong, 2019). Microbial lipid fermentation by R. glutinis is therefore a practical way to treat the wastewater, since it both produces lipid and removes COD.

3.2 Effects of initial glucose concentration on the fermentation process of lipid

To obtain the maximum lipid yield and COD removal rate, R. glutinis was cultured at different initial glucose concentrations (20, 30, 40, and 50 g/L) with a wastewater content of 33%, and the biomass, lipid accumulation, and COD removal rate were measured. As shown in Fig. 2, the biomass differed significantly as the initial sugar concentration increased from 20 g/L to 50 g/L. The maximum biomass increased progressively with the initial glucose concentration, reaching 7.12 g/L, 9.13 g/L, 11.12 g/L, and 11.52 g/L, respectively. When the glucose concentration was less than 30 g/L, the glucose was consumed rapidly within 160 h, and the biomass did not accumulate sufficiently. When it was more than 40 g/L, lipid and biomass were synthesized and accumulated sufficiently, and the lipid yield at the end of fermentation exceeded 1.9 g/L. Nevertheless, at a glucose concentration of 50 g/L, the rate of glucose consumption decreased noticeably.

The results revealed that adding glucose promoted cell growth and lipid synthesis, and that it also promoted the COD reduction of the wastewater. When glucose was added to the wastewater at 40 g/L, the COD removal rate reached 84%, which would greatly relieve the pressure on wastewater treatment. Compared with other culture methods without added glucose (Wang, 2017; Zhou, 2013), the COD removal rate and lipid yield obtained in this study were more competitive. Although the lipid content of the cells was not high enough, lipid production might be further improved by fed-batch cultivation in a bioreactor.
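The text reports the COD removal rate without defining it. The conventional definition, assumed here, is the relative decrease from the initial COD, where COD_0 and COD_t are the COD of the medium before and after fermentation (both symbols are introduced for this illustration):

```latex
\text{COD removal rate} = \frac{COD_0 - COD_t}{COD_0} \times 100\%
```

Under this definition, a removal rate of 84% means that the residual COD is 16% of the initial value.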

3.3 The variations of organic matter during fermentation

The characteristics of the cellulosic ethanol wastewater were determined in a previous study (Zhang et al., 2018). Its organic components were mainly sugars, organic acids, and aldehydes. To examine the lipid fermentation process in detail, a medium containing 33% wastewater and 40 g/L glucose was used as the initial medium for lipid production, and samples were taken every 24 h during the fermentation. The concentrations of the different organic compounds in the samples were measured and analyzed; the results are shown in Fig. 3.

The biomass and lipid concentration rose over time, while the concentrations of glucose, lactic acid, acetic acid, glycerin, xylose, furfural, and furfuryl alcohol decreased. This suggests that R. glutinis can consume various substrates during fermentation, as reported by Wiebe et al. (2012) and Patel et al. (2015). Fig. 3-D shows that from 0 to 192 h, lactic acid decreased relatively slowly because the medium was rich in glucose. Once the glucose was fully consumed at 192 h, R. glutinis began to take up and use lactic acid rapidly. The trend for acetic acid was more pronounced than that for lactic acid: acetic acid decreased markedly from 0 to 24 h, and its decline then gradually slowed as the fermentation continued. These results suggest that R. glutinis can use both the lactic acid and the acetic acid in the wastewater, and that it uses acetic acid more readily than lactic acid. The same pattern was observed for xylose and glycerin: the presence of glucose hindered the use of the other substrates, and xylose and glycerin were not consumed rapidly until the glucose concentration had fallen to 0 g/L.

The results also revealed that the concentrations of citric acid and succinic acid fluctuated irregularly during the fermentation, mainly because both are intermediate products of cell growth and metabolism. The furfural and furfuryl alcohol in the wastewater decreased rapidly during the fermentation and were completely consumed after 120 h, indicating that R. glutinis has robust tolerance of, and ability to assimilate, furfural and furfuryl alcohol. For wastewater rich in complex organic matter, producing lipid and reducing COD with R. glutinis is therefore very cost-effective. The changes in organic matter content during fermentation show that the organic matter in the wastewater decreased gradually as the biomass increased.

3.4 Training and prediction based on BP-ANN and SVM model

During microbial lipid fermentation from cellulosic ethanol wastewater by R. glutinis, the reaction conditions significantly affected the biomass and the lipid yield. The lipid synthesis results showed that lipid synthesis in R. glutinis is coupled to cell growth. Accordingly, the reaction conditions were optimized to find those giving the highest biomass. In the present study, a genetic algorithm was used to optimize the fermentation conditions in two steps: the first step was the training and prediction of the model, and the second was extremum optimization based on the genetic algorithm. A fermentation model therefore had to be built from fermentation data covering a range of reaction conditions. In this study, both SVM and BP-ANN were used to build fermentation models, and the better model was then used for the genetic algorithm optimization.

The quality of the models was assessed by statistical measures, namely the coefficient of determination (R2) and the mean squared error (MSE). The MSE can be expressed as:

$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_{exp,i} - y_{pre,i}\right)^{2} \tag{11}$$

where yexp denotes the experimental value, ypre denotes the predicted value, and n is the number of samples.
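Both statistics are straightforward to compute once experimental and predicted values are paired. The sketch below follows Eq. (11) for the MSE; for R2 it assumes the common definition 1 - SS_res/SS_tot, which the paper does not spell out (LIBSVM, for instance, reports a squared correlation coefficient instead). The sample values are invented for illustration.

```python
def mse(y_exp, y_pre):
    """Mean squared error between experimental and predicted values, Eq. (11)."""
    n = len(y_exp)
    return sum((e - p) ** 2 for e, p in zip(y_exp, y_pre)) / n


def r_squared(y_exp, y_pre):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_exp = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pre))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


# Hypothetical biomass values (g/L), for illustration only.
y_exp = [2.0, 4.0, 6.0, 8.0]
y_pre = [2.5, 3.5, 6.5, 7.5]
example_mse = mse(y_exp, y_pre)
example_r2 = r_squared(y_exp, y_pre)
```

A lower MSE and an R2 closer to 1 both indicate a closer fit, so the two measures should point the same way when two models are compared on the same data.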

In this paper, the biomass was taken as the dependent variable of the models, and the volume fraction of wastewater, the concentration of supplemented glucose, and the fermentation time were taken as the independent variables. From the experiments described above, 77 sets of data on the effects of the initial glucose concentration and initial COD on lipid fermentation were collected; the data are listed in Supplementary data Table 1. Seven sets were randomly selected as test data, and the remaining 70 sets served as training data to build the models. The trained models were then used to predict the output for the test data, and the prediction results were analyzed.
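The random split of the 77 data sets into 70 training and 7 test sets can be reproduced in outline as follows. This is a sketch only: the function name `split_train_test` and the seed are hypothetical, and the paper does not say how the random draw was made.

```python
import random


def split_train_test(n_samples=77, n_test=7, seed=0):
    """Return (train_idx, test_idx) with n_test indices drawn at random."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    test_idx = sorted(rng.sample(range(n_samples), n_test))
    held_out = set(test_idx)
    train_idx = [i for i in range(n_samples) if i not in held_out]
    return train_idx, test_idx


train_idx, test_idx = split_train_test()
```

Fixing the seed matters with so few test samples, because a different draw of seven points can noticeably change the reported test error.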

To build the SVM model, the training and test data were normalized with the function in Eq. (3). The optimal parameters bestc and bestg were then obtained by the CV method with the ‘SVMcgForRegress’ function in Eq. (5). First, a rough search for bestc and bestg was conducted, with the range of both c and g set to [2^-10, 2^10]. The results of the rough search are presented in Fig. 4 A and B. The optimal parameters c and g from the rough search were 2.2974 and 4, respectively, and the minimum MSE under CV was 0.0016. Based on the rough results, the search ranges of c and g were then narrowed to [2^-4, 2^4] and [2^-5, 2^5], respectively. The results are shown in Fig. 4 C and D. The optimal parameters c and g were 1 and 8, respectively, and the minimum MSE under CV was 0.0016. Lastly, the SVM model was trained on the training data with these optimal parameters, and regression prediction was carried out on the test data. The fitting results for the training and test data are shown in Fig. 5 A and B. The curves in Fig. 5 show that the predicted data fitted the experimental data closely for both the training and the test data, which suggests that the SVM fermentation model has good generalization ability.
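Because the search grid is laid out in powers of 2, the reported optima are easier to compare with the search ranges when written as exponents. The conversion below is plain arithmetic on the values given above and adds no new results:

```latex
\begin{aligned}
\text{rough search:}\quad & c = 2.2974 \approx 2^{1.2}, & g &= 4 = 2^{2} \\
\text{fine search:}\quad  & c = 1 = 2^{0},              & g &= 8 = 2^{3}
\end{aligned}
```

The exponents 1.2 and 2 lie well inside the rough range of -10 to 10, and the exponents 0 and 3 lie inside the narrowed ranges of -4 to 4 for c and -5 to 5 for g.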

To build the BP-ANN model, the training and test data were likewise normalized with the ‘mapminmax’ function in Eq. (3). The ‘newff’ function in Eq. (8) was used to construct the BP-ANN, with the number of iterations set to 1000, the learning rate to 0.1, and the learning goal to 0.0000004. The BP-ANN was trained on the training data with the ‘train’ function in Eq. (9), so that the network could predict the biomass during fermentation. The ‘sim’ function in Eq. (10) was then called to test the network on the test data, and the fit of the network was analyzed by assessing the error between the predicted output and the test output. The predicted results are shown in Fig. 5 C and D. BP-ANN fitted the fermentation process of R. glutinis well overall, but errors remained between the predicted and actual results, and some samples showed relatively large prediction errors.

Table 1 shows that, for the SVM model, the MSE of the training data and test data was 0.0004 and 0.0018, respectively, and the R2 was 0.9959 and 0.9862, respectively. For the BP-ANN model, the MSE of the training data and test data was 0.0043 and 0.0105, respectively, and the R2 was 0.9899 and 0.9785, respectively. With only a small number of samples, the SVM model therefore performed better than the ANN model. SVM has strong potential for soft sensing of internal variables in fermentation processes and for predicting fermentation results. The results suggest that the SVM model can be used to study the complex process of lipid fermentation. Accordingly, the genetic algorithm optimization in the present study was based on the SVM model.

3.5 Optimization by genetic algorithm based on SVM

Lastly, a genetic algorithm was adopted to find the optimal parameters for maximum biomass based on the SVM model. The number of iterations, the population size, the crossover probability, the mutation probability, and the individual length were 500, 50, 0.4, 0.2 and 3, respectively. The fitness curve of the optimal individual during the optimization is plotted in Fig. 6. The fitness value of the optimal individual found by the genetic algorithm was 11.8723, and the optimal individual was [32.6048 46.2636 221.0520]. The results revealed that the biomass peaked at 11.87 g/L, an increase of 5%, and the lipid content was 2.18 g/L, at a wastewater volume fraction of 32.6%, a sugar content of 46.2 g/L, and a fermentation time of 221 h.
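The optimization loop can be sketched in Python as follows, using the settings listed above (500 iterations, population of 50, crossover probability 0.4, mutation probability 0.2, three genes per individual). This is only an illustration under several assumptions: `surrogate_biomass` is a hypothetical smooth function standing in for the trained SVM model, the bounds are the ranges covered in Table S1, and the real-valued encoding, tournament selection, arithmetic crossover, Gaussian mutation and elitism are choices made for this sketch, since the operators actually used are not described in this section.

```python
import random

# Assumed search bounds, taken from the ranges covered in Table S1:
# wastewater volume fraction (%), glucose (g/L), time (h)
BOUNDS = [(25.0, 50.0), (20.0, 50.0), (0.0, 240.0)]


def surrogate_biomass(x):
    """Hypothetical stand-in for the trained SVM model (not the paper's model).

    A smooth single-peak function, used only so that the sketch runs."""
    peak = (33.0, 46.0, 220.0)
    widths = (10.0, 15.0, 80.0)
    penalty = sum(((v - p) / w) ** 2 for v, p, w in zip(x, peak, widths))
    return 11.87 - penalty


def tournament(pop, fits, rng):
    """Return the fitter of two randomly chosen individuals."""
    a, b = rng.randrange(len(pop)), rng.randrange(len(pop))
    return pop[a] if fits[a] >= fits[b] else pop[b]


def run_ga(fitness, iterations=500, pop_size=50, p_cross=0.4, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    history = []  # best fitness per generation
    for _ in range(iterations):
        fits = [fitness(ind) for ind in pop]
        best = max(range(pop_size), key=fits.__getitem__)
        history.append(fits[best])
        new_pop = [pop[best][:]]  # elitism: keep the best individual unchanged
        while len(new_pop) < pop_size:
            child = tournament(pop, fits, rng)[:]
            if rng.random() < p_cross:
                # Arithmetic crossover with a second selected parent
                mate = tournament(pop, fits, rng)
                w = rng.random()
                child = [w * c + (1 - w) * m for c, m in zip(child, mate)]
            for i, (lo, hi) in enumerate(BOUNDS):
                if rng.random() < p_mut:
                    # Gaussian mutation, clipped to the bounds
                    step = rng.gauss(0.0, 0.05 * (hi - lo))
                    child[i] = min(hi, max(lo, child[i] + step))
            new_pop.append(child)
        pop = new_pop
    fits = [fitness(ind) for ind in pop]
    best = max(range(pop_size), key=fits.__getitem__)
    return pop[best], fits[best], history


best_x, best_fit, history = run_ga(surrogate_biomass)
print("best individual:", best_x)
print("best fitness:", best_fit)
```

With elitism the best fitness never decreases from one generation to the next, which gives a monotone curve of the kind plotted in Fig. 6. Replacing `surrogate_biomass` with a call to a fitted regression model would turn the sketch into a model-based search.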

The fermentation of lipid from cellulosic ethanol wastewater by R. glutinis is a complicated batch process that is severely nonlinear and time-varying. Traditional optimization methods are time-consuming and laborious. In this paper, computer models were established to optimize the fermentation conditions. The results demonstrate that the model established in this study has good generalization and prediction ability for the fermentation of microbial lipid from cellulosic ethanol wastewater. The best fermentation parameters were obtained from the model, and the model can be used to optimize more process parameters based on different data.

4. Conclusions

This study investigated the change of organic matter during lipid fermentation and established a corresponding fermentation model to optimize the fermentation parameters. The results demonstrated that the organic matter in cellulosic ethanol wastewater was indeed utilized by R. glutinis. The fermentation model provides important guidance for optimizing the parameters. With the development of big data and artificial intelligence technology, these models can be continuously enriched with experimental data by adding novel detection methods and targets. Furthermore, the approach can be applied to other fermentation processes.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2017YFB0306800) and the Overseas Expertise Introduction Project for Discipline Innovation (B13005). The authors gratefully acknowledge this support.

Fig. 1 Effects of initial COD on biomass (A), glucose consumption (B), lipid content and lipid yield (C), and COD removal rate (D)

Fig. 2 Effects of initial glucose concentration on biomass (A), glucose consumption (B), lipid content and lipid yield (C), and COD removal rate (D)

Fig. 3 Changes of the organic matter in cellulosic ethanol wastewater during the fermentation: A (Glucose, Xylose, Glycerin); B (Citric acid, Succinic acid); C (Furfural, Furfuryl alcohol, HMF); D (Lactic acid, Acetic acid)

Fig. 4 Contour map (A: Rough search, C: Fine search) and 3D view (B: Rough search, D: Fine search) of parameter optimization by CV

Fig. 5 The fitting results of the training data (A: SVM model, C: BP-ANN model) and the test data (B: SVM model, D: BP-ANN model)

Fig. 6 Curve of fitness

Table 1 Comparison between SVM and BP-ANN


Fig. 6 Curve of fitness (y-axis: Fitness; x-axis: Iterations)

Table 1 Comparison between SVM and BP-ANN
                 SVM model           BP-ANN model
                 MSE      R2         MSE      R2
Training data    0.0004   0.9959     0.0043   0.9899
Test data        0.0018   0.9862     0.0105   0.9785

Table S1. The detailed data of the models in this study

Number  Volume fraction of wastewater (%)  Glucose (g/L)  Time (h)  Biomass (g/L)

1 25 40 0 0.03
2 25 40 24 1.94
3 25 40 48 3.26
4 25 40 72 6.22
5 25 40 96 7.46
6 25 40 120 8.35
7 25 40 144 10.1
8 25 40 168 10.54
9 25 40 192 10.6
10 25 40 216 10.29
11 25 40 240 10.16
12 40 40 0 0.03
13 40 40 24 0.05
14 40 40 48 0.1
15 40 40 72 2.34
16 40 40 96 3.43
17 40 40 120 4.26
18 40 40 144 5.08
19 40 40 168 7.17
20 40 40 192 8.45
21 40 40 216 8.94
22 40 40 240 8.72
23 50 40 0 0.03
24 50 40 24 0.03
25 50 40 48 0.03
26 50 40 72 0.2
27 50 40 96 0.53
28 50 40 120 0.56
29 50 40 144 1.96
30 50 40 168 2.37
31 50 40 192 2.73
32 50 40 216 2.7
33 50 40 240 2.72
34 33.33 20 0 0.03
35 33.33 20 24 0.98
36 33.33 20 48 4.02
37 33.33 20 72 5.31
38 33.33 20 96 6.47
39 33.33 20 120 7.12
40 33.33 20 144 6.93
41 33.33 20 168 6.77
42 33.33 20 192 6.51
43 33.33 20 216 6.4
44 33.33 20 240 6.21
45 33.33 30 0 0.03
46 33.33 30 24 1.26
47 33.33 30 48 3.94
48 33.33 30 72 5.98
49 33.33 30 96 7.66
50 33.33 30 120 8.58
51 33.33 30 144 9.03
52 33.33 30 168 8.32
53 33.33 30 192 8.08
54 33.33 30 216 8.07
55 33.33 30 240 7.93
56 33.33 40 0 0.03
57 33.33 40 24 1.37
58 33.33 40 48 3.96
59 33.33 40 72 5.02
60 33.33 40 96 6.66
61 33.33 40 120 8.85
62 33.33 40 144 9.26
63 33.33 40 168 10.38
64 33.33 40 192 11.12
65 33.33 40 216 10.87
66 33.33 40 240 10.01
67 33.33 50 0 0.03
68 33.33 50 24 1.67
69 33.33 50 48 4.6
70 33.33 50 72 6.01
71 33.33 50 96 7.2
72 33.33 50 120 7.89
73 33.33 50 144 7.84
74 33.33 50 168 9.21
75 33.33 50 192 10.95
76 33.33 50 216 11.37
77 33.33 50 240 11.52

CRediT Author Statement

Lihe Zhang: Data curation; Methodology; Formal analysis; Investigation; Resources.

Bin Chao: Software.

Xu Zhang: Conceptualization; Funding acquisition; Supervision; Validation.

Declaration of interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

☐ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:
