72. Biotech Bioeng

Received: 16 May 2023 | Revised: 9 July 2023 | Accepted: 10 July 2023
DOI: 10.1002/bit.28503
REVIEW
Hybrid modeling in bioprocess dynamics: Structural variabilities, implementation strategies, and practical challenges
Biswanath Mahanty
Department of Biotechnology, Karunya Institute of Technology and Sciences, Coimbatore, Tamil Nadu, India

Correspondence
Biswanath Mahanty, Department of Biotechnology, Karunya Institute of Technology and Sciences, Coimbatore‐641114, Tamil Nadu, India.
Email: bmahanty@karunya.edu

Abstract
Hybrid modeling, with an appropriate blend of the mechanistic and data‐driven framework, is increasingly being adopted in bioprocess modeling, model‐based experimental design (digital‐twin), identification of critical process parameters, and optimization. However, the development of a hybrid model from experimental data is an inherently complex workflow, involving designed experiments, selection of the data‐driven process, identification of model parameters, and assessment of fitness and generalization capability. Depending on the complexity of the process system and purpose, each piece of these modules can flexibly be incorporated into the puzzle. However, this extra flexibility can make it difficult to trace an “optimal” model structure. In this paper, the development of hybrid models in a common bioprocess system, selection of data‐driven components and their mapping to states, choice of parameter identification techniques, and model quality assurance are revisited. The challenges associated with hybrid‐model development, and corrective actions, have also been reviewed. The review also suggests that the lack of data and code sharing in communal repositories can be a hurdle in the exploration and expansion of those tools in a bioprocess system.
KEYWORDS
ANN, black‐box, experimental design, generalization, sensitivity
1 | INTRODUCTION

Bioprocess models are widely adopted to understand process dynamics and to predict, in a reproducible manner, the temporal change of biomass, substrate, product, or other state variables under consideration. Incorporation of fundamental knowledge about the process, and of the interdependency of the variables, through mathematical formalization in the form of algebraic, ordinary differential, or partial differential equations is the core of the process modeling exercise (Luo et al., 2021). Although parametric expressions, e.g., Monod's kinetics for biomass growth, Luedeking‐Piret kinetics for product formation, or yield coefficient tagged substrate utilization, do not reveal any “mechanistic” insight, they are certainly convincing and logical (Tsopanoglou & Jiménez del Val, 2021). When embedded into the first‐principles architecture of material, charge, or energy balance from process‐level understanding of the system, such parametric “white‐box” models have great potential for extrapolation and offer a robust option in process engineering. Incorporation of charge balance through a differential‐algebraic equation in a bioprocess model is essential when carbonate chemistry, speciation of weak acids, or pH dynamics are mapped to the system (Kresnowati et al., 2008; Nayak et al., 2015).
However, model identification (i.e., selection of appropriate kinetics) requires a tedious trial‐and‐error approach, even in the presence of reasonable process knowledge (Moser et al., 2020). A single kinetic model may not be capable of simulating the entire
Biotechnol Bioeng. 2023;1–20. wileyonlinelibrary.com/journal/bit © 2023 Wiley Periodicals LLC. | 1

2 | MAHANTY
process, hence combining more than one kinetic model or shifting to a different kinetic model depending on growth phases may be warranted (Zhang et al., 2015). Fully mechanistic models are often over‐parametrized to capture complex dynamics and require an exorbitantly high volume of data to estimate parameters to any degree of confidence. Bioprocess models are often associated with partially nonidentifiable parameters due to the lack of sufficient quality experimental data. However, the removal of such nonidentifiable but physically interpretable terms makes the model oversimplified and poor in fitting and predictability, leaving no choice but the use of mildly overfitted models as a trade‐off between physical insight and mathematical accuracy (Lee et al., 2018). Ignoring structured‐segregated biomass populations in a highly dynamic process results in unsatisfactory prediction capabilities. In addition to generalization and overfitting problems, ensuring data adequacy for an accurate representation of inherent process variability (that is, change in feedstock, nonlinearity of processes, or altered product specification) is a unique deployment challenge in industrial settings (Fisher et al., 2020; Tsopanoglou & Jiménez del Val, 2021). Limited data in batch production, with frequent changes to manufacturing practices and implementation of new technologies, complicates the situation further.
Even if a perfect model could be formulated, the kinetic constants may not be time‐invariant during bioprocess operation. In contrast, data‐driven nonparametric models (referred to as black‐box models) are formulated exclusively from process data without the incorporation of knowledge or mechanistic obligations (Härdle et al., 2004). Though different data‐driven modeling subvariants, such as multiple linear regression (MLR), principal component analysis (PCA), or partial least squares (PLS) regression, exist to map the functional dependency of process rates on state variables, artificial neural network (ANN)‐based approaches are by far more widely adopted in the literature (del Rio‐Chanona et al., 2017a; Dong et al., 2014; Mowbray et al., 2023; Narayanan et al., 2019; Pinto et al., 2022). An ANN‐based black‐box model, adopting either shallow or deep networks, is a universal function approximator and can offer excellent training performance in supervised learning on a large raw data set. These nonparametric models also have parameters (e.g., weight, bias), but these are flexible in dimension and not fixed a priori. The data‐driven model captures the process dynamics without mechanistic preview but relies on the quantity and quality of process data (Sansana et al., 2021). Pure “black‐box” models completely disregard first‐principles information (e.g., mass balances) (Zorzetto et al., 2000), and even though they offer an excellent fit on trained data, their extrapolation or generalization capability for inputs beyond the training constellations is very poor (Bardow & Marquardt, 2004; Bhadriraju et al., 2020).
Between the two extreme paradigms with their pros and cons, a synergy referred to as gray‐box or hybrid models has emerged as an invaluable tool for process modeling (Rogers et al., 2023). Hybrid models still adopt a mechanistic first‐principles framework for the biological process, but unidentified parts of the dynamics (i.e., specific growth, product formation rates) are represented through a flexible data‐driven component, thereby improving robustness and extrapolation, and reducing overfitting (Solle et al., 2017). Data‐driven information is a valuable supplement (not an alternative) to mechanistic knowledge, increasing transparency and cost‐effectiveness of the modeling approach (Sansana et al., 2021). The constraints imposed by conservation equations ensure the robustness (Tsopanoglou & Jiménez del Val, 2021) and prediction accuracy of hybrid models in an identified valid domain (Bae et al., 2020). Automated identification of data‐driven architecture reduces the expertise required for the parametric representation of the mechanistic part (Lopes Dias et al., 2007; Luna et al., 2021). Increased process knowledge from hybrid modeling, when combined with the design of experiments (DoE), can reduce the experimental burden and accelerate process development (Bayer et al., 2020b).
A hybrid modeling framework, incorporating ANN‐based reaction kinetics in mass balance equations, has successfully been adopted in different bioprocess systems (Bangi et al., 2022; Bayer et al., 2021b; Oliveira, 2004; Vega‐Ramon et al., 2021). Parameters from existing kinetic models can also be reformulated as probabilistic Gaussian processes, reducing prediction errors from uncertainty propagation (Vega‐Ramon et al., 2021). A hybrid model‐based framework has been adopted to develop soft sensors to predict bioprocess trajectories and identify critical process parameters (CPP) relevant to specific critical quality attributes (Rathinavelu et al., 2023). Model predictive controllers (MPCs) use a mechanistic, data‐driven, or hybrid dynamic model to adapt real‐time process control using online measurements of CPP (i.e., temperature, dissolved oxygen, and pH) or spectroscopic soft sensor‐based measurements of additional variables (Brunner et al., 2020; Narayanan et al., 2020b). Optimal control actions are regularly re‐calculated by minimizing a global objective function based on the prediction of future model outputs over a limited time horizon and current process information (Caramihai & Severi, 2013). The data‐driven module can embed multivariate data analysis to identify CPP, without focusing on the mechanistic insight of the relationships (Sokolov et al., 2017). The hybrid modeling approach not only offers better time‐resolved process characterization in comparison to pure data‐driven or response surface methodology on process endpoints but also has the potential for soft‐sensing and MPCs (Bayer et al., 2020a). Mechanistic model prediction of total organic carbon (TOC) generation rate and base feed rate in Pichia pastoris culture has been integrated into an MLR‐based data‐driven approach to develop an adaptive soft sensor for biomass measurement (Brunner et al., 2020). The TOC change rate in the reactor, calculated by subtracting the online measured CO2 emission rate from the inline measured methanol reaction rate, is numerically integrated to get cumulative TOC accumulation in biomass. The cumulative TOC and base feed serve as predictors to model the measured biomass concentration using MLR.
Though the superiority of the hybrid‐modeling approach is well recognized, its expansion into bioprocess systems has not been very encouraging. A bibliometric search in the Scopus database for “hybrid model” AND (“bioprocess” OR “fermentation”) in TITLE‐ABS‐KEY returns just 150 documents, published over the past three decades
(Figure S1). The association strength of keywords across those articles suggests contextual relevancy of “neural network,” “optimization,” “mathematical model,” “bioreactors,” and “kinetic parameters” with hybrid modeling. However, keywords relevant to the modeling approach and model quality (i.e., training, parameter estimation, model identification, cross‐validation, and bootstrapping) are nearly nonexistent.
In recent times, a couple of reviews on the advantages and disadvantages of hybrid models (Tsopanoglou & Jiménez del Val, 2021), the use of machine learning algorithms in bioprocess optimization and process monitoring (Mondal et al., 2023), and recent trends in hybrid modeling considering big data and Industry 4.0 (Sansana et al., 2021) have been published. The utility of statistical and hybrid modeling approaches has been reviewed considering upstream pharmaceutical bioprocesses (Tsopanoglou & Jiménez del Val, 2021). The structural arrangement, information integration strategy in hybrid modeling, and their application in chemical bioprocess have also been reviewed in the recent past (von Stosch et al., 2014). However, the practical challenges associated with the arrangement of black‐box and white‐box model components, selection of black‐box architecture, training, and assembling the entire puzzle into an optimization framework from a researcher's perspective have not been well documented.
In this review, following this introduction, important machine learning tools are presented in Section 2. In Section 3, the structural arrangement of hybrid models and their difference from mechanistic or data‐driven counterparts are clarified. The formalization of the black‐box model part and different approaches for training are elaborated in Section 4. Assessment of generalization capability through designed experiments, cross‐validation, and bootstrapping is presented in Section 5. Finally, the application of the hybrid model, challenges, and future directions are discussed.

2 | MACHINE LEARNING TOOLS IN HYBRID MODELING

Bioprocess involves a complex dynamic interaction of different biochemical species. High time scale variability in bioprocess makes it difficult to determine and implement effective control unless the process can be reliably modeled by incorporating CPPs (Rathore et al., 2021). Mechanistic models are inefficient with nonlinear problems, rely on an accurate understanding of the mechanism, and are not suitable for optimal control strategy (del Rio‐Chanona et al., 2017b). On the other hand, hybrid modeling of bioprocess incorporates a data‐driven approach to map the specific rates in mechanistic models, without advocating any explicit functional relationship with process variables. Artificial intelligence has been introduced as a facilitating approach to build a data‐driven model by learning the relationship between input and output using a training data set, creating a rule on the trained structure for future predictions (Cheng et al., 2023).

2.1 | ANN‐based models

ANNs, consisting of an input layer, one or more interconnected hidden layers, and an output layer for information processing, are widely adopted tools good at mapping complex nonlinear problems. Each neuron in the hidden or output layer sums the weighted inputs from all neurons in the previous layer and adds an adjustable bias. Subsequently, each neuron processes this sum through an activation function (linear, hyperbolic tangent, logistic function, or a rectified linear unit) before passing it over to neurons in the next layer. The weights and biases of the ANN model are estimated through iterative training with experimental data. However, the selection of the number of neurons can be critical, as an increase in the number typically results in a better fit but with deteriorated generalization capability (Alrashed et al., 2018).
ANN has evolved from a single hidden layer to multiple fully connected hidden layers of deep neural network (DNN) architecture requiring a smaller number of neurons for specific functions (Schmidhuber, 2015). Hybrid models with a DNN have been utilized in different process operations (Bangi & Kwon, 2020) and industrial‐scale bioreactor system modeling (Shah et al., 2022). Though DNNs can offer excellent model fitting to observations, their predictions can be inconsistent or implausible due to extrapolation or observational biases. The incorporation of physical laws correlating predictors and responses during ANN training is referred to as physics‐informed neural networks (PINNs) (Karniadakis et al., 2021). PINNs add the unsupervised loss from partial differential equations (PDEs), obtained using automatic differentiation of the ANN, to the supervised loss of data measurements in the training process (Figure 1). A PINN is essentially a data‐driven model, albeit with better predictability, and should not be confused with a hybrid model built on a mechanistic framework. Biologically‐informed neural networks (BINNs), as an extension of PINNs, have been used to approximate in vitro experiments while respecting governing reaction‐diffusion partial differential equations (Lagergren et al., 2020). The authors used a BINN to model cell density as a function of space and time. The network output is further used to approximate diffusivity and growth functions embedded in the governing dynamic system.
Long short‐term memory (LSTM)‐ANN models are another variant of machine learning (ML) algorithm useful for multiscale models with time‐varying and time‐invariant inputs. LSTMs are known for learning temporal dependencies in sequence data through the use of input, output, and forget gates (Jeon & Kim, 2021). These networks take the current input value, previous output, and the previous unit state as inputs to generate the current output and unit state (Schwedersky et al., 2019).

2.2 | Support vector machine (SVM)

SVM, as a supervised statistical ML algorithm, maps the input into a multi‐dimensional feature space based on a nonlinear mapping kernel function (Raghavendra & Deka, 2014). The optimal hyperplane that
FIGURE 1 A typical physics‐informed neural network (PINN). Network inputs (x, t) are processed through multiple hidden‐layer neurons, with weights, biases, and activation (σ), to give the prediction (u). The loss from the partial differentials ∂u/∂t, ∂u/∂x, ∂²u/∂x², calculated through automatic differentiation of the network, is added to the loss from the prediction (u). The combined loss is minimized below a threshold (ε) during the training process.
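The combined loss in Figure 1 can be written out for a generic one‐dimensional reaction‐diffusion residual. This is only an illustrative sketch: the residual form r(x, t), the diffusivity D, the source term f, the weighting factor λ, and the sample counts N_d and N_r are assumptions introduced here and are not taken from the cited works.

```latex
\begin{align}
r(x,t) &= \frac{\partial u}{\partial t} - D\,\frac{\partial^{2} u}{\partial x^{2}} - f(u), \\
\mathcal{L} &= \underbrace{\frac{1}{N_d}\sum_{i=1}^{N_d}\bigl(u(x_i,t_i)-u_i^{\mathrm{obs}}\bigr)^{2}}_{\text{supervised data loss}}
 \;+\; \lambda\,\underbrace{\frac{1}{N_r}\sum_{j=1}^{N_r} r(x_j,t_j)^{2}}_{\text{unsupervised PDE loss}},
 \qquad \text{train until } \mathcal{L} < \varepsilon .
\end{align}
```

The derivatives inside r(x, t) are obtained by automatic differentiation of the network output u with respect to its inputs, so the PDE term needs no labeled measurements.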
maximizes the margin between the nearest data points and the hyperplane is located, and the most “important” training points defining the hyperplane are the support vectors. Linear, polynomial, radial basis, and sigmoid kernels in SVM allow modeling of complicated separating hyperplanes. The SVM prediction is the kernel‐processed input multiplied by a weight vector of the same dimension as the feature space, plus a bias (Wang et al., 2010). Unlike ANN, SVM uses structural risk minimization and is less likely to be overfitted on a small data set. Prediction of SVM heavily depends on kernel function selection (Andrade Cruz et al., 2022), and the computational burden increases as the kernel matrix grows with data size (Cervantes et al., 2020). A support vector regression (SVR) model with radial basis kernel functions was adopted as the data‐driven component (cumulative oxygen uptake and carbon dioxide production rates) as nonlinear functions of the biomass concentration in hybrid modeling of a recombinant therapeutic protein production process in Escherichia coli (Simutis & Lübbert, 2017).

2.3 | Fuzzy logic (FL)

Fuzzy logic incorporates the idea that not all biological information (e.g., a transition from lag phase to exponential and stationary phase) can be identified precisely, and operates similarly to human thinking, defined by membership functions with different probabilities for each value and “If–then” rules (Patnaik, 2006). FL‐based algorithms are robust and perform well on small sample datasets. Neuro‐fuzzy estimators have been used in a penicillin fermentation pilot plant and for the adaptation of bacteria to an inhibitor in acetic acid fermentation (Araúzo‐Bravo et al., 2004; Arnold et al., 2002). FL models have been adopted to predict metabolic flow in Chlorella sp. utilizing a synthetic flue gas mixture, correlating to growth performance (Bhola et al., 2017), and to determine nutrient composition for phycobiliprotein production from cyanobacteria (Kumar Saini et al., 2020).

2.4 | Latent variable methods

Latent variable (LV) approaches project high‐dimensional correlated predictors into a few uncorrelated orthogonal LVs, as linear combinations of the original variables, while preserving the variability in the original predictor and response spaces (Liu et al., 2022a). Principal component regression (PCR) and PLS regression are the most popular. In PCR, principal component scores extracted from the covariance matrix (arranged based on the original variability they explain) are associated with the response adopting an ordinary least squares approach (Artigue & Smith, 2019). PLS, on the other hand, finds the linear combinations of predictors (latent vectors) through the simultaneous decomposition of predictors and response so as to maximize the covariance between them (Mehmood et al., 2012). PLS‐based MPC has been proposed to account for variability in yield as a result of random variations in raw material properties and in‐batch process fluctuations (Duran‐Villalobos et al., 2020). A data‐driven soft sensor has been utilized to control glucose concentration in mammalian cell cultures based on online cumulative oxygen uptake rates (DOc) (Goldrick et al., 2018). The DOc values are calculated from agitator speed, online dissolved oxygen (DO) measurements, saturation DO concentration, and inlet gas flow rate. On the other hand, cumulative glucose consumption rates (Glc) are calculated from glucose feed and off‐line measured glucose concentrations. A least squares regression (R² > 0.99) between DOc and Glc serves as a data‐driven component to predict glucose measurement and control.

3 | HYBRID MODEL STRUCTURAL FRAMEWORK

3.1 | General model structure

A dynamic model provides the temporal evolution of key biological or chemical species (e.g., biomass, substrate, product, off‐gas
composition, and pH) in a complex interacting environment. The mathematical representation of the model depends on the availability of data, inherent complexity, and our understanding of the process. Models in which the dynamics of substrate, biomass, or product are governed by fundamental laws of physics, for example, material balance, energy balance, and chemical equilibrium, are termed first‐principles models, as shown in Figure 2a. The material balance of substrate accounts for the portions utilized toward biomass growth, product formation, and cellular maintenance, mapped through the respective yield (YX/S, YP/S) or maintenance coefficients (m). In the mechanistic model, ordinary differential equations (ODEs) representing biomass growth (dX/dt) or product formation (dP/dt) rates are explicit functions of biomass (X), substrate (S), product (P), and time‐invariant parameters (θ, ϕ). In contrast, the data‐driven model (Figure 2b) represents growth rate µ(.), substrate consumption rate g(.), and product formation rate h(.) as implicit functions (e.g., ANN) of state variables. There are no material balance constraints in the data‐driven modeling approach. The hybrid model respects the material balance, but some of the poorly understood rates or terms are modeled through data‐driven functions (Figure 2c).
Though ODEs are part of dynamic models irrespective of the white‐box or hybrid architecture, process informative state vectors, or those having a significant influence on the same and that can be measured in the bioprocess, are generally included (Simutis & Lübbert, 2017). Some of the complex mechanistic insight can however be dropped or simplified from hybrid model ODEs (Dong et al., 2014).

3.2 | Hybrid modeling: Serial or parallel

The black‐box and white‐box modules can be arranged in three different ways, that is, parallel to each other or in one of the two serial structures depending on their order (von Stosch et al., 2014), as shown in Figure 3. In the parallel structure, the black‐box model can map the residuals between experimental and white‐box model predictions so that mechanistic model outputs can be corrected for future prediction (Bhutani et al., 2006), or account for the unknown dynamics in numerical integration (Zhang et al., 2020). Very similar frameworks, with some differences in objective and ML algorithm, have been adopted in different studies (see Table 1). Black‐box model prediction from real‐time measurements can be used to reconcile the kinetic model output (Cabaneros Lopez et al., 2021). The parallel structure hybrid model offers an exceptionally good fit without consideration of specific reasons for mismatch. It only uses the deviation between the actual process and mechanistic model output, referred to as the residual model (RM), to describe the dynamics that mechanistic models cannot describe. A semi‐supervised hybrid modeling approach has been adopted in continuous ethanol production with Saccharomyces cerevisiae (Zhao et al., 2023). The RM was trained using a feed‐forward neural network built on selected bioreactor inputs and the output of the kinetic model. PLS regression has been used to predict the real‐time concentration of glucose, xylose, and ethanol from attenuated total reflectance spectra in lignocellulosic ethanol fermentation (Cabaneros Lopez et al., 2021). The reconciled predictions of data‐driven and kinetic models, with

FIGURE 2 General structure of (a) mechanistic first‐principles model, (b) data‐driven model, and (c) hybrid model involving biomass (X) growth, substrate (S) consumption, and product (P) formation. In the mechanistic model, each rate is explicitly represented in a formal algebraic expression (e.g., Monod's growth model), and material balance is incorporated in the substrate consumption rate through the biomass yield (YX/S), product yield (YP/S), or maintenance (m) coefficient. In a data‐driven model, all specific rates (µ, g, h) are represented as black‐box functions (e.g., ANN) of state variables. The hybrid model respects the material balance, but some of the poorly understood rates or terms are modeled through a data‐driven black‐box function. Either a function (e.g., μ), term (e.g., μX), parameter (i.e., α, β), or even an entire rate (dX/dt, dP/dt) can be mapped using the black‐box approach. The rate equation of a state may entirely be omitted in a hybrid modeling exercise.

FIGURE 3 General scheme for mechanistic and serial/parallel hybrid models in bioprocess. In the mechanistic model, the initial value of the state vector ζ ∈ [X, S]ᵀ along with parameters θ ∈ [μm, KS, kd] is used for the prediction ζ̂ ∈ [X̂, Ŝ]ᵀ. In a hybrid serial structure, the data‐driven component of the model (μ) is computed from the state either before or during the integration of the ordinary differential equations (ODEs). In a parallel hybrid structure, the residuals (EX) from the mechanistic model are mapped as a black‐box function and added back to the mechanistic output for the final prediction.
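The serial and parallel arrangements in Figure 3 can be illustrated with a minimal, self‐contained sketch. Everything specific here is an assumption made for illustration: the yield `Y_XS`, the Monod constants, the `black_box_mu` stand‐in (a fixed tanh curve with hypothetical weights instead of a trained ANN), the explicit Euler integrator, and the synthetic "measurements" used for the residual model. It is not the implementation of any cited study.

```python
import math

Y_XS = 0.5  # biomass yield on substrate (assumed value)

def monod_mu(s, mu_max=0.4, k_s=0.5):
    """Mechanistic (white-box) specific growth rate, Monod kinetics."""
    return mu_max * s / (k_s + s)

def black_box_mu(s, w=(0.38, 0.9)):
    """Stand-in for a trained data-driven rate model (e.g., an ANN).
    The weights are hypothetical; a real model would be fitted to data."""
    return w[0] * math.tanh(w[1] * s)

def simulate(mu_fn, x0=0.1, s0=10.0, dt=0.01, t_end=20.0):
    """Serial structure: the rate model is evaluated inside the ODE
    integration (explicit Euler); the mass balance stays mechanistic."""
    x, s, out = x0, s0, [(0.0, x0, s0)]
    for k in range(1, int(round(t_end / dt)) + 1):
        growth = mu_fn(s) * x * dt              # dX = mu * X * dt
        x, s = x + growth, s - growth / Y_XS    # dS = -(mu / Y_XS) * X * dt
        out.append((k * dt, x, s))
    return out

def fit_line(ts, ys):
    """Ordinary least squares for y = a + b * t (a very simple residual model)."""
    n = len(ts)
    t_bar, y_bar = sum(ts) / n, sum(ys) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    sxx = sum((t - t_bar) ** 2 for t in ts)
    b = sxy / sxx
    return y_bar - b * t_bar, b

mechanistic = simulate(monod_mu)        # white-box model
serial_hybrid = simulate(black_box_mu)  # black-box rate in mechanistic balance

# Parallel structure: map residuals (measured - mechanistic) and add them back.
ts = [t for t, _, _ in mechanistic[::200]]
x_mech = [x for _, x, _ in mechanistic[::200]]
x_meas = [x + 0.05 + 0.01 * t for t, x in zip(ts, x_mech)]  # synthetic "data"
a, b = fit_line(ts, [m - p for m, p in zip(x_meas, x_mech)])
x_parallel = [p + a + b * t for t, p in zip(ts, x_mech)]
```

In the serial case the quantity X + Y_XS·S is conserved regardless of which rate function is plugged in, which is the practical meaning of "respecting the material balance". In the parallel case the correction is purely empirical and carries no such guarantee.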
TABLE 1 A representative list of data‐driven hybrid models and their structure.
Data‐driven model | Hybrid model structure | Bioprocess | References
ANN | Serial/parallel | Industrial hydrocracking | Bhutani et al. (2006)
Polynomial regression | Parallel | Algal biomass growth and lutein synthesis | Zhang et al. (2020)
PLS regression | Parallel | Cellulose‐to‐ethanol fermentations | Cabaneros Lopez et al. (2021)
Feed‐forward ANN | Parallel | Continuous yeast fermentation for ethanol production | Zhao et al. (2023)
PLS | Parallel | Mammalian cell culture | Narayanan et al. (2020a)
PLS | Parallel | Lignocellulosic fermentation | Lopez et al. (2020)
DNN | Parallel | β‐carotene production using Saccharomyces cerevisiae | Bangi et al. (2022)
ANN | Serial | Viral capsid protein production by Escherichia coli | von Stosch et al. (2012)
ANN | Serial | Sodium gluconate fermentation by Aspergillus niger | Dong et al. (2014)
Reinforcement learning | Serial | In silico fermentation with biomass, substrate, and product | Mowbray et al. (2023)
ANN | Serial | Fed‐batch fermentation | Zuo & Wu (2000)
ANN | Serial | Cell culture | Narayanan et al. (2019)
Abbreviations: ANN, artificial neural network; DNN, deep neural network; PLS, partial least squares.
continuous‐discrete extended Kalman filter, have been resilient to measurement disturbances. A PLS‐based hybrid model with an extended Kalman filter has also been adopted for real‐time monitoring and control of mammalian cell culture (Narayanan et al., 2020a). A soft‐sensor hybrid modeling approach has been adopted to monitor the progress of batch cellulose‐to‐ethanol fermentation, where a spectroscopic data‐driven PLS module measures real‐time glucose concentration that is recursively fitted to update mechanistic model parameters and long‐horizon predictions (Lopez et al., 2020). The supervised learning‐based parallel‐hybrid approach only considers the correlation between residuals and process inputs but does not account for oversimplification of the mechanistic model or model parameter inaccuracies. Studies have suggested that the use of ANNs in a hybrid dynamic model can impede numerical accuracy in parameter estimation, particularly if multiple ANNs are involved (Zhang et al., 2020). Hybrid models can also incorporate neural ordinary differential equations, that is, derivatives of unknown dynamics along with kinetic model ODEs, to increase the overall prediction accuracy (Bangi et al., 2022). A neural ODE assumes continuous variation of the hidden state and is parameterized using an ODE specified by a neural network, and the output layer can be computed using an ODE solver. A detailed theoretical treatment and implementation strategy of a neural ODE‐based system can be found elsewhere (Bangi et al., 2022; Chen et al., 2018).
The black box‐white box serial combination is most popular in bioprocess modeling, where the former usually represents the “imprecise” part of the underlying kinetics in the latter, derived from first principles, that is, material or energy balances for the process (von Stosch et al., 2012). The black‐box module is used to map the kinetic part of the model (e.g., specific growth rate) taking current state measurements and/or reaction conditions as input (Figure 3). The black‐box part can independently be trained if the kinetic terms (e.g., specific rates) can precisely be measured/calculated from experimental data, before integrating it into the mechanistic model framework. However, parameters from both the black‐box (e.g., weights of a feed‐forward ANN, or coefficients of multiple linear regression) and the mechanistic model are jointly estimated while minimizing residuals between measured and modeled states.
A serial hybrid structure with two independent ANNs, one to map mycelium growth rate and the other to map growth‐associated and nongrowth‐associated parameters relevant to substrate consumption and product formation kinetics, has been adopted to model sodium gluconate fermentation by Aspergillus niger (Dong et al., 2014). Reinforcement learning‐based identification of hybrid model structure has been adopted to model in silico experiment data to quantify time‐varying and history‐dependent kinetic behavior for typical fermentation processes involving biomass, substrate, and product (Mowbray et al., 2023). Production of thuringiensin using Bacillus thuringiensis in a fed‐batch system has been modeled using a hybrid unstructured model, where specific growth, substrate consumption, and production rates were mapped to state variables using ANN (Zuo & Wu, 2000). Similarly, a feed‐forward single hidden layer ANN was adopted to project specific rates for biochemical species while modeling the fed‐batch cell culture process data set using a serial hybrid model (Narayanan et al., 2019). The authors reported robust prediction of titer when compared to statistical predictive models.

4 | HYBRID MODELING STRATEGIES

4.1 | Selection of model terms into a black box

Dynamic bioprocess models involving structured kinetics are typically represented by a set of coupled differential equations as a function of the current state and time‐invariant model parameters. The kinetic
expressions are often improvised, before conversion to a hybrid black‐box function, encapsulating the growth rate itself into a neural
model, based on specific knowledge about the bioprocess. In serial network is also evident (Dong et al., 2014). Rate expression for all the
architecture, either the existing rate terms (i.e., specific growth rate, states, as in structured models, need not be replicated in hybrid
product formation rates) can be substituted with black‐box, or modeling. For example, fungal cell mass classification (i.e., primary
corrective terms can be added to the state dynamics (Lee mycelia, secondary mycelia, and spores) in structured kinetics for
et al., 2020b). Trichoderma reesei RL‐P37 in cellulase production was re‐assigned
The subset of kinetic parameters or rate expression to be into single biomass while adopting a hybrid modeling strategy
mapped as a black‐box function is not straightforward for multi- (Tholudur et al., 1999). The amount of “greyness” to be incorporated
parametric, multi‐state industrial scale fermentation. Ideally, key in a hybrid model without inductive bias can be important for
sensitive and uncertain model parameters, having considerable performance.
influence on prediction, are identified through local and global
sensitivity analysis and wrapped into a black box framework (Shah
et al., 2022). Time‐varying parameters identified through the 4.2 | Sensitivity analysis
clustering approach from the short‐listed subset can further be
considered for the hybrid model (Shah et al., 2022). In such a Restructuring the kinetic model into a hybrid model involves
situation, global sensitivity analysis is performed to identify the most assigning a few existing model parameters into flexible, time‐
important parameters, and a clustering technique is adopted to locate varying data‐driven functions that are appropriate to map changing
temporal subdomains where the parameter differs (Lee bioprocess environments (Duan et al., 2020). Critical parameters
et al., 2019, 2020a). As the functional relation of biochemical having a significant influence on outputs need to be identified and
reaction rates with biomass, and substrate/product concentrations reformulated into a data‐driven function for a hybrid model (Chu &
are not known in sufficient detail, such terms are often defined as a Hahn, 2007). Identification of such critical parameters is performed
flexible mixture of parametric (i.e., specific substrate utilization rate) through sensitivity analysis, where insensitive parameters can be
and nonparametric functions (i.e., specific growth and product dropped from optimization and fixed at their baseline values (O'Brien
formation rates) (Pinto et al., 2019). There is no consciousness across et al., 2021).
literature on the selection of model terms to be formulated as black Local sensitivity analysis involves the assessment of parametric
box components. Different combinations of parametric and non- perturbation (around the nominal values) on different model outputs.
parametric variants may be considered, as schematically shown for an A sensitivity matrix is first derived and the Fisher information matrix
empirical model (Figure 2c). is calculated to assess the effect of parameters, one at a time, on the
However, the reason for selecting or not selecting one/more outputs (Shah et al., 2022). Local sensitivity analysis of model
rates, functions, or parameters in the black box model are not always structure has been performed with a finite‐difference approach in the
obvious (Table 2). It is also necessary to identify whose dynamics biopharmaceutical process (Möller et al., 2020). The sensitivity of the
should be corrected by the black‐box intervention, by examining the process outputs, on a wide range of parameter settings, is reflected in
correlation between each state and outputs through relative order. the global sensitivity analysis (GSA). The range of parameters for GSA
For example, specific growth, and product formation rates in the fed‐ analysis is first determined from repeated parameter estimation, each
batch bioreactor were modeled using a feedforward neural network, time using slightly different datasets. For GSA, lain hypercube
while specific substrate consumption rates maintained original kinetic sampling of a parametric range is considered, where sensitivity
expression relating to biomass yield coefficient and maintenance analysis values are averaged at the end. Other approaches for GSA,
coefficient (Pinto et al., 2019). In the referred study, biomass (X), the such as Sobol's sensitivity indices can be used with a surrogate model
feeding rate (F), and cultivation temperature (T) were used as input minimizing the repeated solution of the hybrid model (Abbiati
for the ANN models. et al., 2021). Parameter significance can be measured based on
Unlike their mechanistic counterparts, a hybrid model only sensitivity indices ranging from 0 to 1 (Tsipa et al., 2018).
requires capturing a data‐driven approximation of standard kinetics
to reduce the nonlinearity of the problem. Rogers et al. (2023)
adopted a hybrid modeling approach in γ‐linolenic acid (GLA) 4.3 | Mapping the black‐box component and
fermentation via Cunninghamella echinulata X‐15 in bioreactors under architecture
three different temperatures. The authors modified the original
kinetic expression and referred to specific growth rate as an ANN 4.3.1 | ANN‐based hybrid models
function of temperature, biomass, and substrate concentration.
A hybrid modeling strategy can be formulated, even without any Though there is a range of data‐driven modeling approaches to map
prior knowledge of the functional relationship of specific rates. This is the black‐box part of a hybrid model, ANNs are most widely adopted
more common than the exception as evident in the literature (Pinto for nonlinear, complex systems to predict multiple responses
et al., 2022; von Stosch et al., 2016a; Zuo & Wu, 2000). Though, simultaneously, whilst taking multiple independent predictors as
specific rates (growth, product formation) are generally modeled into input (Noll & Henkel, 2020). Moreover, ANN can be adapted to have
TABLE 2 Model terms considered for the black box approach and their justification.

System description | Number of state variable | Time invariant parameters in original kinetics | Time invariant parameters after the hybrid model | Justification | References
Production of viral capsid protein by a recombinant Escherichia coli strain in a fed‐batch bioreactor | 3a | 12 | 0 | Specific rates of growth and product formation are difficult to establish | Pinto et al. (2019)
Biomass‐specific productivity rates, base addition correlation coefficient with biomass | 3 | 0b | 0 | Rates and correlation coefficients depend on process conditions, not directly measurable. | von Stosch et al. (2016a)
Industrial fermentation for biomass growth on two substrates, and intermediate, product formation | 6 | 32 | 29 | Three time‐varying parameters using the clustering approach were selected to avoid overfitting | Shah et al. (2022)
γ‐linolenic acid (GLA) fermentation via Cunninghamella echinulata X‐15 ‐batch experiment | 4 | 24 | 0, 5, 11c | Parameters directly related to biomass growth, substrate consumption, and production were replaced | Rogers et al. (2023)
Growth and protein production by Trichoderma reesei on lactose, and xylose | 6 | 14d | 0 | Four specific rate terms are included in artificial neural network (ANN) modeling | Tholudur et al. (1999)
Sodium gluconate fermentation by Aspergillus niger lack of knowledge of mycelium growth | 3 | 6 | 0 | Growth rate modeled through back propagation neural network (BPNN), kinetic parameters for substrate and product through ANN | Dong et al. (2014)
Fermentation system to cultivate Bacillus thuringiensis for thuringiensin production | 3 | 0b | 0 | Specific growth, substrate consumption, and production rates | Zuo and Wu (2000)
Recombinant E. coli fed‐batch process | 7 | 0b | 0b | Specific growth, substrate consumption, and production rates, and recovery rate constant | Pinto et al. (2022)
Fed‐batch cell culture therapeutic protein production. | 8e | 0b | 0b | Lack of knowledge of specific rates and metabolism. Included viable cells, concentration of glucose, lactate, glutamine, glutamate, ammonia, osmolality, and titer. | Narayanan et al. (2019)
Fed‐batch data on interleukin‐8 antibody production by Chinese hamster ovary (CHO) DP‐12 cells. | 7 | 12 | 0 | Specific rate of growth (total, viable cell), four substrate consumption, product formation modeled | Bayer et al. (2023)
Simulated batch production of monoclonal antibodies | 6 | 6 | 0 | Specific rate of glucose, glutamine, lactate, ammonia, and mAbs titer modeled from simulated experiments | Botton et al. (2022)
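The local sensitivity analysis described in Section 4.2 (finite‐difference perturbation around nominal values, followed by a Fisher information matrix) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration only: the Monod batch model, the nominal parameter values, the sampling times, the 1% perturbation, and the unit measurement covariance used to form the information matrix. None of it is taken from the cited studies.

```python
# Hypothetical Monod batch model: dX/dt = mu*X, dS/dt = -mu*X/Yxs,
# with mu = mu_max*S/(Ks + S). Only biomass X is treated as an output.

def simulate_biomass(theta, sample_times, dt=0.01, x0=0.1, s0=10.0):
    """Forward-Euler integration; returns biomass at the sampling times."""
    mu_max, ks, yxs = theta
    x, s, t = x0, s0, 0.0
    outputs = []
    for t_sample in sample_times:
        for _ in range(int(round((t_sample - t) / dt))):
            growth = mu_max * s / (ks + s) * x
            x += dt * growth
            s = max(s - dt * growth / yxs, 0.0)
        t = t_sample
        outputs.append(x)
    return outputs

def relative_sensitivities(theta, sample_times, rel_step=0.01):
    """Central finite differences of d(ln X)/d(ln theta_p); rows = time points."""
    base = simulate_biomass(theta, sample_times)
    columns = []
    for p, value in enumerate(theta):
        up, down = list(theta), list(theta)
        up[p] = value * (1.0 + rel_step)
        down[p] = value * (1.0 - rel_step)
        y_up = simulate_biomass(up, sample_times)
        y_down = simulate_biomass(down, sample_times)
        columns.append([(yu - yd) / (2.0 * rel_step * yb)
                        for yu, yd, yb in zip(y_up, y_down, base)])
    return [list(row) for row in zip(*columns)]

def fisher_information(sens):
    """FIM = S^T S, assuming unit measurement covariance (an assumption)."""
    n_par = len(sens[0])
    return [[sum(row[a] * row[b] for row in sens) for b in range(n_par)]
            for a in range(n_par)]

theta_nominal = [0.5, 0.5, 0.5]   # mu_max (1/h), Ks (g/L), Yxs (g/g), all assumed
times = [2.0, 4.0, 6.0]           # sampling times in the exponential phase
S = relative_sensitivities(theta_nominal, times)
FIM = fisher_information(S)
# Rank parameters by the diagonal of the FIM (largest = most influential).
ranking = sorted(range(3), key=lambda p: FIM[p][p], reverse=True)
```

Under these assumptions the maximum specific growth rate dominates the ranking, which is the kind of parameter that would be a candidate for a data‐driven representation; insensitive parameters would be fixed at baseline values.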
a different number of fully or partially connected hidden layers, a variety of transfer functions (Apicella et al., 2021), and learning algorithms which can be optimized for the complexity of a given problem (Karimi Alavijeh et al., 2022). In ANN‐based hybrid models, however, a time‐intensive optimization approach must be integrated to refine hyperparameters and network topology (Lancashire et al., 2008).

Identification of the most likely combination of kinetic model structures with their time‐varying parameter through reinforcement learning (RL) has been proposed in hybrid modeling for typical in silico fermentation measuring biomass, substrate, and product concentration (Mowbray et al., 2023). ANN has been adopted as parameterized control policy in RL, mapping optimal parameters from input state variables, through maximization of the expected sum of rewards over the time domain. A simple feed‐forward ANN can be effective to model specific rates of multiple bioprocess variables in a hybrid modeling approach (Narayanan et al., 2019). A two‐norm regularized objective function and cross‐validation can be adopted to avoid overfitting and identify the optimal number of neurons and regularization parameters. Similarly, a single hidden layer ANN (with hyperbolic tangent and linear transfer functions) was chosen to model specific rates of biomass, viable cell, substrate and product formation in fed‐batch production of interleukin‐8 antibody by Chinese hamster ovary (CHO) DP‐12 cell (Bayer et al., 2023). Adopting a serial hybrid structure, the ANN models the specific rates utilizing four operating parameters and predictions of the previous time step of each response variable as input. Bayer et al. (2021b) adopted an ANN‐based hybrid model for fed‐batch cultivation of CHO cells. Standardized inputs (z‐score) of cultivation temperature, glucose, glutamine, asparagine, alanine, and aspartate to glutamate ratio were used to model specific growth and product formation rates. The authors adopted in‐depth knowledge, correlation analysis, and PCA to select appropriate inputs for the ANN model.

The rate of production/consumption of culture variables (i.e., biomass, substrate, product) can be modeled as a product of maximum specific rates, stoichiometric coefficients (reaction rates on the cell concentration), and specific production/consumption rates estimated by the ANN from reduced concentration vector as input (Botton et al., 2022). A single hidden layer containing 10 neurons, hyperbolic‐tangent activation function, linear output transfer function, and minimization of the normalized sum of squared errors with regularization term in training convergence was adopted. Separate ANN submodels for each of the data‐driven model terms/parameters (i.e., specific rates), with independent input space mapping, can also be formulated in a hybrid model (Yang et al., 2016). Though such a strategy improves the flexibility of the network design, the number of hyperparameters to be identified increases proportionally.

4.3.2 | Non‐ANN hybrid models

Due to the perceived risk of overfitting from the ANN‐integrated kinetic model, polynomial regression has been elected as a data‐driven component in the hybrid modeling exercise (Zhang et al., 2020).

TABLE 2 (Continued)
System description: PHA production by Pseudomonas putida GPo1 using ammonia and octanoate; Astaxanthin production by Xanthophyllomyces dendrorhous in batch culture – glucose &/sucrose as substrate
Number of state variable: f, 4, 3
Time invariant parameters in original kinetics: 7, 10
Time invariant parameters after the hybrid model: 0
Justification: Specific rates of growth, substrates, and product formation mapped using a single ANN; Specific growth rate, and two yield coefficients (biomass/substrate, product/biomass) mapped as function biomass, substrate
References: Luna et al. (2021); Vega‐Ramon et al. (2021)
a Out of three process variables, substrate profile is dropped from hybrid model.
b Parametric white box models not described.
c The different hybrid model considered.
d In a structured kinetic model.
e Dynamically changing, noncontrolled process variables.
f For single sugar utilization.
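The single‐hidden‐layer topology described in Section 4.3.1 (hyperbolic‐tangent hidden units, linear output, and a normalized sum of squared errors with a regularization term) can be sketched in a few lines. This is an illustrative sketch only: the three inputs, two outputs, random initialization, and the choice of a squared two‐norm penalty on the weights are assumptions, not the configuration of any cited study.

```python
import math
import random

def init_network(n_in, n_hidden, n_out, seed=0):
    """Random small weights, zero biases (hypothetical initialization)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)],
        "b2": [0.0] * n_out,
    }

def specific_rates(net, inputs):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(net["w2"], net["b2"])]

def regularized_objective(net, measured, predicted, sigma, lam):
    """Normalized SSE over n states and m time points plus lam * ||weights||^2."""
    n, m = len(measured), len(measured[0])
    nsse = sum(((measured[i][j] - predicted[i][j]) / sigma[i]) ** 2
               for i in range(n) for j in range(m)) / (n * m)
    penalty = sum(w * w for layer in ("w1", "w2") for row in net[layer] for w in row)
    return nsse + lam * penalty

# Example: 3 scaled inputs (e.g., biomass, substrate, temperature) mapped to
# 2 specific rates (e.g., growth and product formation) by 10 hidden neurons.
net = init_network(n_in=3, n_hidden=10, n_out=2)
rates = specific_rates(net, [0.2, 0.5, -0.1])
```

In a serial hybrid model the returned rates would be inserted into the material balances, and the objective would be evaluated on the integrated state trajectories rather than on the rates themselves.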
In serial hybrid structure, the authors adopted mixed integer nonlinear programming (MINLP) for the automatic identification of kinetic model alternatives and polynomial model reduction in conjunction with the parameter estimation approach.

Hybrid Gaussian process state space models (GPSSM) with fully discretized numerical solution have been used to model viable cell density, glucose, glutamine, ammonia, lactate, and product titer in cell culture (Cruz‐Bournazou et al., 2022). Following a quadratic polynomial fit to the experimental data, the polynomial derivatives were used to train the GPM for each finite element adopting a penalized least‐square approach. Vega‐Ramon et al. (2021) adopted a mechanistic model in astaxanthin production in Xanthophyllomyces dendrorhous batch culture. Following the estimation of model parameters, the calculated specific growth rate, substrate consumption rate, and product accumulation rate at each time step for each data set were used to build three Gaussian process models, using an exponential kernel function. The authors observed lower uncertainty associated with the hybrid model in comparison to its kinetic variant, and a better representation of the process dynamics.

4.4 | Difficulties in implementation of black‐box component

The mechanistic model of a dynamic bioprocess system is represented by a system of coupled differential equations of the states as a known function of state variables and time‐invariant parameters as shown in Equation (1).

$$\frac{d\xi}{dt} = f(\xi, t, \theta), \tag{1}$$

where ξ is the state vector, that is, ξ = [X, S, P]^T and θ is a set of time‐invariant parameters. As the functional form of f(.) is explicitly known, the differential can be calculated at each time step without difficulty. Model fitting exercise can take either differentiation of time series data to calculate the rate, or integration of the model to predict the concentrations. However, the former is not considered due to noise amplification. With a specified initial state vector (ξ at t = 0) and a guessed value of θ, the system of ODEs is numerically integrated while iteratively changing the parameter vector to minimize the weighted sum of squared error (Ew) between experimental and modeled states inside an optimization framework (Wang et al., 2018).

$$E_w = \frac{1}{n \times m} \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{(\xi_{ij} - \hat{\xi}_{ij})^2}{\sigma_{\xi_i}^2}, \tag{2}$$

where i = 1, 2, …, n denotes the ith state variable, j = 1, 2, …, m is a different time point of measurement, and ξij, ξ̂ij, ξ̄i are the measured, predicted, and measured average value of the ith state, respectively. In hybrid modeling, the representation of rate equations can be different depending on whether one/more specific rates, terms, or some time‐invariant coefficients of the mechanistic model (i.e., maintenance, yield coefficients) are encapsulated into a black box architecture, as shown in Equation (3).

$$\frac{d\xi}{dt} = f(g(\xi, t, w), \xi, t, \theta), \tag{3}$$

where the rate expression f(g(.), ξ, t, θ) is a well‐defined parametric function, whereas g(ξ, t, w) is the prediction (e.g., specific growth rate) from the data‐driven model (i.e., ANN, polynomial regression), with parameters w (network weights, biases, or polynomial coefficients), using states and/or time as input predictors. Selection of predictors for the black‐box models, though it remains heuristic, may be guided by fundamental knowledge about the process. As the data‐driven components lack the direct measurements of specific rates, they are not pre‐trained on any independent data set and cannot even have a finalized topology, or weights and biases (Oliveira, 2004).

4.5 | Parameter estimation approaches

Parameter estimation for a mechanistic model is often directed through a least‐square regression approach incorporated into the optimization framework. Developing the data‐driven model part is certainly the most challenging aspect of the hybrid model. The output of the data‐driven component, however, is often a specific rate, and the explicit relationship between the network trainable variables and the concentrations is unknown (lack of target outputs). Two different strategies have been proposed to train the hybrid model (Karimi Alavijeh et al., 2022) with the data set, i.e., (i) approximation of bioreaction rates, followed by minimization of the deviation from the predicted specific rates; and (ii) minimization of error between experimental and predicted state concentrations through a sensitivity approach.

4.5.1 | Direct approach

Existing hybrid modeling approaches take a multi‐step strategy to develop the empirical models of unknown variables (Yang et al., 2016), as outlined below. The simplest approach would take an appropriate interpolation function for the data set, and calculate the derivatives of state variables (dξ/dt) at different time points (Figure 4a). If the proposed hybrid model does not have any time‐invariant parameter (to estimate), the data‐driven target output g(.) can be calculated at each time point by plugging the value of dξ/dt into Equation (3). The empirical data‐driven model can now be formalized as both ξ and g(.) are available at each time point. The length of an interpolated data set can be much higher than experimental data points, which can be advantageous for the data‐driven model. Once the data‐driven model parameters (weight, bias, coefficients) are determined, the hybrid model can be integrated using a numerical approach. Experimental design, for example, central composite design, or full‐factorial design
FIGURE 4 Direct approach of hybrid modeling (a) dimension of experimental data is increased through interpolation, and the specific rates at different times g(ξ, t, w) are approximated from differences between adjacent data points. The data‐driven part is trained and then incorporated into the ordinary differential equations (ODEs) of the hybrid model. In the optimization‐based approach (b) data‐driven prediction, that is, the output g(ξ, t, w) is guessed for all sampling points and the midpoint of each interval. The system of ODEs is numerically integrated using discretized RK4 algorithm, and the objective function is calculated. Once the objective (Ewr) stops decreasing, the optimization terminates, and the data‐driven model is trained with the optimized set g(ξ, t, w), and reintegrated into a hybrid model.
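The direct approach in panel (a) can be sketched for a single biomass state. The sketch below is a simplified illustration under stated assumptions: the "measurements" are noise‐free values from a hypothetical logistic growth curve, the state at each interval midpoint is obtained by linear interpolation, the data‐driven component is a first‐order polynomial μ(X) = w0 + w1·X fitted by ordinary least squares, and the hybrid ODE dX/dt = μ(X)·X is integrated with a classical RK4 scheme. With real, noisy data a smoothing step would normally precede the differentiation.

```python
import math

def logistic(t, x0=0.5, r=0.5, k=10.0):
    """Hypothetical 'measured' biomass curve used only to generate data."""
    return k / (1.0 + (k - x0) / x0 * math.exp(-r * t))

times = [0.5 * j for j in range(25)]          # 0 to 12 h, every 0.5 h
biomass = [logistic(t) for t in times]

# Step 1: approximate specific growth rates from adjacent data points.
x_mid, mu_mid = [], []
for j in range(len(times) - 1):
    dxdt = (biomass[j + 1] - biomass[j]) / (times[j + 1] - times[j])
    x_bar = 0.5 * (biomass[j] + biomass[j + 1])   # linear interpolation at midpoint
    x_mid.append(x_bar)
    mu_mid.append(dxdt / x_bar)                    # mu = (dX/dt) / X

# Step 2: train the data-driven part (here a straight line) on (X, mu) pairs.
def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

w0, w1 = fit_line(x_mid, mu_mid)

# Step 3: plug the trained component into the ODE and integrate with RK4.
def hybrid_rhs(x):
    return (w0 + w1 * x) * x

def integrate_rk4(x, t_end, dt=0.05):
    for _ in range(int(round(t_end / dt))):
        k1 = hybrid_rhs(x)
        k2 = hybrid_rhs(x + 0.5 * dt * k1)
        k3 = hybrid_rhs(x + 0.5 * dt * k2)
        k4 = hybrid_rhs(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

x_final = integrate_rk4(biomass[0], times[-1])
```

Because the synthetic curve is logistic, the underlying specific growth rate is exactly linear in X (0.5 − 0.05·X), so the fitted coefficients should land close to those values; this is a convenience of the example, not a general property.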
is used to generate different combinations of process factors or settings for batch or fed‐batch experiments to be executed in laboratory conditions (Pinto et al., 2022). Intra‐experimental shifts of CPP from one to another set point during fed‐batch fermentation reduce experimental burden (Bayer et al., 2020b). Data sets (time‐series transient or steady state) from such designed experiments are smoothed and interpolated over time to create a large training pattern for the neural network, that maps the input state variables to output (i.e., specific rates) (Zuo & Wu, 2000).

A pre‐existing mechanistic bioprocess model can be used to generate the necessary training data for a neural network. First, multiple experimental data sets are simultaneously fitted to the mechanistic model. In a refitting exercise, the parameters related to growth, uptake, and product formation rates are independently estimated for each data set while leaving other parameters at their global estimates (Luna et al., 2021). The simulated states can be used to calculate the specific rates using the model equations, and this compilation can subsequently be used to train the data‐driven components of the hybrid model. Experimental data can also be fitted to some typical mathematical function (i.e., logistic function) for the estimation of time derivatives through analytical differentiation (Tholudur et al., 1999). The time derivatives (rates) can be used to train the data‐driven component of the hybrid model.

Although this procedure appears to be simple, the interpolation itself at the very first step of modeling is the key source of the noise. Moreover, offline data are acquired at large uneven time intervals (up to several hours) in bioprocess. Nevertheless, state dynamics evolve as a continuous time‐dependent function and its discretization appears to be a major challenge. This inevitably has a detrimental influence on the unknown variable estimation values, leading to compromised reliability and accuracy, and eventually impacting the generalization capability of the hybrid model (Karimi Alavijeh et al., 2022).

The hybrid models can also be constructed by first dividing the time domain into equally spaced intervals (Figure 4b). For example, if experimental measurements are available at m different time points, unique values of the different specific rates g_lj are assumed for each of the m−1 intervals. In addition, if the hybrid model contains k time‐invariant parameters, the entire set of m−1+k parameters can be estimated (from their initial guess) by solving a least‐squares regression problem (Rogers et al., 2023). The objective function, that is, weighted error with regularization (Ewr) for the optimization is formulated as in Equation (4).

$$E_{wr} = \frac{1}{n \times m} \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{(\xi_{ij} - \hat{\xi}_{ij})^2}{\sigma_{\xi_i}^2} + \lambda^{T} \sum_{l=1}^{r} \sum_{j=1}^{m-1} \left(g_{lj} - g_{lj+1}\right)^2, \tag{4}$$

where ξij and ξ̂ij are experimental and predicted values of the ith state (i = 1, 2…n) at the jth time point (j = 1, 2…m), respectively. The first term of the objective function is the normalized sum of square error, while the second part penalizes a drastic change in specific rates (prediction for r different black box modules) between two consecutive time points. The λ is the penalty weight vector for state variables that controls the “rigidity” of black‐box output with time, where larger values will minimize rapid change in rates and reduce the risk of overfitting to measurement noise (Rogers et al., 2023). The choice of
penalty weight should balance between mean relative percentage error and uncertainty during parameter estimation.

To estimate the model parameters, the system of ODEs needs to be discretised into fourth‐order Runge‐Kutta algebraic form and transformed into a nonlinear programming problem. However, the discretization of the process to match sampling time steps would impact the derivative approximation and accuracy of the solution (Cruz‐Bournazou et al., 2022). Following the minimization of the objective function (as in Equation 4), the data‐driven model g(ξ, t, w) needs to be trained to fix the weight and bias (for ANN). Finally, hybrid models need to re‐simulate the state trajectories to check the final fitness.

4.5.2 | Indirect training approach

The training of hybrid models is essentially the identification of time‐invariant parameters (θ), and data‐driven model parameters (w), that is, coefficients, weights, and biases. If w is considered as a parameter vector (along with θ) to be identified, the training is transformed into a simple dynamic system parameter estimation problem (Yang et al., 2016), that is, minimizing the SSE between observed and predicted states as in Equation (2). The specific rates (e.g., µ, qp) can be modeled as a feed‐forward neural network in a hybrid model system of ODEs (von Stosch et al., 2016b), as exemplified in Equation (5).

$$\begin{aligned} H_1 &= \mathrm{tansig}(w_1 \xi + b_1) \\ H_2 &= \mathrm{tansig}(w_2 H_1 + b_2) \\ [\mu;\, q_p] &= \mathrm{purelin}(w_3 H_2 + b_3) \end{aligned} \tag{5}$$

The above example considers two hidden layers with the network weights wi (i = 1:3), and biases bi (i = 1:3). The network takes the state vector (ξ) as input, and uses tangential‐sigmoidal (tansig) and linear (purelin) transfer functions in the nodes of hidden and output layers, respectively. The H1 and H2 are the outputs of the units in the first and second hidden layers, respectively. The μ and qp, for example, are ANN outputs for specific growth and product formation rate, respectively. The dimension of wi or bi depends on the number of inputs, hidden layer neurons, and outputs. This approach has been adopted in numerous studies, particularly when the topology of the data‐driven part is prefixed, or assumed. In this indirect method, the loss function is not directly linked to the neural network outputs (Pinto et al., 2022). As the data‐driven model is trained indirectly in the optimization framework, a regularization term can also be included in the fitness function (Ewr) to improve training convergence (Botton et al., 2022; Yang et al., 2011) as shown in Equation (6).

$$E_{wr} = \frac{1}{n \times m} \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{(\xi_{ij} - \hat{\xi}_{ij})^2}{\sigma_{\xi_i}^2} + \lambda \lVert g \rVert, \tag{6}$$

where λ is the regularization parameter (weight), and g is the parameter for the data‐driven component. The regularization drives the values of those parameters that are not contributing to the model to zero, promoting a parsimonious structure (Stosch et al., 2018). The Levenberg–Marquardt method (LMM) is very effective in solving indirect training and identification, to estimate the respective specific rates (Bayer et al., 2021b; Bayer et al., 2023). The error back propagation can be performed by calculating the gradient of the squared errors (E) for the model state with respect to the ANN weights, that is, ∂E/∂w (Oliveira, 2004; Pinto et al., 2019). Evaluation of Jacobian (J) in an ANN‐based hybrid model with sensitivities equations is shown in Equation (7).

$$J = \frac{\partial E}{\partial w} = \sum_{t=1}^{m} \left(\frac{dE}{d\xi}\right)_t \left(\frac{d\xi}{dw}\right)_t = -\frac{2}{m} \sum_{t=1}^{m} e_t^{T} \left(\frac{d\xi}{dw}\right)_t, \tag{7}$$

with e_t = ξ_it − ξ̂_it. The matrix (dξ/dw)_t can be computed from the differentiation of state rates in Equation (3) with respect to w, resulting in an expression $\frac{d}{dt}\left(\frac{d\xi}{dw}\right) = \phi\left(\frac{d\xi}{dw}\right)$, which should be simultaneously integrated along with hybrid model ODEs, starting from the initial condition (dξ/dw) at t = 0 equal to 0, for example with the Euler forward integration method with a fixed step size (Oliveira, 2004; von Stosch et al., 2016a). The implementation of the LMM algorithm for the estimation of data‐driven model (particularly ANN) parameters is shown in Figure 5. Model output and error are calculated from initialized weights and biases, to estimate the Jacobian, and update the network parameters for new outputs (Shah et al., 2022). This process continues in an inner loop till the error becomes lower than a tolerance value. The network weights are restored if weight‐updating results in the amplification of error between model prediction and experimental observation. The loss function can also be monitored on an independent validation set as a stopping criterion, that is, when validation error starts to increase (Pinto et al., 2019).

For the hybrid model, different specific rates can be modeled using independent feed‐forward networks. The topology selection (number of neurons in hidden layer) and parameter identification process (i.e., weight and bias) can be performed sequentially. The number of hidden neurons for each ANN sub‐model in the hybrid model can be varied within a given range (expert judgment), using a uniform experimental design. A leave‐one‐out cross‐validation technique (using different batch experiments) can be adopted in each of the hidden neuron combinations to find the optimal group (Yang et al., 2016). Different network structures can be compared based on their Bayesian information criteria (BIC) value; the one with the highest BIC is retained (von Stosch et al., 2016a).

5 | GENERALIZATION CAPABILITIES: HOW TO ENSURE?

5.1 | Experimental design to model: Digital twins

Design of experiments (DoE) is the tool to systematically investigate the relationship between input factors and the response of interest (product concentration). DoE suggests a series of experimental
FIGURE 5 Hybrid model training scheme based on sensitivity analysis. The model outputs are calculated based on initialized weights and biases (wk, bk). The corresponding error, Ek, is used to calculate the Jacobian matrix with respect to the neural network parameters. The weight and bias values are updated (wk+1, bk+1) to calculate new outputs and the process continues as the error decreases. An inner loop counter (m) is used to control the change in the combination coefficient and update the parameters. Adapted and modified from Shah et al. (2022).
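The training scheme of Figure 5 can be illustrated on a deliberately small problem. The sketch below is not the algorithm of the cited work; it assumes a single biomass state, a two‐parameter data‐driven rate μ(X; w) = w0 + w1·X standing in for the network, noise‐free synthetic observations, forward‐Euler integration of the sensitivity equations with a fixed step, and a basic Levenberg–Marquardt update in which the damping (combination) coefficient is increased and the previous weights are kept whenever a trial step amplifies the error.

```python
DT, N_STEPS, X0 = 0.05, 200, 0.5
SAMPLE_IDX = list(range(0, N_STEPS + 1, 20))      # one observation per hour

def simulate(w):
    """Euler-integrate dX/dt = (w0 + w1*X)*X together with s = dX/dw."""
    w0, w1 = w
    x, s0, s1 = X0, 0.0, 0.0                      # sensitivities start at zero
    xs, sens = [x], [(s0, s1)]
    for _ in range(N_STEPS):
        dfdx = w0 + 2.0 * w1 * x                  # partial of the rate w.r.t. X
        dx = (w0 + w1 * x) * x
        ds0 = dfdx * s0 + x                       # d/dt (dX/dw0)
        ds1 = dfdx * s1 + x * x                   # d/dt (dX/dw1)
        x, s0, s1 = x + DT * dx, s0 + DT * ds0, s1 + DT * ds1
        xs.append(x)
        sens.append((s0, s1))
    return xs, sens

# Synthetic observations from assumed "true" weights (illustration only).
true_xs, _ = simulate((0.5, -0.05))
observed = [true_xs[i] for i in SAMPLE_IDX]

def evaluate(w):
    """Return SSE, J^T J and J^T e, with J built from the sensitivities."""
    xs, sens = simulate(w)
    sse, jte, jtj = 0.0, [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]]
    for idx, y in zip(SAMPLE_IDX, observed):
        e = y - xs[idx]
        sse += e * e
        for a in range(2):
            jte[a] += sens[idx][a] * e
            for b in range(2):
                jtj[a][b] += sens[idx][a] * sens[idx][b]
    return sse, jtj, jte

def levenberg_marquardt(w, n_iter=100, damping=1e-2):
    sse, jtj, jte = evaluate(w)
    history = [sse]
    for _ in range(n_iter):
        a11, a22, a12 = jtj[0][0] + damping, jtj[1][1] + damping, jtj[0][1]
        det = a11 * a22 - a12 * a12
        step = ((a22 * jte[0] - a12 * jte[1]) / det,
                (a11 * jte[1] - a12 * jte[0]) / det)
        trial = (w[0] + step[0], w[1] + step[1])
        trial_sse, trial_jtj, trial_jte = evaluate(trial)
        if trial_sse < sse:                       # accept the update
            w, sse, jtj, jte = trial, trial_sse, trial_jtj, trial_jte
            damping /= 10.0
        else:                                     # keep old weights, damp more
            damping *= 10.0
        history.append(sse)
    return w, history

w_fit, history = levenberg_marquardt((0.3, -0.02))
```

By construction the recorded error never increases from one iteration to the next; a validation‐set error could be tracked in the same loop as an early‐stopping criterion.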
FIGURE 6 Hybrid model‐based prediction of optimal process conditions adopting digital twins. Hybrid model developed from an initial data
set is used to simulate optimal process conditions. If the recommended conditions are not reflected in existing data, experiments are carried out
under laboratory conditions and added to the initial data set for re‐training the hybrid model. The procedure is repeated until the best process
conditions are ascertained.
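The iterative loop of Figure 6 (train, recommend, run the experiment if it is not yet in the data set, retrain, stop when the recommendation stabilizes) can be sketched as follows. All specifics are hypothetical: `run_experiment` is a stand‐in for a laboratory run with an assumed optimum at 33 °C, a least‐squares quadratic replaces the hybrid model purely to keep the example short, and cultivation temperature is the only process factor.

```python
CENTER = 32.5                                    # centering improves conditioning

def run_experiment(temperature):
    """Hypothetical wet-lab stand-in: titer peaks at 33 degC."""
    return 5.0 - 0.1 * (temperature - 33.0) ** 2

def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    m = [[sum(r[a] * r[b] for r in rows) for b in range(3)]
         + [sum(r[a] * y for r, y in zip(rows, ys))] for a in range(3)]
    for col in range(3):                         # elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):                          # back substitution
        coef[i] = (m[i][3] - sum(m[i][c] * coef[c] for c in range(i + 1, 3))) / m[i][i]
    return coef

def predict(coef, x):
    return coef[0] + coef[1] * x + coef[2] * x * x

candidates = [28.0 + 0.5 * k for k in range(21)]  # 28 to 38 degC
temps = [29.0, 32.5, 36.0]                        # initial data set
titers = [run_experiment(t) for t in temps]

previous = None
for cycle in range(10):
    coef = fit_quadratic([t - CENTER for t in temps], titers)
    recommended = max(candidates, key=lambda t: predict(coef, t - CENTER))
    if recommended == previous:                   # recommendation has stabilized
        break
    if recommended not in temps:                  # not yet reflected in the data
        temps.append(recommended)
        titers.append(run_experiment(recommended))
    previous = recommended
```

With this noise‐free stand‐in the loop settles after a single additional experiment; with real data and a genuine hybrid model several cycles and a tolerance on the change in recommended conditions would be expected.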
conditions combining different levels of each process factor for effective exploration of response space (Mandenius & Brundin, 2008). DoE considering critical process parameters is the most widely adopted approach to ensure quality by design target in a hybrid modeling paradigm. A hybrid model‐based experimental design can be adopted to explore the design space more effectively and identify the optimum process conditions in bioprocess (Bayer et al., 2020b; Bayer et al., 2021b; Sommeregger et al., 2017). A hybrid model developed from a small initial data set is used to predict recommended optimal experiments, and once carried out the same can be added to the existing data set for re‐training the hybrid model (Bayer et al., 2021a). This exercise can be repeated until the projected best process conditions stabilize (Figure 6). In silico generation of simulated batch experiments with a semi‐parametric hybrid has been shown to improve product titer, while deciphering the correlation between critical process parameters and critical quality attributes (Botton et al., 2022).

The pattern of process operation space in the data set, used to train the data‐driven module, should, however, cover the expected process region in prediction sets, for all practical purposes (Zhang &
Mao, 2017). The generation of high‐quality experimental data can be laborious and time‐consuming. Intensified DoE (iDoE) that adopts intra‐experimental set‐point changes in fed‐batch experiments can be used to train hybrid models, with good prediction performance across multiple test batches, while substantially reducing the experimental burden (Bayer et al., 2020b; von Stosch et al., 2016b). Experimental design for iDoE adds several stages (typically three or four) in each experiment, but requires no a priori information on the process model (von Stosch & Willis, 2017). The process response information gathered from iDoE can be managed through hybrid modeling, as a tool to have the best possible description of process behavior. Studies have suggested low errors in model transfer from shaker to reactor scale and between the DoE and the iDoE approach (Bayer et al., 2021b).

Experimental conditions for fed‐batch cultivations can be formulated with model‐assisted DoE of feeding strategies for seamless integration into a hybrid model development routine. For example, Bayer et al. (2023) performed four initial experiments, and a full DoE with 29 experiments varying the glucose and glutamine concentration in the feed, the bolus feed rate, and the feed start. The authors adopted different data set labels in the training and validation exercise, using combinations of the initial four and full DoE sets.

A moderate number of experimental batch/fed‐batch studies can be performed in laboratories, where the generated data set can randomly be distributed (in a defined proportion) into training and testing subsets (Dong et al., 2014; Narayanan et al., 2019) to build the hybrid model. However, the dimension of the data set used to train and test the hybrid model can vary depending on the model complexity, data availability, and at the discretion of the modeler. For example, Rogers et al. (2023) chose three batch experiments (13 measurement points in each set) to train, and two batches to test the performance of hybrid models of different “grayness.” Entire experimental data sets can also be used to train a hybrid model while reusing the same data set following Gaussian noise addition as a validation set to stop overfitting (Pinto et al., 2022). The incorporation of random noise increases the

through the top‐level data‐driven model to incorporate different process behaviors (Zhang et al., 2019).

5.2 | Cross‐validation

Predictability and interpretability of any bioprocess models, whether mechanistic, data‐driven or hybrid, need to be evaluated to decide on the usability of the model. Over‐fitting is a well‐known identification problem, where the model fits the training data exceptionally well but shows poor performance on a new data set. Cross‐validation is an internal re‐sampling method to ensure the quality of the mathematical model formulated, where the original data is split into training and validation subsets, and simultaneously used for model building and validation (Rajamanickam et al., 2021). Partitioning is typically performed batch‐wise with the amount of data allocated in each partition depending on data availability (Pinto et al., 2022). In hybrid modeling, hyper‐parameter selection (i.e., number of neurons in hidden layer) can be conducted in a k‐fold fashion, where a subset of experimental batches is used for training and the remaining data set is used to obtain a normalized error estimate across all states along the trajectory (del Rio‐Chanona et al., 2017a). This procedure is repeated until the chosen network configuration has been tested against all the sets. The cross‐validation error can be used to choose the optimal topology in hybrid modeling.

To fix an optimal number of neurons for three different empirical submodels in a hybrid modeling structure, Yang et al. (2016) adopted a uniform design table for the combination of hidden neuron numbers and performed leave‐one‐out cross‐validation with 12 batches of modeling data. The random data partitioning in the cross‐validation process (except for the leave‐one‐out) can be repeated multiple times to average the cross‐validation error (Bayer et al., 2023).

5.3 | Bootstrap‑aggregation
dimension of an existing experimental data set, which is likely to Bootstrap aggregation, also known as bagging, is a machine‐learning
improve the predictive power of ANN models with biological approach designed to improve the stability and accuracy of models in
systems (del Rio‐Chanona et al., 2017a). However, the authors have statistical regression (Baron & Zhang, 2020). The bootstrap aggrega-
not explicitly compared the effectiveness of the strategy. Smooth- tion approach is particularly preferred for a smaller data set, where
ing and interpolation of experimental data in time dimension prefixing validation and training fold may not offer sufficient variation
facilitate data‐driven modeling, though the consequence on the in the model development. Experimental sets (not data points) from
model's generalization capability remains questionable (Zuo & the combined training‐validation pool are randomly re‐sampled into
Wu, 2000). The existing parametric kinetic model can mimic a training/validation partitions n number of times (Pinto et al., 2019).
“true” experimental system to generate in‐silico experimental data Each time, a different hybrid model is generated and identified using
to evaluate the hybrid model's performance (Pinto et al., 2019; the weighted least squares approach. During training, the loss of the
Zhang et al., 2020). The size of generated data is maintained to validation set is also measured as a stopping criterion. Finally,
match the real experimental implementations. Gaussian measure- predictions from best k hybrid models (k«n) are averaged for the
ment noise can also be added to in‐silico data before hybrid process states (i.e., biomass, substrate concentration), or data‐driven
modeling (Mowbray et al., 2023). Raw experimental data collected function (i.e., specific rates). Different hybrid models can have better
from process plants can initially be modeled using simple kinetic to performance in different regions of input space, and combining
correct and generate missing values, which can further be processed multiple models can offer better prediction accuracy.
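
As a rough illustration of the resample, identify, rank, and average sequence described for bootstrap aggregation, the following Python sketch may help. It is not the implementation of Pinto et al. (2019). The one‐parameter model (rate = a × substrate), the ordinary least‐squares fit (used here in place of weighted least squares with validation‐based stopping), the use of out‐of‐bag batches as the validation partition, the synthetic data, and all function names are assumptions made purely for illustration.

```python
import random
from statistics import mean


def fit_slope(batches):
    """Ordinary least-squares slope a for rate ~ a * substrate, pooled over batches."""
    pairs = [(s, r) for batch in batches for s, r in batch]
    return sum(s * r for s, r in pairs) / sum(s * s for s, _ in pairs)


def validation_mse(a, batches):
    """Mean squared error of the one-parameter model on held-out batches."""
    return mean((r - a * s) ** 2 for batch in batches for s, r in batch)


def bagged_models(batches, n_resamples=30, k_best=5, seed=0):
    """Resample whole batches n times; keep the k models with the lowest validation loss."""
    if len(batches) < 2:
        raise ValueError("need at least two batches to form a validation partition")
    rng = random.Random(seed)
    ids = range(len(batches))
    scored = []
    while len(scored) < n_resamples:
        train_ids = [rng.choice(ids) for _ in ids]          # batches drawn with replacement
        valid_ids = [i for i in ids if i not in train_ids]  # out-of-bag batches (assumption)
        if not valid_ids:                                    # every batch was drawn; resample
            continue
        a = fit_slope([batches[i] for i in train_ids])
        loss = validation_mse(a, [batches[i] for i in valid_ids])
        scored.append((loss, a))
    scored.sort()                                            # lowest validation loss first
    return [a for _, a in scored[:k_best]]


def predict_rate(models, substrate):
    """Aggregate by averaging the predictions of the retained models."""
    return mean(a * substrate for a in models)


# Synthetic demonstration data (made up): 6 batches, 5 points each, true slope 0.5.
_rng = random.Random(42)
demo_batches = [
    [(s, 0.5 * s + _rng.gauss(0.0, 0.02)) for s in (1.0, 2.0, 3.0, 4.0, 5.0)]
    for _ in range(6)
]
models = bagged_models(demo_batches)
print(len(models), predict_rate(models, 2.0))
```

Resampling is done on whole batches rather than individual measurement points, which keeps each trajectory intact in either the training or the validation partition. In a real hybrid model the closed‐form slope fit would be replaced by the identification routine of choice, and `predict_rate` by an average over simulated state trajectories or specific rates.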
6 | HYBRID MODEL APPLICATIONS, PRACTICAL CHALLENGES, AND FUTURE RESEARCH DIRECTIONS

6.1 | Application in model predictive control (MPC)

A hybrid model offers better extrapolation capabilities in comparison to the data‐driven model and is useful for process control and optimization. In MPC, the optimal future control actions are predicted based on the current horizon, and control actions are delivered for execution for the next time step. To account for future disturbances, MPC requires a robust dynamic process model where online measurements from the ongoing processes are used to fine‐tune the dynamic model parameters for future control actions (Zhang et al., 2019). Though MPC can be formulated with any model variant, hybrid models, generally with fewer parameters, are easier to adopt and interpret, and their control strategies can be reviewed (Cubillos et al., 2001). A Control Lyapunov–Barrier function (CLBF) based MPC that utilizes a DNN combined with a first‐principles model, and offers constrained input stabilization of nonlinear systems, has been developed (Bangi & Kwon, 2023). A hybrid model‐based MPC strategy has been adopted to increase the profitability of an industry‐scale fermentation process that utilizes multi‐rate state observers (Shah et al., 2023). Open‐loop MPC with a predetermined optimal control action sequence has been adopted to minimize inter‐batch variability in recombinant protein production (Jenzsch et al., 2006). Poor system‐to‐system transferability of ML‐based algorithms necessitates fine‐tuning of a digital twin for MPC even in related systems, where transfer learning‐enhanced model development inspired by large language models (e.g., ChatGPT) can be promising (Sitapure & Kwon, 2023a). Cutting‐edge first‐generation time‐series transformer models, with the potential for multivariate time‐series prediction, have also been integrated into process modeling and control (Sitapure & Kwon, 2023b).

6.2 | Challenges with hybrid modeling

Hybrid models are of great potential for efficient use of available experimental data, adopting digital twins in process development, and validation of model transferability into a different scale. However, hybrid models should be considered as an augmentation tool, not a magic bullet to cover up poor experimental design, erroneous measurements, or lack of knowledge in the concerned bioprocess. A hybrid model must be well‐trained on a wider process space data set to have accurate predictions on unseen data (Walsh et al., 2022). Each modeling exercise is unique to the available data (offline, online measurement) and the complexity of the bioprocess, and cannot be replicated in a different process system. Though a general framework for a hybrid model development pipeline can be laid down, the mapping of data‐driven components (process terms, specific rates, or parameters as a function of state variables, external inputs), their architecture, and hyperparameters remains largely heuristic. The hybrid model should be considered with prior domain knowledge of the underlying process (Zhang et al., 2020). Accurate modeling of the kinetic part to account for primary process mechanisms simplifies the data‐driven modeling of the secondary process. Such formalization or distinction may not always be apparent in complex bioprocess systems. Implementation of a hybrid model from experimental data could be a significant hurdle for researchers, who may not be well‐versed in the complex mathematical formalization of sensitivity equations involved in hybrid‐model identification and parameter estimation exercises. Sharing of experimental data and code (or at least pseudocode) in a communal repository, or as supplementary material to publications, is rarely practiced.

6.3 | Future research directions

The hybrid modeling approach has been a key enabler for industrial bioprocesses in estimating unknown specific uptake/secretion rates, predicting key nutrient/metabolite and product concentrations (Kotidis & Kontoravdi, 2020), and accelerating process development through DoE with reduced experimental effort (Möller et al., 2019). As an assembly of hybrid models can be readily automated, safer and more cost‐effective smart manufacturing could become accessible for biopharmaceutical processes (Tsopanoglou & Jiménez del Val, 2021). Even though the data requirements are certainly reduced in hybrid modeling, the quality of data should be reasonable to model multivariate interactions. In addition, active learning coupled with iterative experimental design approaches will be required to probe the poorly understood part of the model (Narayanan et al., 2023). Furthermore, there is no specific guideline on the choice of ML algorithm, or the proportion of greyness to be inscribed in model development. The next generation of hybrid models has to be flexible enough to support different modeling formalisms and available powerful tools with user interfaces (Liu et al., 2022b). The development of software tools with the option to integrate various sources of knowledge to formulate the best hybrid model for a given application is warranted (Sansana et al., 2021). Transformation of nonlinear dynamics into linear ones and the closed‐loop stabilization through Lyapunov‐based model predictive control (LMPC) or feedback control design can be realized in a bioprocess system (Narasingam & Kwon, 2019, 2020). The use of hybrid‐model‐based digital twins (DTs) to manage process data, to derive novel insights, for near‐real‐time process characterization (identification of future disturbances), and as a part of Industry 4.0 standards in diverse biomanufacturing processes would require features for adaptability and transferability (Sokolov et al., 2021).

7 | CONCLUSIONS

Hybrid modeling is certainly a useful tool to explore, control, and predict complex bioprocess system behavior. Integration of data‐driven components into the first‐principles model offers many
flexibilities, starting from the selection of the data set, the process to be mapped as a black‐box function and its topology, hyperparameters, training algorithm, and validation. A review of literature from the last two decades, however, does not suggest a profound expansion of hybrid modeling into bioprocess/biotechnology. This is not because of the unavailability of resources (i.e., reviews, textbooks, and commercial software packages), but because the complex mathematical concepts presented in them are usually beyond the formal academic training of researchers in biotechnology. Furthermore, methodological descriptions in research articles are not always in sufficient detail to enable readers to grasp the intricate implementation strategy. There are many methods at different levels of complexity, so it is very easy to get lost in the search for a simple, reliable strategy for model identification.

AUTHOR CONTRIBUTIONS
Biswanath Mahanty, as sole author, contributed to conceptualization, project administration, writing of the original draft, and review and editing.

ACKNOWLEDGMENTS
Biswanath Mahanty acknowledges the support from Karunya Institute of Technology and Sciences. The author also appreciates the “lesson” learned from Ms Pema, on deciding to explore the hybrid modeling avenue.

CONFLICT OF INTEREST STATEMENT
The author declares no conflict of interest.

DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analyzed in this study.

ORCID
Biswanath Mahanty http://orcid.org/0000-0002-5815-2440

REFERENCES
Abbiati, G., Marelli, S., Tsokanas, N., Sudret, B., & Stojadinović, B. (2021). A global sensitivity analysis framework for hybrid simulation. Mechanical Systems and Signal Processing, 146, 106997. https://linkinghub.elsevier.com/retrieve/pii/S0888327020303836
Alrashed, A. A. A. A., Karimipour, A., Bagherzadeh, S. A., Safaei, M. R., & Afrand, M. (2018). Electro‐ and thermophysical properties of water‐based nanofluids containing copper ferrite nanoparticles coated with silica: Experimental data, modeling through enhanced ANN and curve fitting. International Journal of Heat and Mass Transfer, 127, 925–935. https://linkinghub.elsevier.com/retrieve/pii/S0017931018324876
Andrade Cruz, I., Chuenchart, W., Long, F., Surendra, K. C., Renata Santos Andrade, L., Bilal, M., Liu, H., Tavares Figueiredo, R., Khanal, S. K., & Fernando Romanholo Ferreira, L. (2022). Application of machine learning in anaerobic digestion: Perspectives and challenges. Bioresource Technology, 345, 126433. https://linkinghub.elsevier.com/retrieve/pii/S0960852421017752
Apicella, A., Donnarumma, F., Isgrò, F., & Prevete, R. (2021). A survey on modern trainable activation functions. Neural Networks, 138, 14–32. https://linkinghub.elsevier.com/retrieve/pii/S0893608021000344
Araúzo‐Bravo, M. J., Cano‐Izquierdo, J. M., Gómez‐Sánchez, E., López‐Nieto, M. J., Dimitriadis, Y. A., & López‐Coronado, J. (2004). Automatization of a penicillin production process with soft sensors and an adaptive controller based on neuro fuzzy systems. Control Engineering Practice, 12, 1073–1090. https://linkinghub.elsevier.com/retrieve/pii/S0967066103002508
Arnold, S., Becker, T., Delgado, A., Emde, F., & Enenkel, A. (2002). Optimizing high strength acetic acid bioprocess by cognitive methods in an unsteady state cultivation. Journal of Biotechnology, 97, 133–145. https://linkinghub.elsevier.com/retrieve/pii/S0168165602000652
Artigue, H., & Smith, G. (2019). The principal problem with principal components regression. Cogent Mathematics & Statistics, 6, 1622190. https://doi.org/10.1080/25742558.2019.1622190
Bae, J., Lee, H., Jeong, D. H., & Lee, J. M. (2020). Construction of a valid domain for a hybrid model and its application to dynamic optimization with controlled exploration. Industrial & Engineering Chemistry Research, 59, 16380–16395. https://doi.org/10.1021/acs.iecr.0c02720
Bangi, M. S. F., Kao, K., & Kwon, J. S.‐I. (2022). Physics‐informed neural networks for hybrid modeling of lab‐scale batch fermentation for β‐carotene production using Saccharomyces cerevisiae. Chemical Engineering Research and Design, 179, 415–423. https://linkinghub.elsevier.com/retrieve/pii/S0263876222000491
Bangi, M. S. F., & Kwon, J. S.‐I. (2020). Deep hybrid modeling of chemical process: Application to hydraulic fracturing. Computers & Chemical Engineering, 134, 106696. https://linkinghub.elsevier.com/retrieve/pii/S0098135419311019
Bangi, M. S. F., & Kwon, J. S.‐I. (2023). Deep hybrid model‐based predictive control with guarantees on domain of applicability. AIChE Journal, 69(5), e18012. https://doi.org/10.1002/aic.18012
Bardow, A., & Marquardt, W. (2004). Incremental and simultaneous identification of reaction kinetics: Methods and comparison. Chemical Engineering Science, 59, 2673–2684. https://linkinghub.elsevier.com/retrieve/pii/S0009250904002015
Baron, C. M. C., & Zhang, J. (2020). Reliable on‐line re‐optimization control of a fed‐batch fermentation process using bootstrap aggregated extreme learning machine. In O. Gusikhin, & K. Madani (Eds.), Informatics in control, automation and robotics. ICINCO 2017. Lecture Notes in Electrical Engineering (Vol. 495, pp. 272–294). Springer. https://doi.org/10.1007/978-3-030-11292-9_14
Bayer, B., Dalmau Diaz, R., Melcher, M., Striedner, G., & Duerkop, M. (2021a). Digital twin application for model‐based DoE to rapidly identify ideal process conditions for space‐time yield optimization. Processes, 9, 1109. https://www.mdpi.com/2227-9717/9/7/1109
Bayer, B., Duerkop, M., Pörtner, R., & Möller, J. (2023). Comparison of mechanistic and hybrid modeling approaches for characterization of a CHO cultivation process: Requirements, pitfalls and solution paths. Biotechnology Journal, 18, 2200381. https://doi.org/10.1002/biot.202200381
Bayer, B., Duerkop, M., Striedner, G., & Sissolak, B. (2021b). Model transferability and reduced experimental burden in cell culture process development facilitated by hybrid modeling and intensified design of experiments. Frontiers in Bioengineering and Biotechnology, 9, 740215. https://doi.org/10.3389/fbioe.2021.740215/full
Bayer, B., Stosch, M., Striedner, G., & Duerkop, M. (2020a). Comparison of modeling methods for DoE‐based holistic upstream process characterization. Biotechnology Journal, 15, 1900551. https://doi.org/10.1002/biot.201900551
Bayer, B., Striedner, G., & Duerkop, M. (2020b). Hybrid modeling and intensified DoE: An approach to accelerate upstream process characterization. Biotechnology Journal, 15, 2000121. https://doi.org/10.1002/biot.202000121
Bhadriraju, B., Bangi, M. S. F., Narasingam, A., & Kwon, J. S.‐I. (2020). Operable adaptive sparse identification of systems: Application to
chemical processes. AIChE Journal, 66(11), e16980. https://doi.org/10.1002/aic.16980
Bhola, V., Swalaha, F. M., Nasr, M., & Bux, F. (2017). Fuzzy intelligence for investigating the correlation between growth performance and metabolic yields of a Chlorella sp. exposed to various flue gas schemes. Bioresource Technology, 243, 1078–1086. https://linkinghub.elsevier.com/retrieve/pii/S0960852417311252
Bhutani, N., Rangaiah, G. P., & Ray, A. K. (2006). First‐principles, data‐based, and hybrid modeling and optimization of an industrial hydrocracking unit. Industrial & Engineering Chemistry Research, 45, 7807–7816. https://doi.org/10.1021/ie060247q
Botton, A., Barberi, G., & Facco, P. (2022). Data augmentation to support biopharmaceutical process development through digital models—A proof of concept. Processes, 10, 1796. https://www.mdpi.com/2227-9717/10/9/1796
Brunner, V., Siegl, M., Geier, D., & Becker, T. (2020). Biomass soft sensor for a Pichia pastoris fed‐batch process based on phase detection and hybrid modeling. Biotechnology and Bioengineering, 117, 2749–2759. https://doi.org/10.1002/bit.27454
Cabaneros Lopez, P., Udugama, I. A., Thomsen, S. T., Roslander, C., Junicke, H., Iglesias, M. M., & Gernaey, K. V. (2021). Transforming data to information: A parallel hybrid model for real‐time state estimation in lignocellulosic ethanol fermentation. Biotechnology and Bioengineering, 118, 579–591. https://doi.org/10.1002/bit.27586
Caramihai, M., & Severi, I. (2013). Bioprocess modeling and control. In M. D. Matovic, Biomass now ‐ Sustainable growth use. InTech Open. http://www.intechopen.com/books/biomass-now-sustainable-growth-and-use/bioprocess-modeling-and-control
Cervantes, J., Garcia‐Lamont, F., Rodríguez‐Mazahua, L., & Lopez, A. (2020). A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing, 408, 189–215. https://linkinghub.elsevier.com/retrieve/pii/S0925231220307153
Chen, T. Q., Rubanova, Y., Bettencourt, J., & Duvenaud, D. K. (2018). Neural ordinary differential equations. In: 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada (pp. 6571–6583).
Cheng, Y., Bi, X., Xu, Y., Liu, Y., Li, J., Du, G., Lv, X., & Liu, L. (2023). Artificial intelligence technologies in bioprocess: Opportunities and challenges. Bioresource Technology, 369, 128451. https://linkinghub.elsevier.com/retrieve/pii/S0960852422017849
Chu, Y., & Hahn, J. (2007). Parameter set selection for estimation of nonlinear dynamic systems. AIChE Journal, 53, 2858–2870. https://doi.org/10.1002/aic.11295
Cruz‐Bournazou, M. N., Narayanan, H., Fagnani, A., & Butté, A. (2022). Hybrid Gaussian process models for continuous time series in bolus fed‐batch cultures. IFAC‐PapersOnLine, 55, 204–209. https://linkinghub.elsevier.com/retrieve/pii/S2405896322008461
Cubillos, F., Callejas, H., Lima, E. L., & Vega, M. P. (2001). Adaptive control using a hybrid‐neural model: Application to a polymerisation reactor. Brazilian Journal of Chemical Engineering, 18, 113–120. http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0104-66322001000100010&lng=en&tlng=en
Dong, Y., Fan, Q., Yan, X., Guo, M., & Lu, F. (2014). Development of a hybrid model for sodium gluconate fermentation by Aspergillus niger. Journal of Chemical Technology & Biotechnology, 89, 1875–1882. https://doi.org/10.1002/jctb.4270
Duan, Z., Wilms, T., Neubauer, P., Kravaris, C., & Cruz Bournazou, M. N. (2020). Model reduction of aerobic bioprocess models for efficient simulation. Chemical Engineering Science, 217, 115512. https://linkinghub.elsevier.com/retrieve/pii/S0009250920300440
Duran‐Villalobos, C. A., Goldrick, S., & Lennox, B. (2020). Multivariate statistical process control of an industrial‐scale fed‐batch simulator. Computers & Chemical Engineering, 132, 106620. https://linkinghub.elsevier.com/retrieve/pii/S0098135419304375
Fisher, O. J., Watson, N. J., Escrig, J. E., Witt, R., Porcu, L., Bacon, D., Rigley, M., & Gomes, R. L. (2020). Considerations, challenges and opportunities when developing data‐driven models for process manufacturing systems. Computers & Chemical Engineering, 140, 106881. https://linkinghub.elsevier.com/retrieve/pii/S0098135419308373
Goldrick, S., Lee, K., Spencer, C., Holmes, W., Kuiper, M., Turner, R., & Farid, S. S. (2018). On‐line control of glucose concentration in high‐yielding mammalian cell cultures enabled through oxygen transfer rate measurements. Biotechnology Journal, 13, 1700607. https://doi.org/10.1002/biot.201700607
Härdle, W., Werwatz, A., Müller, M., & Sperlich, S. (2004). Nonparametric and semiparametric models. Springer series in statistics. Springer. https://doi.org/10.1007/978-3-642-17146-8
Jenzsch, M., Gnoth, S., Beck, M., Kleinschmidt, M., Simutis, R., & Lübbert, A. (2006). Open‐loop control of the biomass concentration within the growth phase of recombinant protein production processes. Journal of Biotechnology, 127, 84–94. https://linkinghub.elsevier.com/retrieve/pii/S0168165606004780
Jeon, B.‐K., & Kim, E.‐J. (2021). LSTM‐based model predictive control for optimal temperature set‐point planning. Sustainability, 13, 894. https://www.mdpi.com/2071-1050/13/2/894
Karimi Alavijeh, M., Baker, I., Lee, Y. Y., & Gras, S. L. (2022). Digitally enabled approaches for the scale up of mammalian cell bioreactors. Digital Chemical Engineering, 4, 100040. https://linkinghub.elsevier.com/retrieve/pii/S277250812200031X
Karniadakis, G. E., Kevrekidis, I. G., Lu, L., Perdikaris, P., Wang, S., & Yang, L. (2021). Physics‐informed machine learning. Nature Reviews Physics, 3, 422–440. https://www.nature.com/articles/s42254-021-00314-5
Kotidis, P., & Kontoravdi, C. (2020). Harnessing the potential of artificial neural networks for predicting protein glycosylation. Metabolic Engineering Communications, 10, e00131. https://linkinghub.elsevier.com/retrieve/pii/S2214030120300067
Kresnowati, M. T. A. P., Suarez‐Mendez, C. M., van Winden, W. A., van Gulik, W. M., & Heijnen, J. J. (2008). Quantitative physiological study of the fast dynamics in the intracellular pH of Saccharomyces cerevisiae in response to glucose and ethanol pulses. Metabolic Engineering, 10, 39–54. https://linkinghub.elsevier.com/retrieve/pii/S1096717607000651
Kumar Saini, D., Yadav, D., Pabbi, S., Chhabra, D., & Shukla, P. (2020). Phycobiliproteins from Anabaena variabilis CCC421 and its production enhancement strategies using combinatory evolutionary algorithm approach. Bioresource Technology, 309, 123347. https://linkinghub.elsevier.com/retrieve/pii/S0960852420306192
Lagergren, J. H., Nardini, J. T., Baker, R. E., Simpson, M. J., & Flores, K. B. (2020). Biologically‐informed neural networks guide mechanistic modeling from sparse experimental data. PLoS Computational Biology, 16, e1008462. https://doi.org/10.1371/journal.pcbi.1008462
Lancashire, L. J., Lemetre, C., & Ball, G. R. (2008). An introduction to artificial neural networks in bioinformatics: Application to complex microarray and mass spectrometry datasets in cancer studies. Briefings in Bioinformatics, 10, 315–329. https://doi.org/10.1093/bib/bbp012012
Lee, D., Ding, Y., Jayaraman, A., & Kwon, J. (2018). Mathematical modeling and parameter estimation of intracellular signaling pathway: Application to LPS‐induced NFκB activation and TNFα production in macrophages. Processes, 6, 21. http://www.mdpi.com/2227-9717/6/3/21
Lee, D., Jayaraman, A., & Kwon, J. S. (2020b). Development of a hybrid model for a partially known intracellular signaling pathway through correction term estimation and neural network modeling. PLoS Computational Biology, 16, e1008472. https://doi.org/10.1371/journal.pcbi.1008472
Lee, D., Jayaraman, A., & Kwon, J. S.‐I. (2020a). Identification of cell‐to‐cell heterogeneity through systems engineering approaches. AIChE Journal, 66(5), e16925. https://doi.org/10.1002/aic.16925
Lee, D., Jayaraman, A., & Sang‐Il Kwon, J. (2019). Identification of a time‐varying intracellular signalling model through data clustering and parameter selection: application to NF‐κB signalling pathway induced by LPS in the presence of BFA. IET Systems Biology, 13(4), 169–179. https://doi.org/10.1049/iet-syb.2018.5079
Liu, C., Zhang, X., Nguyen, T. T., Liu, J., Wu, T., Lee, E., & Tu, X. M. (2022a). Partial least squares regression and principal component analysis: Similarity and differences between two popular variable reduction approaches. General Psychiatry, 35, e100662. https://doi.org/10.1136/gpsych-2021-100662
Liu, F., Heiner, M., & Gilbert, D. (2022b). Hybrid modelling of biological systems: current progress and future prospects. Briefings in Bioinformatics, 23, 00. https://doi.org/10.1093/bib/bbac081/6555400
Lopes Dias, J. M., Lemos, P., Serafim, L., Oehmen, A., Reis, M. A. M., & Oliveira, R. (2007). Development and implementation of a non‐parametric/metabolic model in the process optimisation of PHA production by mixed microbial cultures. Computer Aided Chemical Engineering, 24, 995–1000. https://linkinghub.elsevier.com/retrieve/pii/S1570794607801908
Lopez, P. C., Udugama, I. A., Thomsen, S. T., Roslander, C., Junicke, H., Mauricio‐Iglesias, M., & Gernaey, K. V. (2020). Towards a digital twin: A hybrid data‐driven and mechanistic digital shadow to forecast the evolution of lignocellulosic fermentation. Biofuels, Bioproducts and Biorefining, 14, 1046–1060. https://doi.org/10.1002/bbb.2108
Luna, M. F., Ochsner, A. M., Amstutz, V., von Blarer, D., Sokolov, M., Arosio, P., & Zinn, M. (2021). Modeling of continuous PHA production by a hybrid approach based on first principles and machine learning. Processes, 9, 1560. https://www.mdpi.com/2227-9717/9/9/1560
Luo, Y., Kurian, V., & Ogunnaike, B. A. (2021). Bioprocess systems analysis, modeling, estimation, and control. Current Opinion in Chemical Engineering, 33, 100705. https://linkinghub.elsevier.com/retrieve/pii/S221133982100037X
Mandenius, C.‐F., & Brundin, A. (2008). Bioprocess optimization using design‐of‐experiments methodology. Biotechnology Progress, 24, 1191–1203. https://doi.org/10.1002/btpr.67
Mehmood, T., Liland, K. H., Snipen, L., & Sæbø, S. (2012). A review of variable selection methods in partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 118, 62–69. https://linkinghub.elsevier.com/retrieve/pii/S0169743912001542
Möller, J., Hernández Rodríguez, T., Müller, J., Arndt, L., Kuchemüller, K. B., Frahm, B., Eibl, R., Eibl, D., & Pörtner, R. (2020). Model uncertainty‐based evaluation of process strategies during scale‐up of biopharmaceutical processes. Computers & Chemical Engineering, 134, 106693. https://linkinghub.elsevier.com/retrieve/pii/S009813541930821X
Möller, J., Kuchemüller, K. B., Steinmetz, T., Koopmann, K. S., & Pörtner, R. (2019). Model‐assisted design of experiments as a concept for knowledge‐based bioprocess development. Bioprocess and Biosystems Engineering, 42, 867–882. https://doi.org/10.1007/s00449-019-02089-7
Mondal, P. P., Galodha, A., Verma, V. K., Singh, V., Show, P. L., Awasthi, M. K., Lall, B., Anees, S., Pollmann, K., & Jain, R. (2023). Review on machine learning‐based bioprocess optimization, monitoring, and control systems. Bioresource Technology, 370, 128523. https://linkinghub.elsevier.com/retrieve/pii/S0960852422018569
Moser, A., Appl, C., Brüning, S., & Hass, V. C. (2020). Mechanistic mathematical models as a basis for digital twins. In C. Herwig, R. Pörtner, & J. Möller (Eds.), Digital twins: Advances in biochemical engineering/biotechnology (Vol. 176, pp. 133–180). Springer. https://doi.org/10.1007/10_2020_152
Mowbray, M. R., Wu, C., Rogers, A. W., Rio‐Chanona, E. A. D., & Zhang, D. (2023). A reinforcement learning‐based hybrid modeling framework for bioprocess kinetics identification. Biotechnology and Bioengineering, 120, 154–168. https://doi.org/10.1002/bit.28262
Narasingam, A., & Kwon, J. S.‐I. (2019). Koopman Lyapunov‐based model predictive control of nonlinear chemical process systems. AIChE Journal, 65(1), e16743. https://doi.org/10.1002/aic.16743
Narasingam, A., & Kwon, J. S.‐I. (2020). Closed‐loop stabilization of nonlinear systems using Koopman Lyapunov‐based model predictive control. In: 2020 59th IEEE Conference on Decision and Control (CDC) Jeju Island, Republic of Korea, December 14‐18 (pp. 704–709). https://ieeexplore.ieee.org/document/9304259/
Narayanan, H., Behle, L., Luna, M. F., Sokolov, M., Guillén‐Gosálbez, G., Morbidelli, M., & Butté, A. (2020a). Hybrid‐EKF: Hybrid model coupled with extended Kalman filter for real‐time monitoring and control of mammalian cell culture. Biotechnology and Bioengineering, 117, 2703–2714. https://doi.org/10.1002/bit.27437
Narayanan, H., Luna, M. F., Stosch, M., Cruz Bournazou, M. N., Polotti, G., Morbidelli, M., Butté, A., & Sokolov, M. (2020b). Bioprocessing in the digital age: The role of process models. Biotechnology Journal, 15, 1900172. https://doi.org/10.1002/biot.201900172
Narayanan, H., Sokolov, M., Morbidelli, M., & Butté, A. (2019). A new generation of predictive models: The added value of hybrid models for manufacturing processes of therapeutic proteins. Biotechnology and Bioengineering, 116, 2540–2549. https://doi.org/10.1002/bit.27097
Narayanan, H., von Stosch, M., Feidl, F., Sokolov, M., Morbidelli, M., & Butté, A. (2023). Hybrid modeling for biopharmaceutical processes: Advantages, opportunities, and implementation. Frontiers in Chemical Engineering, 5, 1157889. https://doi.org/10.3389/fceng.2023.1157889/full
Nayak, J., Pal, M., & Pal, P. (2015). Modeling and simulation of direct production of acetic acid from cheese whey in a multi‐stage membrane‐integrated bioreactor. Biochemical Engineering Journal, 93, 179–195. https://linkinghub.elsevier.com/retrieve/pii/S1369703X14002861
Noll, P., & Henkel, M. (2020). History and evolution of modeling in biotechnology: Modeling & simulation, application and hardware performance. Computational and Structural Biotechnology Journal, 18, 3309–3323. https://linkinghub.elsevier.com/retrieve/pii/S2001037020304402
O'Brien, C. M., Zhang, Q., Daoutidis, P., & Hu, W.‐S. (2021). A hybrid mechanistic‐empirical model for in silico mammalian cell bioprocess simulation. Metabolic Engineering, 66, 31–40. https://linkinghub.elsevier.com/retrieve/pii/S1096717621000525
Oliveira, R. (2004). Combining first principles modelling and artificial neural networks: A general framework. Computers & Chemical Engineering, 28, 755–766. https://linkinghub.elsevier.com/retrieve/pii/S0098135404000432
Patnaik, P. R. (2006). Synthesizing cellular intelligence and artificial intelligence for bioprocesses. Biotechnology Advances, 24, 129–133. https://linkinghub.elsevier.com/retrieve/pii/S0734975005001011
Pinto, J., de Azevedo, C. R., Oliveira, R., & von Stosch, M. (2019). A bootstrap‐aggregated hybrid semi‐parametric modeling framework for bioprocess development. Bioprocess and Biosystems Engineering, 42, 1853–1865. https://doi.org/10.1007/s00449-019-02181-y
Pinto, J., Mestre, M., Ramos, J., Costa, R. S., Striedner, G., & Oliveira, R. (2022). A general deep hybrid model for bioreactor systems: Combining first principles with deep neural networks. Computers & Chemical Engineering, 165, 107952. https://linkinghub.elsevier.com/retrieve/pii/S0098135422002897
Raghavendra, N. S., & Deka, P. C. (2014). Support vector machine applications in the field of hydrology: A review. Applied Soft Computing, 19, 372–386. https://linkinghub.elsevier.com/retrieve/pii/S1568494614000611
Rajamanickam, V., Babel, H., Montano‐Herrera, L., Ehsani, A., Stiefel, F., Haider, S., Presser, B., & Knapp, B. (2021). About model validation in bioprocessing. Processes, 9, 961. https://www.mdpi.com/2227-9717/9/6/961
Rathinavelu, S., Pavan, S. S., & Sivaprakasam, S. (2023). Hybrid model‐based framework for soft sensing and forecasting key process variables in the production of hyaluronic acid by Streptococcus zooepidemicus. Biotechnology and Bioprocess Engineering, 28, 203–214.
Rathore, A. S., Mishra, S., Nikita, S., & Priyanka, P. (2021). Bioprocess control: Current progress and future perspectives. Life, 11, 557. https://www.mdpi.com/2075-1729/11/6/557
del Rio‐Chanona, E. A., Ahmed, N. R., Zhang, D., Lu, Y., & Jing, K. (2017b). Kinetic modeling and process analysis for Desmodesmus sp. lutein photo‐production. AIChE Journal, 63, 2546–2554. https://doi.org/10.1002/aic.15667
del Rio‐Chanona, E. A., Fiorelli, F., Zhang, D., Ahmed, N. R., Jing, K., & Shah, N. (2017a). An efficient model construction strategy to simulate microalgal lutein photo‐production dynamic process. Biotechnology and Bioengineering, 114, 2518–2527. https://doi.org/10.1002/bit.26373
Rogers, A. W., Song, Z., Ramon, F. V., Jing, K., & Zhang, D. (2023). Investigating ‘greyness’ of hybrid model for bioprocess predictive modelling. Biochemical Engineering Journal, 190, 108761. https://linkinghub.elsevier.com/retrieve/pii/S1369703X22004302
Sansana, J., Joswiak, M. N., Castillo, I., Wang, Z., Rendall, R., Chiang, L. H., & Reis, M. S. (2021). Recent trends on hybrid modeling for Industry 4.0. Computers & Chemical Engineering, 151, 107365. https://linkinghub.elsevier.com/retrieve/pii/S0098135421001435
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85–117. https://linkinghub.elsevier.com/retrieve/pii/S0893608014002135
Schwedersky, B. B., Flesch, R. C. C., & Dangui, H. A. S. (2019). Practical nonlinear model predictive control algorithm for long short‐term memory networks. IFAC‐PapersOnLine, 52, 468–473. https://linkinghub.elsevier.com/retrieve/pii/S2405896319301922
Shah, P., Sheriff, M. Z., Bangi, M. S. F., Kravaris, C., Kwon, J. S., Botre, C., & Hirota, J. (2023). Multi‐rate observer design and optimal control to maximize productivity of an industry‐scale fermentation process. AIChE Journal, 69(2), e17946. https://doi.org/10.1002/aic.17946
Shah, P., Sheriff, M. Z., Bangi, M. S. F., Kravaris, C., Kwon, J. S.‐I., Botre, C., & Hirota, J. (2022). Deep neural network‐based hybrid modeling and experimental validation for an industry‐scale fermentation process: Identification of time‐varying dependencies among parameters. Chemical Engineering Journal, 441, 135643. https://linkinghub.elsevier.com/retrieve/pii/S1385894722011433
Simutis, R., & Lübbert, A. (2017). Hybrid approach to state estimation for bioprocess control. Bioengineering, 4, 21. http://www.mdpi.com/2306-5354/4/1/21
Sitapure, N., & Kwon, J. S. (2023a). CrystalGPT: Enhancing system‐to‐system transferability in crystallization prediction and control using time‐series‐transformers. Computers & Chemical Engineering. Advance online publication. http://arxiv.org/abs/2306.03099
Sitapure, N., & Kwon, J. S.‐I. (2023b). Exploring the potential of time‐series transformers for process modeling and control in chemical systems: An inevitable paradigm shift? Chemical Engineering Research and Design, 194, 461–477. https://linkinghub.elsevier.com/retrieve/pii/S026387622300240X
Sokolov, M., Ritscher, J., MacKinnon, N., Souquet, J., Broly, H., Morbidelli, M., & Butté, A. (2017). Enhanced process understanding and multivariate prediction of the relationship between cell culture process and monoclonal antibody quality. Biotechnology Progress, 33, 1368–1380. https://doi.org/10.1002/btpr.2502
Sokolov, M., von Stosch, M., Narayanan, H., Feidl, F., & Butté, A. (2021). Hybrid modeling – A key enabler towards realizing digital twins in biopharma? Current Opinion in Chemical Engineering, 34, 100715. https://linkinghub.elsevier.com/retrieve/pii/S2211339821000472
Solle, D., Hitzmann, B., Herwig, C., Pereira Remelhe, M., Ulonska, S., Wuerth, L., Prata, A., & Steckenreiter, T. (2017). Between the poles of data‐driven and mechanistic modeling for process operation. Chemie Ingenieur Technik, 89, 542–561. https://doi.org/10.1002/cite.201600175
Sommeregger, W., Sissolak, B., Kandra, K., von Stosch, M., Mayer, M., & Striedner, G. (2017). Quality by control: Towards model predictive control of mammalian cell culture bioprocesses. Biotechnology Journal, 12, 1600546. https://doi.org/10.1002/biot.201600546
von Stosch, M., Hamelink, J.‐M., & Oliveira, R. (2016a). Hybrid modeling as a QbD/PAT tool in process development: An industrial E. coli case study. Bioprocess and Biosystems Engineering, 39, 773–784. https://doi.org/10.1007/s00449-016-1557-1
von Stosch, M., Hamelink, J.‐M., & Oliveira, R. (2016b). Toward intensifying design of experiments in upstream bioprocess development: An industrial Escherichia coli feasibility study. Biotechnology Progress, 32, 1343–1352. https://doi.org/10.1002/btpr.2295
von Stosch, M., Oliveira, R., Peres, J., & Feyo de Azevedo, S. (2014). Hybrid semi‐parametric modeling in process systems engineering: Past, present and future. Computers & Chemical Engineering, 60, 86–101. https://linkinghub.elsevier.com/retrieve/pii/S0098135413002639
von Stosch, M., Oliveira, R., Peres, J., & de Azevedo, S. F. (2012). Hybrid modeling framework for process analytical technology: Application to Bordetella pertussis cultures. Biotechnology Progress, 28, 284–291. https://doi.org/10.1002/btpr.706
von Stosch, M., Portela, R. M. C., & Oliveira, R. (2018). Hybrid model structures for knowledge integration. In J. Glassey & M. von Stosch (Eds.), Hybrid Modeling in Process Industries (1st ed., pp. 13–35). CRC, Taylor & Francis.
von Stosch, M., & Willis, M. J. (2017). Intensified design of experiments for upstream bioreactors. Engineering in Life Sciences, 17, 1173–1184. https://doi.org/10.1002/elsc.201600037
Tholudur, A., Ramirez, W. F., & McMillan, J. D. (1999). Mathematical modeling and optimization of cellulase protein production using Trichoderma reesei RL‐P37. Biotechnology and Bioengineering, 66, 1–16. https://doi.org/10.1002/(SICI)1097-0290(1999)66:1%3C1::AID-BIT1%3E3.0.CO;2-K
Tsipa, A., Koutinas, M., Usaku, C., & Mantalaris, A. (2018). Optimal bioprocess design through a gene regulatory network – growth kinetic hybrid model: Towards replacing Monod kinetics. Metabolic Engineering, 48, 129–137. https://linkinghub.elsevier.com/retrieve/pii/S1096717617304718
Tsopanoglou, A., & Jiménez del Val, I. (2021). Moving towards an era of hybrid modelling: Advantages and challenges of coupling mechanistic and data‐driven models for upstream pharmaceutical bioprocesses. Current Opinion in Chemical Engineering, 32, 100691. https://linkinghub.elsevier.com/retrieve/pii/S221133982100023X
Vega‐Ramon, F., Zhu, X., Savage, T. R., Petsagkourakis, P., Jing, K., & Zhang, D. (2021). Kinetic and hybrid modeling for yeast astaxanthin production under uncertainty. Biotechnology and Bioengineering, 118, 4854–4866. https://doi.org/10.1002/bit.27950
Walsh, I., Myint, M., Nguyen‐Khuong, T., Ho, Y. S., Ng, S. K., & Lakshmanan, M. (2022). Harnessing the potential of machine learning for advancing “quality by design” in biomanufacturing. mAbs, 14(1), e2013593. https://doi.org/10.1080/19420862.2021.2013593
Wang, X., Chen, J., Liu, C., & Pan, F. (2010). Hybrid modeling of penicillin fermentation process based on least square support vector machine.
Chemical Engineering Research and Design, 88, 415–420. https://linkinghub.elsevier.com/retrieve/pii/S0263876209002160
Wang, Z., Sheikh, H., Lee, K., & Georgakis, C. (2018). Sequential parameter estimation for mammalian cell model based on in silico design of experiments. Processes, 6, 100. http://www.mdpi.com/2227-9717/6/8/100
Yang, A., Martin, E., & Morris, J. (2011). Identification of semi‐parametric hybrid process models. Computers & Chemical Engineering, 35, 63–70. https://linkinghub.elsevier.com/retrieve/pii/S0098135410001626
Yang, Q., Gao, H., Zhang, W., & Li, H. (2016). Simultaneous hybrid modeling of a nosiheptide fermentation process using particle swarm optimization. Chinese Journal of Chemical Engineering, 24, 1631–1639. https://linkinghub.elsevier.com/retrieve/pii/S1004954116304128
Zhang, D., Del Rio‐Chanona, E. A., Petsagkourakis, P., & Wagner, J. (2019). Hybrid physics‐based and data‐driven modeling for bioprocess online simulation and optimization. Biotechnology and Bioengineering, 116, 2919–2930. https://doi.org/10.1002/bit.27120
Zhang, D., Savage, T. R., & Cho, B. A. (2020). Combining model structure identification and hybrid modelling for photo‐production process predictive simulation and optimisation. Biotechnology and Bioengineering, 117, 3356–3367. https://doi.org/10.1002/bit.27512
Zhang, D., Xiao, N., Mahbubani, K. T., del Rio‐Chanona, E. A., Slater, N. K. H., & Vassiliadis, V. S. (2015). Bioprocess modelling of biohydrogen production by Rhodopseudomonas palustris: Model development and effects of operating conditions on hydrogen yield and glycerol conversion efficiency. Chemical Engineering Science, 130, 68–78. https://linkinghub.elsevier.com/retrieve/pii/S0009250915001815
Zhang, L., & Mao, S. (2017). Application of quality by design in the current drug development. Asian Journal of Pharmaceutical Sciences, 12, 1–8. https://linkinghub.elsevier.com/retrieve/pii/S1818087616300575
Zhao, M., Zhao, S., & Liu, F. (2023). Semi‐supervised hybrid modeling of the yeast fermentation process. Machines, 11, 63. https://www.mdpi.com/2075-1702/11/1/63
Zorzetto, L. F. M., Filho, R. M., & Wolf‐Maciel, M. R. (2000). Processing modelling development through artificial neural networks and hybrid models. Computers & Chemical Engineering, 24, 1355–1360. https://linkinghub.elsevier.com/retrieve/pii/S0098135400004191
Zuo, K., & Wu, W. T. (2000). Semi‐realtime optimization and control of a fed‐batch fermentation system. Computers & Chemical Engineering, 24, 1105–1109. https://linkinghub.elsevier.com/retrieve/pii/S0098135400004907

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: Mahanty, B. (2023). Hybrid modelling in bioprocess dynamics: Structural variabilities, implementation strategies, and practical challenges. Biotechnology and Bioengineering, 1–20. https://doi.org/10.1002/bit.28503
