Keywords: Deep Neural Networks; Differential Evolution; Neural control

In this article, a double strategy is proposed to find the optimal gains of a cascaded PI controller to minimize the trajectory position error in a five-bar parallel robot. The first strategy employs Differential Evolution to tune constant gains during the execution time of the desired trajectory. Once Differential Evolution achieves convergence on the solution by finding the vector of optimal gains that minimize the position tracking error, all the position error data and current of the two brushless motors are saved. In the second strategy, the data generated in the first strategy is used to train a Deep Neural Network. After that, the trained Deep Neural Network replaces the constant gains of the first strategy with time-varying gains for the desired trajectory. Three working scenarios are proposed to test the generalization of the Deep Neural Network. In the first scenario, a training trajectory is executed. In the second one, a testing trajectory of the Deep Neural Network is evaluated. In the third one, a mass change is generated in the middle of the cycle. The results show that the Deep Neural Network is robust to different trajectories and mass changes during the execution of pick and place tasks.
1. Introduction

Finding the optimal gains of a controller for a five-bar parallel robot is a highly iterative process, since many variables must be tuned and often meet conflicting specifications such as energy efficiency and high accuracy (Rodríguez-Molina et al., 2020). In addition, the nonlinear dynamic behavior must be solved with the kinetic constraints of the mechatronic system. An example of a kinetic constraint is the torque provided by the motor. The torque profile is unknown; it is a time variable and depends on the demanded task (Fang et al., 2016). Moreover, some drawbacks can be found when looking for optimal gains for a given controller, such as that these optimal gains are specific to an effector load, a trajectory, and a set of modeled parameters (Li et al., 2022). In the presence of security controls (Salwani et al., 2009), changes in working conditions, tasks, parametric uncertainties, and disturbances in the system, the optimal tuning results in a sub-optimal one (Kumar & Kumar, 2017). This disadvantage can be addressed with classical robust control theory, such as robust LQR (Liu et al., 2012), the Fractional-Order PID (FOPID) controller (Goyal et al., 2019; Kler et al., 2018), robust FOPID (Hajiloo et al., 2012; Sánchez et al., 2017; Zhang & Liu, 2018), the sliding mode controller (Ye et al., 2021; Zhang et al., 2023), and the H-infinity controller (Ashok Kumar & Kanthalakshmi, 2018; Rigatos et al., 2017; Souza & Souza, 2019), among others.

On the other hand, the robustness problem in the face of parametric uncertainties and disturbances can also be addressed from an intelligent control perspective. Some examples of intelligent control methods are listed in the following four subcategories.

• Adaptive meta-heuristic algorithms. Particle Swarm Optimization (PSO) (Bingül & Karahan, 2011). Differential Evolution Based Control Adaptation (DEBAC) (Villarreal-Cervantes et al., 2018). Non-Dominated Sorting Genetic Algorithm (NSGA)-II (Zhou & Zhang, 2019). Ant Lion Optimization (ALO) (Pradhan et al., 2020). A comprehensive review of the state of the art on the classification of metaheuristic algorithms to tune PID parameters can be found in Joseph et al. (2022).
The code (and data) in this article has been certified as Reproducible by Code Ocean: (https://codeocean.com/). More information on the Reproducibility
Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.
∗ Corresponding author.
E-mail addresses: daniel.blanck@exatec.tec.mx (D. Blanck-Kahan), gerardo.ortiz.cervantes@exatec.tec.mx (G. Ortiz-Cervantes), valentin_mgm@exatec.tec.mx
(V. Martínez-Gama), hector_cervantes@tec.mx (H. Cervantes-Culebro), jchong@tec.mx (J.E. Chong-Quero), cacruz@cinvestav.mx (C.A. Cruz-Villar).
https://doi.org/10.1016/j.eswa.2023.121184
Received 24 May 2022; Received in revised form 9 August 2023; Accepted 10 August 2023
Available online 1 September 2023
0957-4174/© 2023 Elsevier Ltd. All rights reserved.
D. Blanck-Kahan et al. Expert Systems With Applications 236 (2024) 121184
3.1. Experimental setup

A five-bar parallel robot has been built, as shown in Fig. 1. The link lengths of the parallel robot are set as suggested by Campos et al. (2010). This design offers a large workspace for trajectory design without falling into robot link singularities or link collisions.

Two BLDC motors, ODrive model D5065 270KV, are used. The maximum nominal speed of the motor is 8640 rpm, and its torque is 1.99 Nm. Angular positions 𝜃1 and 𝜃2 are measured with CUI AMT102-V encoders of 8192 pulses per revolution. Currents are measured with the ODrive V3.6 board.

The range of the variables for the design vector is set to 𝐊𝐏𝐦 ∈ [0, 120]; 𝐊𝐕𝐦 ∈ [0, 0.25]; 𝐊𝐈𝐦 ∈ [0, 0.5]. The range of gains is obtained experimentally to ensure the system's stability. Likewise, with the range of proposed gains, the system's structural integrity is sought; for instance, it is observed that raising the upper limit of the variable 𝐾𝑉𝑚 induces vibrations in the links.

The proposed parameters for the DE algorithm are a population of 𝑁𝑃 = 8 individuals, 𝑁𝑀 = 5 mutant individuals, 𝑁𝑆 = 3 survivors, 𝐸 = 1 elite individual, a mutation factor 𝑚𝑟 = 0.15, and five iterations (𝑀𝐴𝑋𝐺𝐸𝑁).

The experimental platform is developed in Python. The TensorFlow and Keras libraries are used to design, train, validate, and test the DNN. The DNN consists of an input layer of 40 neurons, four hidden layers of 200, 500, 200, and 100 neurons, respectively, and an output layer with six neurons (Fig. 4). These outputs represent the design vector for the robot controller. The activation function used for all neurons except the output neurons is ReLU, due to its simplicity of implementation and low computational cost. From Fig. 4, a sequence of 𝑄 = 10 consecutive position errors and currents is taken; the microcontroller sampling time is 5 ms.

In this work, the controller tuning of optimal gains is achieved using a DNN substituting the DE algorithm. For all experiments, the optimization problem minimizes the tracking position error defined in Eq. (8).

The DNN is trained offline with data from six different trajectories (Figs. 10(a)–10(f)). The training is then validated with three unknown trajectories, as illustrated in Figs. 11(a)–11(c). After that, the DNN is tested with three unknown trajectories, as illustrated in Figs. 11(d)–11(f).

The trajectories used to train and validate the DNN have irregular geometries, for example, names of people (Fig. 10(f)) or curved lines followed by straight lines (Fig. 11(e)), to provide a diversity that allows better interpolation for any unknown trajectory. 70% of the data generated by running the entire DE algorithm is used to train the DNN; the remaining 30% is employed for validation purposes.

Fig. 5 shows, with a blue line, the accuracy of the DNN in each iteration, where a smooth curve with an improving trend is observed during training. The orange line shows the evaluation of the DNN on an unknown example, which is why oscillations are observed in each epoch. The model shows 95.8% accuracy on training data and 94.7% on testing data.

After validating the offline DNN training, three tuning techniques are compared: random gains, the DE algorithm, and the online DNN. Each of these techniques provides a possible solution for several industrial scenarios. The first technique represents an individual who has yet to learn about the system and sets a group of control gains. The second technique represents a meta-heuristic approach in which few assumptions about the system are made and a global search is conducted for a specific working scenario. The third technique applies a DNN as the search engine and uses online performance feedback to adapt to several working scenarios.
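The DE parameters of Section 3.1 (𝑁𝑃 = 8, 𝑁𝑀 = 5, 𝑁𝑆 = 3, 𝐸 = 1, 𝑚𝑟 = 0.15, five generations) describe a survivor/mutant variant rather than classic DE/rand/1/bin. A minimal sketch of such a loop is given below; it is not the authors' implementation. The `objective` function is a stand-in for the experimental tracking error of Eq. (8), which on the real platform can only be measured by running the trajectory, and the recombination details are assumptions.

```python
import random

# Gain ranges from Section 3.1, one (K_P, K_V, K_I) triple per motor.
BOUNDS = [(0, 120), (0, 0.25), (0, 0.5)] * 2

NP, NM, NS, E = 8, 5, 3, 1   # population, mutants, survivors, elite
MR, MAXGEN = 0.15, 5         # mutation factor and number of generations

def objective(x):
    # Stand-in for Eq. (8): on the real platform this value is the tracking
    # error measured by executing the trajectory with the gain vector x.
    target = [60.0, 0.12, 0.25, 80.0, 0.12, 0.25]  # hypothetical optimum
    return sum(((a - b) / (hi - lo)) ** 2
               for a, b, (lo, hi) in zip(x, target, BOUNDS))

def clip(x):
    # Keep every gain inside the experimentally safe range.
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, BOUNDS)]

def evolve(seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(NP)]
    for _ in range(MAXGEN):
        pop.sort(key=objective)      # best first; the top E pass unchanged
        survivors = pop[:NS]         # the NS best individuals are kept
        mutants = []
        for _ in range(NM):
            base = rng.choice(survivors)
            a, b = rng.sample(pop, 2)  # difference vector from the population
            mutants.append(clip([x + MR * (y - z)
                                 for x, y, z in zip(base, a, b)]))
        pop = survivors + mutants    # NS + NM = NP individuals
    return min(pop, key=objective)

best = evolve()
```

On the physical system, `objective` would upload the candidate gains to the ODrive controllers, run the trajectory, and return the accumulated position error; the rest of the loop is plain bookkeeping.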
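For scale, the DNN architecture described in Section 3.1 (40 inputs, hidden layers of 200, 500, 200, and 100 neurons, six outputs) implies the following number of trainable parameters, assuming plain dense layers with biases (the paper only states the neuron counts, so the dense-with-bias layer type is an assumption):

```python
# Layer widths of the DNN of Section 3.1: 40 inputs, four hidden layers,
# and 6 outputs (the controller design vector).
LAYERS = [40, 200, 500, 200, 100, 6]

def parameter_count(layers):
    """Trainable parameters of a dense MLP: weights plus biases per layer."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layers, layers[1:]))

total = parameter_count(LAYERS)  # about 2.3e5 parameters
```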
This article exemplifies the scenarios by changing the desired trajectory and the load at the end effector.

Three experiments are conducted, each designed to evaluate the performance under two different working scenarios. In the first experiment, the trajectory of Fig. 10(f) is executed using the vector of optimal gains of the PID controller that the DE algorithm has converged on and with which the DNN has been trained. The second experiment compares a testing trajectory (Fig. 11(e)) for the DNN and DE algorithms. In the third experiment, DE and DNN are examined when the same unknown trajectory is proposed and the mass of the end effector is changed in the middle of the cycle time.

4.1. First experiment

Fig. 6 shows the trajectory tracking position error results for the first experiment. A black dotted line depicts the result obtained by random PID controller gains; DE and DNN are shown with a red dashed line and a continuous blue line, respectively. From this, Fig. 6 and Table 1 show that DE obtained the best result in minimizing the objective function.

Table 1
Objective function performance on the training trajectory, 10(f).

Technique        Obj. function
Random           0.00777
DE               6.6714e−04
Neural Network   0.0011

The solution vector obtained by the DE algorithm is the following: 𝑥⃗ = [𝐾𝑃1 = 65.84, 𝐾𝑉1 = 0.24, 𝐾𝐼1 = 1.67, 𝐾𝑃2 = 80.88, 𝐾𝑉2 = 0.16, 𝐾𝐼2 = 0.6]. On the other hand, a random tuning approach has an error over one order of magnitude larger than the DE and DNN. The DNN technique yields an objective function value (Eq. (8)) that is 1.64 times bigger than that of the DE technique. In addition, the most significant amplitude errors between the DE and DNN tuning techniques occur in the time span ranging from 1.8 to 2.4 s, as displayed in the zoomed box at the upper-right corner of Fig. 6. This behavior indicates that the two techniques are equivalent, because both have a similar reaction at this time.

Fig. 8. Third experiment, testing trajectory with the change of mass.

4.2. Second experiment

Fig. 7 shows the results of this experiment, in which the DNN achieves the best performance. Based on the objective function (Table 2), the DE static gains obtained the best value; however, the system reaction time at the beginning of the trajectory generates more significant error peaks than the DNN. In the zoomed box, it is evident how the DNN obtains peaks of half the magnitude compared to DE. On the other hand, after the first half of the trajectory, the random gains resulted in a critically stable system.
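The online use of the DNN can be summarized as a fixed-rate loop: at every sampling period, the last 𝑄 = 10 position errors and currents of both motors are assembled into the 40-element input window, the network maps the window to the six controller gains, and the gains are clamped to the safe ranges of Section 3.1. The sketch below assumes this structure; `dnn_predict` is only a placeholder for the trained Keras model, and both the clamping step and the concatenation order of the window are assumptions.

```python
# Gain limits from Section 3.1; clamping the DNN outputs to them is
# an assumption, not a step stated in the paper.
LIMITS = [(0, 120), (0, 0.25), (0, 0.5)] * 2
Q = 10  # window length (Fig. 4); samples arrive every 5 ms

def make_window(err1, err2, cur1, cur2):
    """Concatenate the last Q samples of each signal into a 40-element input.

    The concatenation order is illustrative; the paper only states that
    Q = 10 consecutive position errors and currents are taken.
    """
    window = err1[-Q:] + err2[-Q:] + cur1[-Q:] + cur2[-Q:]
    assert len(window) == 4 * Q
    return window

def dnn_predict(window):
    # Placeholder for the trained 40-200-500-200-100-6 network: returns
    # mid-range gains scaled by the mean absolute position error.
    scale = 0.5 + min(sum(abs(v) for v in window[:2 * Q]) / (2 * Q), 0.5)
    return [hi * scale for _, hi in LIMITS]

def schedule_gains(err1, err2, cur1, cur2):
    gains = dnn_predict(make_window(err1, err2, cur1, cur2))
    return [min(max(g, lo), hi) for g, (lo, hi) in zip(gains, LIMITS)]

# One control step with synthetic 5 ms histories:
hist = [0.02 * i for i in range(20)]
gains = schedule_gains(hist, hist, [1.0] * 20, [1.5] * 20)
```

In deployment, `schedule_gains` would run once per sampling period and write the six gains to the two cascaded position/velocity/current loops of the ODrive controllers.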
6
D. Blanck-Kahan et al. Expert Systems With Applications 236 (2024) 121184
7
D. Blanck-Kahan et al. Expert Systems With Applications 236 (2024) 121184
…once the disturbance occurs. However, the other two tuning techniques show the highest error peaks close to 1.6 s. In the case of the random technique, the most significant error is almost four times the magnitude of the other techniques.

Table 3
Objective function performance on a testing trajectory with the change of mass. Testing trajectory, 11(e), and mass changed.

Technique        Obj. function
Random           0.3469
DE               0.1262
Neural Network   1e−04

Figs. 9(a)–9(f) show the time-varying gains for the two brushless motors for the testing trajectory. Experiment 2 is represented by a blue dotted line, when there are no changes in mass at the end effector. Experiment 3 is depicted by a continuous red line, when the mass is changed in the middle of the run time of the desired trajectory. During the first half of the trajectory, from second 0 to 0.9, the gains behave under a similar change pattern. This behavior is expected since the system is under the same conditions (no mass). When the change in mass occurs, the proportional gains reach maximum peaks of 102.5 and 121.188 for each motor, respectively, as can be seen in Figs. 9(a)–9(b). Approaching the end of the trajectory, the proportional position gains set on both motors start to converge to the same values. For the gains in velocity with the change in mass, an increase in the amplitude of the gains and a decrease in the frequency of change of these gains are observed in Figs. 9(c)–9(d). In the case of the integral gains, an average decrease is observed in Figs. 9(e)–9(f) for both motors once the mass disturbance appears.

5. Conclusions

This paper addresses the gain tuning of a PID controller for two brushless motors using three techniques: random assignment, DE, and DNN. Three case studies are analyzed to observe each method's advantages and disadvantages. All measurements of position errors, velocity errors, and current are generated using the experimental prototype, thus saving the use of a mathematical model to solve the dynamics of the robot and actuators.
Likewise, friction effects in the joints and the closed kinematic chain constraints of the parallel robot are implicitly considered.

A DNN can interpolate between known data, while DE finds the global optimum for a particular scenario. When there is a change in the system (e.g., a change of mass in the end effector), the DNN can interpolate and find a robust solution to changes in the trajectory and mass at the end effector; see Fig. 8. However, DE must be iteratively re-executed if there are changes in the trajectory, in the mass, or in the conjunction of both.

CRediT authorship contribution statement

Daniel Blanck-Kahan: Software, Validation, Formal analysis, Investigation, Data curation. Gerardo Ortiz-Cervantes: Software, Validation, Formal analysis, Investigation, Data curation, Project administration. Valentín Martínez-Gama: Software, Validation, Formal analysis, Investigation. Héctor Cervantes-Culebro: Conceptualization, Methodology, Validation, Formal analysis, Resources, Writing – original draft, Writing – review & editing, Visualization, Supervision, Project administration, Funding acquisition. J. Enrique Chong-Quero: Conceptualization, Methodology, Formal analysis, Resources, Writing – review & editing, Project administration, Funding acquisition. Carlos A. Cruz-Villar: Conceptualization, Methodology, Validation, Formal analysis, Resources, Writing – review & editing, Supervision, Project administration, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

I have shared all data and codes in the appendix section.

Acknowledgments

The authors would like to acknowledge the financial support of NOVUS (Grant number N20-144), Institute for the Future of Education, Tecnologico de Monterrey, Mexico, in the production of this work.
Appendix

Figs. 10(a)–10(f) show the set of trajectories with which the DNN is trained. The training trajectories are a set of splines executed in different quadrants of the Cartesian plane. Figs. 11(a)–11(c) depict the set of validation trajectories. In Fig. 11(a), a spline in the fourth quadrant of the Cartesian plane is shown as a trajectory. Fig. 11(b) reproduces the intersection of trajectories in straight and curved lines. A geometric ellipse is generated in Fig. 11(c). Figs. 11(d)–11(f) illustrate the set of testing trajectories, which are trajectories of closed and irregular geometry.

All of the code used for the project is contained in the following publicly available GitHub repository. The code is open for free use under the MIT license.

https://github.com/valentin-martinez-gama/robo-evoML

The following are the main components of the codebase:

• Logic and control functions for the control-loop implementation using ODrive, under the Odrive_control folder.
• The evolutionary algorithm logic and its implementation on top of the ODrive controller, contained in the evo_ML.py file.
• The methodology used to generate the training data set by running multiple evolutionary iterations on multiple trajectories, in the ML_training.py file.
• ML.py and ML_data.py, which contain supporting initialization and data-processing functions.
• The Trajectories folder, which contains the trajectories to be followed as pairs of angular setpoints for the two motors. All of them are to be uniformly spaced in time.
• Datasets and Keras Model, which contain the input data used for NN training and the resulting trained model.

References

Ashok Kumar, M., & Kanthalakshmi, S. (2018). 𝐻∞ tracking control for an inverted pendulum. Journal of Vibration and Control, 24(16), 3515–3524.
Bingül, Z., & Karahan, O. (2011). A fuzzy logic controller tuned with PSO for 2 DOF robot trajectory control. Expert Systems with Applications, 38(1), 1017–1031.
Brownlee, J. (2017). Gentle introduction to the Adam optimization algorithm for deep learning. Machine Learning Mastery, 3.
Campos, L., Bourbonnais, F., Bonev, I. A., & Bigras, P. (2010). Development of a five-bar parallel robot with large workspace. In International design engineering technical conferences and computers and information in engineering conference, Vol. 44106 (pp. 917–922).
Cheng, L., Wang, Z., Jiang, F., & Li, J. (2021). Adaptive neural network control of nonlinear systems with unknown dynamics. Advances in Space Research, 67(3), 1114–1123.
Fang, J., Zhao, J., Mei, T., & Chen, J. (2016). Online optimization scheme with dual-mode controller for redundancy-resolution with torque constraints. Robotics and Computer-Integrated Manufacturing, 40, 44–54.
Ghorbel, F. H., Chételat, O., Gunawardana, R., & Longchamp, R. (2000). Modeling and set point control of closed-chain mechanisms: Theory and experiment. IEEE Transactions on Control Systems Technology, 8(5), 801–815.
Goyal, V., Mishra, P., Shukla, A., Deolia, V. K., & Varshney, A. (2019). A fractional order parallel control structure tuned with meta-heuristic optimization algorithms for enhanced robustness. Journal of Electrical Engineering, 70(1), 16–24.
Hajiloo, A., Nariman-Zadeh, N., & Moeini, A. (2012). Pareto optimal robust design of fractional-order PID controllers for systems with probabilistic uncertainties. Mechatronics, 22(6), 788–801.
Hekimoğlu, B. (2019). Optimal tuning of fractional order PID controller for DC motor speed control via chaotic atom search optimization algorithm. IEEE Access, 7, 38100–38114.
Jin, X.-Z., He, T., Wu, X.-M., Wang, H., & Chi, J. (2020). Robust adaptive neural network-based compensation control of a class of quadrotor aircrafts. Journal of the Franklin Institute, 357(17), 12241–12263.
Joseph, S. B., Dada, E. G., Abidemi, A., Oyewola, D. O., & Khammas, B. M. (2022). Metaheuristic algorithms for PID controller parameters tuning: Review, approaches and open problems. Heliyon, Article e09399.
Khan, A. H., Cao, X., Li, S., Katsikis, V. N., & Liao, L. (2020). BAS-ADAM: An ADAM based approach to improve the performance of beetle antennae search optimizer. IEEE/CAA Journal of Automatica Sinica, 7(2), 461–471.
Kiumarsi, B., Lewis, F. L., & Jiang, Z.-P. (2017). 𝐻∞ control of linear discrete-time systems: Off-policy reinforcement learning. Automatica, 78, 144–152.
Kler, D., Sharma, P., Rana, K., & Kumar, V. (2018). A BSA tuned fractional-order PID controller for enhanced MPPT in a photovoltaic system. In Fractional order systems (pp. 673–703). Elsevier.
Kumar, A., & Kumar, V. (2017). Evolving an interval type-2 fuzzy PID controller for the redundant robotic manipulator. Expert Systems with Applications, 73, 161–177.
Li, H., Song, B., Tang, X., Xie, Y., & Zhou, X. (2022). Controller optimization using data-driven constrained bat algorithm with gradient-based depth-first search strategy. ISA Transactions, 125, 212–236.
Liu, H., Lu, G., & Zhong, Y. (2012). Robust LQR attitude control of a 3-DOF laboratory helicopter for aggressive maneuvers. IEEE Transactions on Industrial Electronics, 60(10), 4627–4636.
Luo, B., Wu, H.-N., & Huang, T. (2014). Off-policy reinforcement learning for 𝐻∞ control design. IEEE Transactions on Cybernetics, 45(1), 65–76.
Pang, H., Liu, F., & Xu, Z. (2018). Variable universe fuzzy control for vehicle semi-active suspension system with MR damper combining fuzzy neural network and particle swarm optimization. Neurocomputing, 306, 130–140.
Pradhan, R., Majhi, S. K., Pradhan, J. K., & Pati, B. B. (2020). Optimal fractional order PID controller design using ant lion optimizer. Ain Shams Engineering Journal, 11(2), 281–291.
Precup, R.-E., David, R.-C., Roman, R.-C., Petriu, E. M., & Szedlak-Stinean, A.-I. (2021). Slime mould algorithm-based tuning of cost-effective fuzzy controllers for servo systems. International Journal of Computational Intelligence Systems, 14(1), 1042–1052.
Rigatos, G., Siano, P., Selisteanu, D., & Precup, R. (2017). Nonlinear optimal control of oxygen and carbon dioxide levels in blood. Intelligent Industrial Systems, 3, 61–75.
Rodríguez-Molina, A., Mezura-Montes, E., Villarreal-Cervantes, M. G., & Aldape-Pérez, M. (2020). Multi-objective meta-heuristic optimization in intelligent control: A survey on the controller tuning problem. Applied Soft Computing, 93, Article 106342.
Salwani, M. I., Norzaidi, M. D., Chong, S. C., & Lin, B. (2009). Factors determining organisational commitment on security controls in accounting-based information systems. International Journal of Services and Standards, 5(1), 51–66.
Sánchez, H. S., Padula, F., Visioli, A., & Vilanova, R. (2017). Tuning rules for robust FOPID controllers based on multi-objective optimization with FOPDT models. ISA Transactions, 66, 344–361.
Song, R., & Lewis, F. L. (2020). Robust optimal control for a class of nonlinear systems with unknown disturbances based on disturbance observer and policy iteration. Neurocomputing, 390, 185–195.
Souza, A., & Souza, L. (2019). Design of a controller for a rigid-flexible satellite using the H-infinity method considering the parametric uncertainty. Mechanical Systems and Signal Processing, 116, 641–650.
Storn, R., & Price, K. (1997). Differential evolution – A simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4), 341–359.
Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
Trinh, N. H., Vu, N. T.-T., & Nguyen, P. D. (2021). Robust optimal tracking control using disturbance observer for robotic arm systems. Journal of Control, Automation and Electrical Systems, 1–12.
Ucgun, H., Okten, I., Yuzgec, U., & Kesler, M. (2022). Test platform and graphical user interface design for vertical take-off and landing drones. Science and Technology, 25(3), 350–367.
Villarreal-Cervantes, M. G., Mezura-Montes, E., & Guzmán-Gaspar, J. Y. (2018). Differential evolution based adaptation for the direct current motor velocity control parameters. Mathematics and Computers in Simulation, 150, 122–141.
Wang, Z., Zou, L., Su, X., Luo, G., Li, R., & Huang, Y. (2021). Hybrid force/position control in workspace of robotic manipulator in uncertain environments based on adaptive fuzzy control. Robotics and Autonomous Systems, 145, Article 103870.
Ye, M., Gao, G., & Zhong, J. (2021). Finite-time stable robust sliding mode dynamic control for parallel robots. International Journal of Control, Automation and Systems, 19(9), 3026–3036.
Yilmaz, B. M., Tatlicioglu, E., Savran, A., & Alci, M. (2021). Adaptive fuzzy logic with self-tuned membership functions based repetitive learning control of robotic manipulators. Applied Soft Computing, 104, Article 107183.
Zamfirache, I. A., Precup, R.-E., Roman, R.-C., & Petriu, E. M. (2022). Policy iteration reinforcement learning-based control using a Grey Wolf Optimizer algorithm. Information Sciences, 585, 162–175.
Zhang, B., Deng, B., Gao, X., Shang, W., & Cong, S. (2023). Design and implementation of fast terminal sliding mode control with synchronization error for cable-driven parallel robots. Mechanism and Machine Theory, 182, Article 105228.
Zhang, S., & Liu, L. (2018). Normalized robust FOPID controller regulation based on small gain theorem. Complexity, 2018.
Zhou, X., & Zhang, X. (2019). Multi-objective-optimization-based control parameters auto-tuning for aerial manipulators. International Journal of Advanced Robotic Systems, 16(1), Article 1729881419828071.