
Heliyon 9 (2023) e22156

Contents lists available at ScienceDirect

Heliyon
journal homepage: www.cell.com/heliyon

A fuzzy decision-making system for video tracking with multiple objects in non-stationary conditions

Payam Safaei Fakhri a,*, Omid Asghari b, Sliva Sarspy c, Mehran Borhani Marand d, Paria Moshaver e, Mohammad Trik f
a Department of Artificial Intelligence, Software Engineering, Islamic Azad University, Central Tehran Branch, Iran
b Department of Mechanics, Power and Computer Faculty of Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
c Department of Computer Science, College of Science, Cihan University-Erbil, Erbil, Iraq
d Department of Computer Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
e Department of Mechanical Engineering, University of Kentucky, Kentucky, United States
f Department of Computer Engineering, Boukan Branch, Islamic Azad University, Boukan, Iran

A R T I C L E  I N F O

Keywords: Object tracking; KLT feature points; Fuzzy algorithm; Video processing; Decision-making system

A B S T R A C T

Computer vision remains challenged by tracking multiple objects in motion frames, despite efforts to improve surveillance, healthcare, and human-machine interaction. This paper presents a method for monitoring several moving objects in non-stationary settings for autonomous navigation. At each phase, movement information between successive frames, i.e., the new frame and the previous frame, is employed to determine the location of moving objects inside the camera’s field of view, and the background in the new frame is determined. With the help of a matching algorithm, the Kanade–Lucas–Tomasi (KLT) feature tracker is applied to each frame. To get the new frame, we access the matching feature points between two subsequent frames, calculate the movement size of the feature points and the camera movement, and subtract the previous frame’s moving objects from the current frame. Every moving object within the camera’s field of view is captured at every moment and location. The moving items are categorized and segregated using fuzzy logic based on their mass center and length-to-width ratio. Our algorithm was implemented to investigate autonomous navigation surveillance of three types of moving objects: vehicles, pedestrians, and cyclists (bicycles or motorcycles). The results indicate high accuracy and an acceptable time requirement for monitoring moving objects: the method achieves a tracking and classification accuracy of around 75 % and processes 43 frames per second, making it superior to existing approaches in terms of speed and accuracy.

* Corresponding author.
E-mail addresses: Payam.safaei91@gmail.com (P.S. Fakhri), omid.asghari@srbiau.ac.ir (O. Asghari), sliva.sarspyy@gmail.com (S. Sarspy), stu.mehranborhani@iaut.ac.ir (M.B. Marand), pmo252@uky.edu (P. Moshaver), trik.mohammad@gmail.com (M. Trik).
https://doi.org/10.1016/j.heliyon.2023.e22156
Received 14 March 2023; Received in revised form 26 October 2023; Accepted 6 November 2023
Available online 10 November 2023
2405-8440/© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Video analysis begins with tracking and detecting moving objects. Automated vehicle navigation, surveillance of smart vehicles [1], and tracking of aerial moving objects all rely on tracking moving objects. These systems are also used in military applications [2], human-computer interaction, and robotics, because they receive and process videos from the surrounding environment and analyze the behavior and events within them. It is essential to present methods for tracking moving objects with high accuracy and short processing times, raising the quality and performance of the systems mentioned above. Tracking moving objects can be
defined as tracking the trajectory of moving objects in a sequence of input frames, as described in Ref. [3]. Depending on the
application, moving objects can include motorcycles on the street, boats in the sea, pedestrians on the pavement, and automobiles on
the highway. Automated navigation systems (unmanned vehicles) should be able to detect the moving objects around them and determine where those objects are at any given time to avoid collisions. Tracking algorithms should be robust against
various challenges, such as changes in illumination, sudden changes in object direction, and the presence of various objects in the
camera’s view. Tracking still faces challenges such as changes in target shape, scale changes, occlusion, and rotation, which require
robust methods [4–6].
Locating moving objects in each frame of a video is today one of the fundamental ways to understand video content. In a tracking process, a feature should describe the target. Multiple object tracking is one of the most fundamental tasks in computer vision and remains very challenging for real-world applications due to severe occlusion and motion blur. Most existing methods solve the multiple object tracking problem by performing data association based on deep feature detections in consecutive frames, which contain only the spatial information of detected objects. As a result, data association errors occur easily, especially in heavily occluded scenes. Two problems are considered here: the first is the simultaneous tracking of several moving objects for fixed and moving cameras, and the second is the identification and classification of the moving objects under machine tracking. The weighted center of mass of the moving object and the length-to-width ratio of the segmented object are used in this study to describe the moving object and decrease the effect of edge pixels on the target. As a result, the algorithm is less sensitive to state changes, scale changes, and partial occlusions. A fuzzy logic algorithm is used to identify and classify the targets in the proposed approach. The target location is determined based on the changes and movement of moving objects in a frame compared to the previous frame. For cameras that can move forward, backward, left, and right, this method estimates the movement of the camera in the new frame by extracting distinctive points of the previous frame. Once the background for the new frame has been estimated, moving objects can be tracked in each frame by comparing the frame against that background image.
The contribution and innovation of this article include the following:

• Implementation of an algorithm to investigate autonomous navigation monitoring of three types of moving objects: a vehicle, a pedestrian, and a cyclist (bicycle or motorcycle).
• Classification and separation of moving objects using fuzzy logic based on the center of mass and their length-to-width ratio.
• Presenting an adaptive algorithm for tracking moving objects in moving frames.

The remainder of the paper is structured as follows: Section 2 describes the related studies. Section 3 introduces the proposed
method for tracking and identifying the moving objects for each frame of the image. Section 4 examines the performance of the
proposed approach. Finally, the paper is concluded in Section 5.

2. Related work

A significant feature of traffic monitoring videos is the ability to see moving vehicles and people. A fast and efficient multi-object tracking algorithm was presented in Ref. [7] using GMPFM-GMPHD classification filters and VGG16-LSTM to determine target objects with confined 3D bounding boxes, extracting multiple features from raw frames using Three-Frame Differencing Combined-Background Subtraction (TFDCBS) coupled with automatic and fast Histogram Entropy-Based Thresholding (HEBT). VGG16-LSTM classifiers and GMPFM-GMPHD filters are used to track objects. Using the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) 3D bounding box dataset, the proposed strategy is compared to three sophisticated tracking algorithms.
Using Lidar sensors, drosophila-inspired 3D moving objects were analyzed in Ref. [8]. Drosophila’s appeal lies in its method of
tracking motion based on its superficial visual neuronal circuit. These sensors are extremely sensitive to object movement due to their
ability to filter out background noise. Using neural circuits with changing connections, a moving object’s structure is constructed by
extracting point clouds from each zone of motion. A 3D object detection network is then employed to estimate the point clouds of each
proposal and build 3D bounding boxes and item classifications.
Multi-object tracking based on detection is currently done using commercially available deep learning detectors. The result is models with biased detectors and conclusions that depend on the detectors themselves. For end-to-end detection and joint association, Ref. [9] developed a Deep Motion Modeling Network (DMM-Net). In DMM-Net, object and vision classes, as well as object characteristics, are simultaneously inferred over several frames.
Due to the excellent performance of background/foreground separation techniques utilizing low-order matrices, the authors of
[10] developed a revolutionary real-time incremental model based on Total Variation and Low Rank that effectively eliminates a large
number of sounds. In this study, a dynamic background and a time-varying background are only two of the many challenges addressed.
The adaptive model also uses real-time or internet films to identify the backdrop and eliminate background noise automatically.
A correlation filter and a self-organizing map (SOM) network are used for video object tracking [11]. Objects can appear differently in different media. Unsupervised features are learned using a neural network modeled on human neuronal signal processing. Through the use of various adaptive correlation filters and memory of the target’s appearance, an accurate targeted
tracking approach has been developed. There are various updating methods for the proposed system’s filters, and they are capable of
following long-term trends at the same time. Contextual information is incorporated into the movement filter to select and label objects
more precisely. The second stage uses scale filters to forecast changes in the target’s dimensions. It is necessary to use a memory filter to


preserve the target’s appearance in long-term memory and determine whether the target was tracked successfully. If a tracking error
occurs, the incremental learning tracker can re-establish target tracking using a sliding window.
With a new approach proposed in the paper [12], multiple objects can now be tracked in soccer recordings with player interactions
and substantial occlusions. This study treats the background of a particular frame as the topographical surface. Each item has precise,
overlap-free boundary lines, allowing accurate tracking of the target participants. Using color similarity and geographic proximity, the
predicted location of the target cast is filtered across all video sequences.
Using the geometric split of each time frame, the article [13] describes a method for mapping moving objects in the target image
and locating the moving camera. Using the suggested approach, one can estimate camera location without identifying specific objects
by segmenting the target frame into sections.
A wide variety of purposes can now be served by the cloud-based sensor ecosystem. Among these zones is the wireless multi-sensor
network, which includes multi-sensors for collecting data at the selected location. WMSN’s coarse, dynamic, and energy-restricted
sensors make animal tracking more challenging than standard object-tracking algorithms. In Ref. [14], an energy-efficient WMSN
and deep learning are employed to monitor a large number of objects. Cluster heads that are more energy efficient are initially selected
using fuzzy logic. A Recurrent Neural Network Transducer (RNN-T) recursive neural network with a tumbling effect is implemented in
the second phase to build a new tracking strategy. Each sensor node uses the recommended RNN-T model, and the CHs implement the
tracking method. The tracking outcomes are then sent to the research subject’s cloud server. To assess the model’s efficiency, real-time
footage of wild animals was used.
The detection and tracking of objects are generally studied separately. Deep learning networks have greatly improved 2D object recognition. In “tracking through detection,” an object must be reliably recognized in the first frame and each subsequent frame; tracking is accomplished by comparing detection outputs, while joint identification and tracking of items within a single network remains an open problem. The TrackNet network introduced in Ref. [15] expands the R-CNN architecture to recognize a 3D tube conveying a moving item in a video clip. To determine the objectness and placement features of each potential tube, a tube proposal network (TPN) is given within TrackNet. That work aims to apply the developed framework to traffic footage analysis.
Multiple-object detection (MOD) and background removal are two of the most important functions of video surveillance systems. Despite this, noise in recorded video sequences remains one of the most significant obstacles to detection precision. De-Noising, a low-rank optimization and L1-TV regularization-based approach, can detect moving objects in noisy video. It is generally assumed that moving objects are smooth in the spatial and temporal dimensions and that film backdrops have low rank. By integrating the nuclear norm, l2-norm, l1-norm, and total variation (TV) regularization, denoising and MOD can be performed simultaneously: the nuclear norm captures the low-rank background, the total variation and l1-norm terms enforce the smoothness and sparsity of the foreground, and the l2-norm suppresses noise.
In computer vision, one of the most difficult tasks is detecting moving objects in dynamic and complex environments. Several
successful methods exist for recognizing and classifying moving objects, but obstacles such as camera shake, plant cover, undulating
water, etc., must be overcome. By removing the backdrop, Ref. [17] developed a method for identifying moving items. To overcome these inherent limitations, a new feature descriptor and model update method were developed. According to the suggested descriptor, a pixel’s color can be characterized by examining its neighbors. The descriptor can reliably identify and extract moving items thanks to its local nature, although that same locality limits how far the description can be generalized.
Asymmetric dual Siamese networks can be used to track several objects in RGB-D images, as demonstrated in Ref. [18]. This
method combines RGB images with deep image contours. By using the module, hybrid features can also hide unnecessary route
drift-related data. To determine the correctness of movement path and duration data, a route analysis module is provided and tested.
The sheer number of options reduces human error. Video tracking of many submerged and floating objects is possible thanks to
Dynamic Logic, a probabilistic ML framework application described in the study [19].
Based on the block’s movement information, the study [20] provides a unique segmentation algorithm for moving objects in
high-definition video surveillance systems. Following measurements of movement vectors, an adaptive block partitioning approach is
used to classify moving objects. The movement orientation histogram is employed to determine the direction of movement. A major
contribution of this study is the use of whole-block movement data provided by an image signal processing chip or digital signal
processor.
To address tiny targets and class imbalance, Ref. [21] proposes an online multiple-object tracking technique for UAVs that integrates a high-resolution network and data association. To identify a target, a high-definition hierarchical network is constructed that enables the model to handle multiple targets of varying sizes and extract more efficient features during online training. Prediction networks are used to uncover potential targets. Researchers can also tailor the fusion loss approach to their specific needs, combining canonical and GIoU losses for samples with class imbalance and difficulty. During tracking, detection data is fed into an improved Deep SORT MOT algorithm for efficient one-to-one matching.
It is proposed in Ref. [22] that fuzzy logic data association can be used to monitor several objects online. The method’s first phase builds a knowledge-based inference system that incorporates expert knowledge and uses fuzzy rules to enhance the efficiency of multiple-object tracking. Changes in error and movement errors are taken into account to derive a fuzzy membership degree that substitutes for the likelihood of a link between objects and measurements (or detection responses). Using fuzzy membership values, tracklets can be joined over the sparse routes created by long occlusions. Therefore, the suggested technique does not require any statistical model assumptions regarding measurement noise and object motion.
Surveillance videos may be used to detect the habits of the general population. Fuzzy logic functions are used to represent the
microscopic population level and disclose adaptive population behavior in this manner. By combining the advantages of CNN tracking
with optical flow tracking, the multiple-person tracker can recognize several people. Fuzzy logic predictions and fuzzy functions define


human comprehension of video sequences (whether genuine or created by a population simulator). Fuzzy logic specifies population
movement patterns at the microscopic level for both individuals and group members.
In recent years, machine vision researchers have debated the merits of object-tracking technologies. The intricacies of item
appearance, backdrop complexity, and object blockage have made object tracking a challenging problem to solve for many scholars.
There are a variety of sophisticated ways to achieve this goal, but each option has its drawbacks and consequences. Object tracking has
several applications, including intelligent vehicle monitoring [23,24]. These constraint-resilient approaches need to be explored
further [25,26].
A robust algorithm for object detection and tracking from natural scenes captured by real-time cameras is proposed to increase
sensitivity to real and fundamental changes [27]. This algorithm combines principal component analysis and deep learning networks
to maximize the benefits of these two approaches to achieve an intelligent detection and tracking system that works in real-time.
In [28], a new multi-object tracking model called sequence-tracker (STracker) is proposed, which combines both temporal and spatial features to perform data association. The work trains a sequence feature extraction network on offline video pedestrian re-identification, combines the obtained sequence features with the depth features of the previous frame, and then applies the Hungarian algorithm for data association.
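The Hungarian step used by trackers such as STracker can be illustrated with a toy cost matrix. This sketch is not from Ref. [28]: the costs below are invented, and for a tiny matrix the optimal one-to-one assignment can be found by exhaustive search, which returns the same optimum the Hungarian algorithm computes in polynomial time.

```python
from itertools import permutations

# Cost matrix: rows = existing tracks, columns = new detections;
# entry [i][j] is an (assumed) appearance/motion distance.
cost = [
    [0.2, 0.9, 0.7],
    [0.8, 0.1, 0.6],
    [0.9, 0.7, 0.3],
]

def best_assignment(cost):
    """Exhaustive search for the minimum-cost one-to-one assignment --
    the same optimum the Hungarian algorithm finds in O(n^3)."""
    n = len(cost)
    best, best_cols = float("inf"), None
    for cols in permutations(range(n)):
        total = sum(cost[i][cols[i]] for i in range(n))
        if total < best:
            best, best_cols = total, cols
    return best_cols, best

cols, total = best_assignment(cost)
```

In practice a library routine would replace the brute-force search; the point is only that data association reduces to a minimum-cost bipartite matching between tracks and detections.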
To make an improved trade-off between the appearance feature and the motion feature, motion and appearance integrations are
designed in the paper [29]. In this online approach, the location and motion of each object are given to the adaptive search windows,
and in the search windows, the matching is only related to the similarity of the appearance features. In the offline approach, the
trajectories generated from the online approach are modified by forming motion features as spatio-temporal constraints and using
appearance for clustering.

3. The proposed method

Several people are interested in unmanned vehicles that use machine vision and deep learning. Even though autonomous vehicles
cannot yet replace drivers, they have advanced tremendously in recent years. Using the information collected from millions of drivers,
a computer program learns how to recognize road lines, estimate road curves, detect risky areas, interpret traffic signals, and detect
moving objects.
In this study, we investigate the use of fuzzy rules to recognize and track moving items in the navigation area. The proposed method
creates adaptability through information processing. To develop an optimal strategy, it is necessary to examine the moving model and
environment model as well as classify information. Thus, it is necessary to compare their data in terms of applicability and compatibility. In this study, nonlinear processing methods are studied using a simulated model; a suitable response is generated by describing
the environment and applying fuzzy rules.
The outcomes of this method can be used as a measure of value in contexts where machine vision is controllable. An approach for
recognizing and tracking moving objects in a variety of contexts is described in this study. The first stage of tracking moving items is
accomplished by subtracting the new and previous frames and extracting features from the previous frame’s backdrop, followed by a
second stage of identifying and classifying moving objects based on fuzzy logic. A diagram of the proposed approach is demonstrated in
Fig. 1.

3.1. Problem formulation and assumptions

In the proposed method, it is assumed that moving objects have identifiable features that can be tracked in consecutive frames. The tracking algorithm assumes that pixel brightness remains effectively constant across successive frames and that objects make only small movements between frames.
Let’s say we have n consecutive video frames f_0, f_1, …, f_{n-1}. Given the t-th frame f_t, the goal of object detection is to identify each object i ∈ L in the frame and assign it a set of coordinates b_t^i = (x_t, y_t, w_t, h_t) that stand for the centroid (x_t, y_t), width (w_t), and height (h_t) of the object bounding box. To ensure that each item is followed from frame to frame, the proposed method is tasked with associating the framewise detections {b_0^i, b_1^i, …, b_{n-1}^i} of every i ∈ L with a unique object identifier k ∈ {1, 2, …, K}, where K is the total number of unique objects across all frames. The proposed method adds this task to multiple object tracking, as shown in Fig. 3.

Fig. 1. Flowchart of the moving object detection, tracking, and classification.

Given f_{t-p}, f_{t-p+1}, …, f_t with associated object detections {b_{t-p}^i, b_{t-p+1}^i, …, b_t^i} and tracks, we define MOF as the joint problem of predicting the future bounding boxes {b_{t+1}^i, b_{t+2}^i, …, b_{t+q}^i} and the associated object tracks of the upcoming f_{t+1}, f_{t+2}, …, f_{t+q} video frames for each object present in frame f_t, where p is the number of past frames used as input and q is the number of future frames to be predicted. In this work, p = 30 and q = 60, i.e., 1 s in the past and 2 s in the future at 30 Hz.
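As a rough sketch of this formulation (not the authors' implementation), detections b_t^i can be stored as centroid-width-height records and linked to identifiers k by a simple nearest-centroid association. The `Detection` class, the distance threshold, and the greedy matching below are all illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical encoding of the paper's notation: b_t^i is the bounding box
# of object i in frame t, stored as centroid (x, y), width w, and height h.
@dataclass
class Detection:
    x: float
    y: float
    w: float
    h: float

def associate(tracks, detections, max_dist=50.0):
    """Greedily match new detections to existing tracks by centroid
    distance; unmatched detections receive fresh identifiers k."""
    next_id = max(tracks, default=0) + 1
    assigned, used = {}, set()
    for det in detections:
        best_id, best_d = None, max_dist
        for k, prev in tracks.items():
            if k in used:
                continue
            d = ((det.x - prev.x) ** 2 + (det.y - prev.y) ** 2) ** 0.5
            if d < best_d:
                best_id, best_d = k, d
        if best_id is None:
            best_id = next_id
            next_id += 1
        assigned[best_id] = det
        used.add(best_id)
    return assigned

tracks = {1: Detection(10, 10, 4, 8), 2: Detection(100, 40, 6, 12)}
new = [Detection(12, 11, 4, 8), Detection(101, 42, 6, 12), Detection(200, 200, 5, 9)]
result = associate(tracks, new)  # the third detection gets a new id
```

Real trackers replace the greedy loop with an optimal assignment over a motion-and-appearance cost, but the identifier bookkeeping is the same.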

3.2. Object tracking in moving and static camera

To track objects in different frames, two subsequent frames of the image are subtracted. When the video is taken with a static camera, there is no matching problem, but when the camera is moving, it is difficult to match the previous frame with the new one. Therefore, camera movement must be taken into account before subtracting the frames.
In this paper, we present a method for estimating and detecting camera movement, as well as compensating for the detection of
moving objects. Tracking algorithms encounter problems with moving cameras because the camera movement creates two types of movement: the movement of objects and additional background movement. In tracking applications, it is necessary to distinguish between moving objects and fixed background components. By matching feature points between successive frames, this paper calculates camera movement from the movement information, taking the most frequently occurring movement size and direction. By combining the size and direction of the feature points, the KLT algorithm matches the points between two successive frames. Fig. 2 illustrates the proposed method; its steps are presented in the following sections.

Fig. 2. Diagram of the suggested method for tracking camera movement in moving object tracking problems using a moving camera.

3.2.1. Receiving the input frames


To determine camera movement, two consecutive frames should be used. A moving platform is required to receive a series of
images. TFDCBS-Gaussian filters-VGG16 is a unique classification technique used to monitor bicycles, motorbikes, cars, trucks, and
school buses in motion. This information was compiled based on real-world driving scenarios. It is possible to conduct detailed investigations of moving automobiles with them. In the KITTI 3D dataset, there are 7000 training images and 5000 test images.

3.2.2. Pre-processing and feature extraction


To increase the accuracy of the proposed model, the Gaussian noise removal strategy is employed for each input frame after
receiving it. A polished image is obtained after applying this filter, and it is expected that the feature points are extracted accurately.
With Shi and Tomasi’s method [31,47], good feature points can be obtained for the t−1 frame used for tracking. By calculating the second derivative (Sobel operator), the authors of [31,33] compute special corner points called Good Features to Track. By using these feature points, we can determine the camera movement between two subsequent frames in this study. Fig. 3 shows the results of this algorithm for an example video that was tested.
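The Good Features to Track criterion can be sketched in a few lines of numpy: a pixel's score is the minimum eigenvalue of the local structure tensor built from image gradients, and corner-like pixels score highest. This is a minimal illustration of the Shi–Tomasi criterion under assumed window and gradient choices, not the implementation used in the paper.

```python
import numpy as np

def shi_tomasi_score(img, win=1):
    """Minimum eigenvalue of the 2x2 structure tensor at every pixel.
    High scores mark corner-like points suitable for KLT tracking."""
    iy, ix = np.gradient(img.astype(float))
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy
    score = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for r in range(win, h - win):
        for c in range(win, w - win):
            sxx = ixx[r - win:r + win + 1, c - win:c + win + 1].sum()
            syy = iyy[r - win:r + win + 1, c - win:c + win + 1].sum()
            sxy = ixy[r - win:r + win + 1, c - win:c + win + 1].sum()
            tr, det = sxx + syy, sxx * syy - sxy * sxy
            # smaller eigenvalue of [[sxx, sxy], [sxy, syy]]
            score[r, c] = tr / 2 - np.sqrt(max(tr * tr / 4 - det, 0.0))
    return score

# Synthetic image: a bright square; its corner scores higher than an edge.
img = np.zeros((16, 16))
img[4:12, 4:12] = 1.0
score = shi_tomasi_score(img)
```

Along a straight edge the gradient has only one direction, so the smaller eigenvalue is near zero; at a corner both eigenvalues are large, which is why corners track reliably.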

3.2.3. Matching feature points


To determine the camera movement, it is necessary to determine the movement of feature points between two subsequent frames.
Based on the coordinates of the feature points between frame t and t+1, the movement information is calculated. A KLT algorithm [32]
is used to match the feature points in this paper. For matching feature points in frame sequences, the KLT algorithm relies on three main assumptions: brightness constancy, temporal persistence, and spatial coherence. These assumptions imply that pixels make small movements between subsequent frames and that points in the neighborhood of a pixel have similar properties and movements [46,48]. Fig. 4 shows the results of this algorithm for frames t and t+1 of the videos in Fig. 3: two consecutive frames are checked, the features of different points in the images are extracted with Shi and Tomasi’s method, and points with similar image features are matched to identify the background image. Good features are those that create the most similarity between points, so that optimal matching occurs between the frames.
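The brightness-constancy assumption leads to the classic Lucas–Kanade normal equations that underlie KLT matching. The sketch below is an illustration under those assumptions, not the paper's tracker: it solves the equations over a single window for two synthetic frames in which a smooth blob moves 0.5 px.

```python
import numpy as np

def lk_flow(f1, f2):
    """One Lucas-Kanade step over a single window: solves the normal
    equations built from spatial gradients (ix, iy) and the temporal
    difference it, yielding the displacement (vx, vy)."""
    iy, ix = np.gradient(f1)
    it = f2 - f1
    a = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    b = np.array([np.sum(ix * it), np.sum(iy * it)])
    return np.linalg.solve(a, -b)

# Smooth Gaussian blob translated by 0.5 px in x between the two frames.
y, x = np.mgrid[0:21, 0:21]
f1 = np.exp(-((x - 10.0) ** 2 + (y - 10.0) ** 2) / 18.0)
f2 = np.exp(-((x - 10.5) ** 2 + (y - 10.0) ** 2) / 18.0)
vx, vy = lk_flow(f1, f2)  # vx should be close to 0.5, vy close to 0
```

Production KLT runs this solve per feature window with image pyramids and iterative refinement so that larger motions also satisfy the small-movement assumption.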

3.2.4. Obtaining movement size and direction


As the moving objects will be detected and matched based on their movement information in the following steps, Eq. (1) and Eq. (2)
are used here to calculate the size and direction of movement of each feature point [32].
m_i = sqrt( (x_pi − x_ci)^2 + (y_pi − y_ci)^2 ),   i = 1, …, n    (1)

d_i = arctan( (y_pi − y_ci) / (x_pi − x_ci) ),   i = 1, …, n    (2)

In the preceding equations, n is the number of matched feature points between two successive frames, and (x_pi, y_pi) and (x_ci, y_ci) are the feature point coordinates in frames t and t+1, respectively.
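Eqs. (1) and (2) amount to the Euclidean displacement of each matched point and its angle. A small numpy sketch with invented coordinates follows; it uses `arctan2` rather than plain `arctan` so that points with x_pi = x_ci are handled safely, which is an implementation choice, not something the paper specifies.

```python
import numpy as np

# Matched feature point coordinates in frame t (p) and frame t+1 (c);
# the values are illustrative only.
xp = np.array([10.0, 50.0, 80.0])
yp = np.array([20.0, 60.0, 15.0])
xc = np.array([13.0, 53.0, 83.0])
yc = np.array([24.0, 64.0, 19.0])

# Eq. (1): movement size of each feature point (Euclidean distance).
m = np.sqrt((xp - xc) ** 2 + (yp - yc) ** 2)

# Eq. (2): movement direction in degrees.
d = np.degrees(np.arctan2(yp - yc, xp - xc))
```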

Fig. 3. Feature point extraction results.


Fig. 4. Results of matching using KLT algorithm for the frame of the video shown in Fig. 3.

3.2.5. Background matching


Following the calculation and extraction of all similar points in the two frames, different parts of the previous frame are displaced to align with the new frame. As a result, a mapped image is generated between subsequent frames. As seen in Fig. 5, each frame in the new framework is resized, creating black areas at the boundaries of the mapped image. Using a simple program, the irrelevant parts of the new frame are removed and the frame is restored to its full size. Fig. 6 is an example of a matched background image for a new frame.
Here, we present a method for detecting and compensating camera movement for tracking moving objects based on combining the
size and direction information of the feature points. Combining the size and direction information of feature points increases accuracy,
according to the results. Because only reliable feature points are used to detect camera movement, it is not necessary to process the
entire image; therefore, processing time is reduced significantly.
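The compensation step can be sketched as follows: take a robust estimate (here, the median) of the displacements of matched feature points as the camera motion, shift the previous frame by it, and subtract. The synthetic frames, the integer-shift model, and the median estimator are simplifying assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def compensate_and_diff(prev, curr, dx):
    """Shift the previous frame by the estimated camera motion (dx columns)
    and subtract it from the current frame; the wrapped-around border has
    no valid data and is zeroed."""
    shifted = np.roll(prev, dx, axis=1)
    diff = np.abs(curr - shifted)
    diff[:, :dx] = 0
    return diff

# Synthetic example: textured background shifted 2 px right by camera
# motion, plus one bright moving object added to the current frame.
rng = np.random.default_rng(0)
background = rng.random((12, 12))
curr = np.roll(background, 2, axis=1)
curr[5:7, 5:7] += 5.0  # the moving object

# Camera motion estimated from feature displacements: mostly (2, 0),
# with one outlier contributed by the moving object itself.
dxs = np.array([2, 2, 2, 2, 7])
dx = int(np.median(dxs))
diff = compensate_and_diff(background, curr, dx)
```

After compensation the difference image is zero over the background and large only on the moving object, which is exactly what the segmentation stage needs.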

3.3. Moving object classification

To determine the position of the moving object, the two frames are subtracted after the background frame has been discriminated from the previous frame’s image. Because the resulting mask images may contain some noise, a noise removal step fills the holes produced by masking and discards connected regions with an area smaller than 750 pixels. Each segment is then extracted by labeling the mask with the corresponding segment number, and the different segments are classified using the proposed features. To
discriminate moving objects, such as vehicles, motorcycles, and pedestrians, a type I fuzzy system with a Takagi-Sugeno (TS) structure is also defined. Fig. 7 shows the steps involved in detecting and classifying objects. In the proposed method, the fuzzy logic system was trained by trial and error on 100 data samples of different images, including cars, pedestrians, and cyclists. After the fuzzy classification system has been learned and the parameters of the membership functions determined, the method runs very fast and in real time on video footage in the testing phase, and its speed compares favorably with other methods.

Fig. 5. Mapped images after matching the feature point pixels.

Fig. 6. The matched background image.
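The paper does not list its membership functions or rule base, so the sketch below is only a shape-plausible stand-in: a zero-order Takagi-Sugeno-style classifier over a single input, the height-to-width ratio of a segment, with invented triangular memberships and an argmax over rule strengths.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b,
    falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def classify(ratio):
    """Pick the class whose rule fires most strongly for the given
    height-to-width ratio. The breakpoints are illustrative assumptions,
    not the membership parameters learned in the paper."""
    rules = {
        "vehicle":    tri(ratio, 0.0, 0.5, 1.0),   # wider than tall
        "cyclist":    tri(ratio, 0.8, 1.5, 2.2),   # roughly square-ish
        "pedestrian": tri(ratio, 2.0, 3.0, 4.5),   # much taller than wide
    }
    return max(rules, key=rules.get)

labels = [classify(r) for r in (0.45, 1.4, 3.1)]
```

A full TS system would combine the second input (center-of-mass behavior) and weight rule consequents by firing strength; the argmax shown here is the simplest defuzzification for a pure classification output.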

Fig. 7. Block diagram of the object classification.


3.3.1. Extracting features of segmented objects


A description of the features obtained from segmented objects is provided in this subsection. An example of a target section is
shown in Fig. 8.
Px_max = (X_max,x , X_max,y),   Px_min = (X_min,x , X_min,y)    (3)

in which X_min,x, X_min,y, X_max,x, and X_max,y are the x–y coordinates of the left and right sides of boundary i. Eq. (4) is derived similarly:
Py_max = (Y_max,x , Y_max,y),   Py_min = (Y_min,x , Y_min,y)    (4)

Then the height h and width w of the objects are calculated based on Eq. (5):
wi (t) = Xmax,x − Xmin,y hi (t) = Ymax,x − Ymin,y (5)

c) Position: Eq. (6) gives the coordinates of each object in the image:

X_i(t) = (X_max,x + X_min,x) / 2,   Y_i(t) = (Y_max,y + Y_min,y) / 2    (6)

d) Center of mass: the center of mass of a system of weighted particles is the mass-weighted average position of the particles. In 2D images, the center of mass is described based on the positions of the pixels.

Our general system includes n particles of masses m1, m2, m3, …, mn whose position vectors are r1, r2, r3, …, rn. The center of mass of this system is the point with position vector r_cm obtained using Eq. (7):

r_cm = (m1 r1 + m2 r2 + … + mn rn) / (m1 + m2 + … + mn) = (Σ_i m_i r_i) / m    (7)
Here, m is the total mass of the system. In 2D space, this definition is equivalent to Eq. (8):

x_cm = (Σ_i m_i x_i) / m,   y_cm = (Σ_i m_i y_i) / m    (8)

where xcm and ycm are the coordinates of the center of gravity [30,49]. For each selected object, the center of gravity of Eq. (7) is computed by pixel weighting: every pixel inside the object contributes its coordinate vector (xi, yi) with unit weight mi, so the x-y coordinates of the center of gravity follow directly from Eq. (8).
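As a concrete illustration of Eqs. (3)-(8), the short sketch below computes the area, width/height, box position, and unit-weight center of mass of one segmented object from a binary mask. It is written in Python/NumPy rather than the MATLAB implementation used in this work, and the function name `segment_features` is ours:

```python
import numpy as np

def segment_features(mask):
    """Features of one segmented object from a binary mask (Eqs. (3)-(8)).

    mask: 2-D array, nonzero where a pixel belongs to the object.
    """
    ys, xs = np.nonzero(mask)            # coordinates of object pixels

    area = xs.size                       # a) area: pixel count, Ai(t)

    x_min, x_max = xs.min(), xs.max()    # b) bounding extremes, Eqs. (3)-(4)
    y_min, y_max = ys.min(), ys.max()
    width = x_max - x_min                # wi(t), Eq. (5)
    height = y_max - y_min               # hi(t), Eq. (5)

    # c) position: center of the bounding box, Eq. (6)
    position = ((x_max + x_min) / 2, (y_max + y_min) / 2)

    # d) center of mass with unit weight per pixel, Eqs. (7)-(8)
    centre_of_mass = (xs.mean(), ys.mean())

    return area, width, height, position, centre_of_mass
```

For an upright object such as a pedestrian, the center of mass computed this way can differ from the box center, which is exactly the cue exploited later by the Mumentrate input of the fuzzy classifier.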

3.3.2. Classifying and labeling objects based on fuzzy logic

• Fuzzy Logic model

Fuzzy logic models human conceptual reasoning and the unreliability of information when decisions must be made against different criteria. Considerable weight is given to linguistic values and to how well substitutes fit those criteria. When making decisions, values in the unit interval take the

Fig. 8. Extracting features of a segmented model. a) Area: the area Ai(t) of the object is calculated by counting the number of pixels in segment i of the t-th frame. b) Width and height: the pixel positions Pxmax (Pxmin) are extracted with the maximum (minimum) x components based on Eq. (3).


place of Boolean truth values to describe fuzzy conditions. In classical mathematics, a set is defined as a collection of countable, finite, or infinite elements [34], and every element either belongs to the collection or does not. In a fuzzy model, however, a component may belong to the set only partially, so the statement that x is a "member of a set" is neither simply correct nor incorrect. The block diagram of fuzzy
logic is shown in Fig. 9.

• Fuzzy set

A fuzzy set A over the universe of discourse U is specified by a membership function μA: U→[0,1], as given in Eq. (9) [34]:
A = {(x, μA(x)) | x ∈ U}  (9)
For each member x of the set A, μA(x) represents the strength of the relationship between x and A, which Eq. (10) expresses as:
x ∈ (A, μ) ⇔ x ∈ A ∧ μ(x) ≠ 0  (10)
As a result of the membership function, each element x in U has a degree of connection to each set A that indicates the likelihood
that the element x belongs to that set. Thus, an element with a relationship degree of 0 is excluded from the fuzzy set, while an element
with a relationship degree of 1 is included.

3.3.2.1. Fuzzification process. Considering the characteristics of the application area of the current study, triangular membership functions are used. A triangular fuzzy number A is defined by three numbers (a, b, c) through the membership function in Eq. (11) [34]:

μA(x) = 0, if x < a; (x − a)/(b − a), if a ≤ x ≤ b; (c − x)/(c − b), if b ≤ x ≤ c; 0, if x > c.  (11)
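The piecewise form of Eq. (11) translates directly into code. The following sketch (Python rather than the MATLAB used here; assumes a < b < c, with the name `tri_mf` ours) returns the degree of membership of a crisp value x:

```python
def tri_mf(x, a, b, c):
    """Triangular membership function of Eq. (11), defined by (a, b, c)."""
    if x < a or x > c:
        return 0.0                   # outside the support
    if x <= b:
        return (x - a) / (b - a)     # rising edge on [a, b]
    return (c - x) / (c - b)         # falling edge on [b, c]
```

For example, tri_mf(1.0, 0, 1, 2) returns 1.0 at the peak, while tri_mf(0.5, 0, 1, 2) returns 0.5 on the rising edge.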

• Fuzzy Inference

Defuzzification is the procedure that converts fuzzy quantities back into a single crisp value; in other words, the fuzzy numbers are reduced to one number by one of several methods. Here, the weighted-average method of Eq. (12) is used [34]:

Z0 = (Σi μ(x)i × wi) / (Σi μ(x)i)  (12)

where Z0 is the defuzzified output, μ(x)i is the degree of membership in the i-th fuzzy set, and wi is the corresponding output fuzzy weight.
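Eq. (12) amounts to a membership-weighted mean of the rule outputs, as in this minimal sketch (Python; the function name is ours):

```python
def defuzzify(memberships, weights):
    """Weighted-average defuzzification of Eq. (12).

    memberships: firing degrees mu(x)_i for each fuzzy set/rule.
    weights:     output weight w_i attached to each set/rule.
    """
    num = sum(m * w for m, w in zip(memberships, weights))
    den = sum(memberships)
    return num / den if den else 0.0   # zero output when nothing fires
```

With firing degrees [0.2, 0.8] and weights [1.0, 3.0], the crisp output is (0.2 + 2.4) / 1.0 = 2.6, i.e., the result is pulled toward the rule that fires most strongly.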

• Structure of the Proposed Fuzzy Logic for Classification

Takagi-Sugeno type I fuzzy logic is used in this study. The model has two inputs based on the features extracted in the previous section: the height-to-width ratio of the detected mask regions, in the range [0, 5], and the change in the position of the center of mass, computed with Eq. (13) and lying in the range [−0.1, 0.1]:

Yr = (Ym − Yp) / Yp  (13)
Fig. 10 illustrates the general structure of the proposed fuzzy logic, including the input membership functions, the input and output characteristics, and a schematic of the fuzzy rules. The rules are listed in Table 1 and are written according to a semantic interpretation model for vehicles, pedestrians, and bicycles/motorcycles.
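A Takagi-Sugeno rule base of this kind, with crisp class consequents, can be sketched as follows. The Low/Med/High breakpoints for the height-to-width ratio below are illustrative assumptions and not values reported in this work; only the rule pattern (Table 1) and the [0, 5] input range come from the paper:

```python
def tri(x, a, b, c):
    """Triangular membership, as in Eq. (11)."""
    if x < a or x > c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def classify(xy_rate, moment_up):
    """Sketch of the Table 1 rule base (Takagi-Sugeno, crisp consequents).

    xy_rate:   height/width ratio of the detected mask, in [0, 5].
    moment_up: True if the center of gravity lies above the box center.
    """
    # Illustrative partitions of the XYrate universe (assumed, not reported).
    low = tri(xy_rate, -1.0, 0.0, 1.5)
    med = tri(xy_rate, 1.0, 2.0, 3.0)
    high = tri(xy_rate, 2.5, 5.0, 6.5)
    up, down = (1.0, 0.0) if moment_up else (0.0, 1.0)

    # Rule firing strengths (min t-norm) paired with crisp class labels.
    rules = [
        (min(low, down), 3),   # Low  + Down -> 3 (car)
        (min(med, up),   2),   # Med  + Up   -> 2 (cyclist)
        (min(med, down), 1),   # Med  + Down -> 1 (pedestrian)
        (min(low, up),   3),   # Low  + Up   -> 3 (car)
        (high,           1),   # High + any  -> 1 (pedestrian)
    ]
    return max(rules)[1]       # label of the strongest rule
```

A wide, squat mask (low ratio) maps to a car regardless of the Mumentrate state, while a tall mask maps to a pedestrian, matching the semantic model behind the rules.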

Fig. 9. Diagrammatic representation of fuzzy logic [31].


Fig. 10. Representation of the proposed TS fuzzy classification system.


Objects are weighted with an isotropic kernel over a radial window of radius h. Pixels farther from the center x of the target area receive less weight or are excluded from the calculation. Using this kernel increases the robustness of the target model, since peripheral areas are often occluded or mixed with the background image. The kernel given in Eq. (14) is recommended for this purpose and is an optimal kernel:

MA(x) = Mix, if |r| ≤ h;  0, if |r| > h  (14)

where Mix denotes the object weight. With this method, areas that are not visible for identification in a given frame are removed from the identification calculations and are left to subsequent frames or other sections to decide, which effectively resolves the obstruction problem. The Mumentrate parameter, which indicates whether the center of gravity lies above or below the object's geometric center (the up/down states), is used as an input of the fuzzy system. To increase the sensitivity of this step, a trial-and-error approach over different objects was used to improve the reliability of object identification.
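The radial cutoff of Eq. (14) can be sketched as a binary weight map over the target window (Python/NumPy sketch; the function name is ours, and Mix is taken as 1 for simplicity):

```python
import numpy as np

def radial_kernel_weights(shape, center, h):
    """Binary isotropic kernel in the spirit of Eq. (14): pixels within
    radius h of the target center keep their weight, pixels farther out
    are excluded from the identification calculation."""
    yy, xx = np.indices(shape)
    r = np.hypot(yy - center[0], xx - center[1])   # radial distance |r|
    return (r <= h).astype(float)                  # 1 inside, 0 outside
```

Multiplying a feature or likelihood map by these weights discards the peripheral pixels that are most likely to be occluded or contaminated by background.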
The fuzzy control system model serves as the foundation for the decision-making process of the safety monitoring system. Although missing expert knowledge can cause fuzzy logic control systems to lose track of movement in many difficult cases, the fuzzy approach is one of the best methods for systems with high noise levels. A fuzzy control system with textual features is introduced to model how moving objects are located in the camera's field of vision and to produce a semantic output suitable for human operators. In contrast to formal probabilistic analysis, fuzzy-based models rest on semantic classes rather than numerical values: the fuzzy control system converts the sensor measurements at its input into linguistic labels, or input descriptions, and in this way maps operator knowledge into a decision model. Instead of expressing the observation statistically, the approach of locating moving objects to detect aberrant activity uses linguistic reasoning to decide whether a behavior is appropriate or inappropriate. To simulate this cognitive action, the fuzzy model produces a verbal output known as a classification label, which the defuzzification formula can convert into a number on a digital computer. Results are communicated to the cameras as the level of focus (based, for example, on a confidence, validity, or reliability estimate) needed to assess a specific scene. The system assigns a fuzzy probability to suspicious behavior based on the five linguistic variables that correspond to the fuzzy sets of each contextual characteristic. After defuzzification, the location of the camera and the order in which objects move through the frames can be taken into account to set an appropriate alarm threshold, so that the camera position and the frame order act as two final attributes.
In Fig. 11, the value of the variable x is marked on the x-axis, and the degrees of the membership functions to which x belongs are read off on the y-axis. For instance, in panel (d), five persons in the frame give a confidence of 0.1 for "moderate activity" and 0.27 for "high activity." Equation (2) thus establishes that the presence of five individuals in the frame denotes "moderate activity" and is taken into account for the linguistic variable of the set "NumPpl".
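The fuzzification of such a count can be sketched as below. The triangular sets for "NumPpl" over [0, 40] use assumed breakpoints, so the resulting degrees are illustrative only and are not the exact confidences read from Fig. 11:

```python
def tri(x, a, b, c):
    """Triangular membership, as in Eq. (11)."""
    if x < a or x > c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Illustrative "NumPpl" sets over [0, 40]; the breakpoints are assumptions.
numppl_sets = {
    "low activity":      (-8.0, 0.0, 8.0),
    "moderate activity": (2.0, 12.0, 22.0),
    "high activity":     (15.0, 40.0, 65.0),
}

def fuzzify_numppl(n):
    """Degree of membership of a crisp object count n in each set."""
    return {name: tri(n, *abc) for name, abc in numppl_sets.items()}
```

A count of five people then fires both "low" and "moderate" to partial degrees and "high" not at all, mirroring how a crisp measurement maps onto overlapping linguistic labels.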
For each feature interval, three membership functions are considered, and five membership functions are employed for the output to represent all potential levels of suspicious behavior with high accuracy. As seen in Fig. 11, the fuzzy system is given the textual features listed below; the input is fuzzified and the output is created by the rules:
(a) The time of day, in [0:00, 24:00] hours. (b) The number of velocity changes made by an object within a single frame, in [0, 30]; an object that travels around the frame for an extended period and changes velocity frequently may be wandering with no defined destination. (c) The duration of the object's presence in the frame, in [0, 30] seconds; an object walking at a regular speed usually enters and exits a frame within a few seconds. (d) The number of objects in the frame being processed, in [0, 40]. (e) The frequency with which an object changes its direction inside a frame, in [0, 30].
Although the set bounds are built to accommodate practically every situation, the fuzzy system is also well suited to handling noise and outliers, which makes the fuzzification procedure crucial. If a value computed at run time is an outlier (outside the specified range), fuzzification still assigns the linguistic variable closest to the limit of the defined range. If the fuzzy system cannot match the measurements to any values, the human operator is informed that the inputs are erroneous and is given the option to inspect the status in the videos; the output in the error log is also set to zero so that no alarm is raised. In the models of Fig. 11, every input is covered by at least one membership function, and for any point on the x-axis at least one membership function exceeds 0.5; this nominal value indicates that the classification choices made by the sets are highly reliable.
A membership function holding a strong value over a wide domain would indicate that the related linguistic variable has a disproportionate contribution and power. All membership function shapes were selected with this in mind: so that no bias is added to any behavior, destructive behavior has only one point of highest confidence in the membership functions.

Table 1
Proposed fuzzy rule set.
XYrate    Mumentrate    Detectout
Low       Down          3 (car)
Med       Up            2 (cyclist)
Med       Down          1 (pedestrian)
Low       Up            3 (car)
High      –             1 (pedestrian)


Fig. 11. Membership functions: (a) hours per day; (b) the number of velocity changes of the object; (c) the duration of the object's presence in the frame; (d) the number of objects visible in the frame; (e) the frequency with which each object's motion changes direction; (f) the severity of the disruptive behavior. The first five sets (Fig. 11a-e) are the input membership functions and the last set (Fig. 11f) is the output. The figure shows the state before input and output normalization so the reader can see the input and output ranges of the system; in the real system the variables are normalized to the range 0-1.

All conceivable combinations of the fuzzy linguistic variables must be taken into account in the rule set of a well-defined fuzzy control system. The number of rules, however, directly affects the processing time. As a result, the I-SAFE rule set is built to use a variety of features while covering all relevant cases with as few rules as possible.


4. Experimental outcomes

The KITTI dataset, which contains moving-object bounding boxes generated from stationary and mobile camera footage, was used for testing. All sequences come from real-world driving, which enables a complete and in-depth examination of vehicle monitoring data while the vehicle is in motion. Several distinct data sets are employed in this analysis. All tests are conducted on a computer running Windows 8.1 (64-bit) with a 2.67 GHz Intel Core i5-M480 processor and 4 GB of RAM, and the recommended algorithms are implemented in MATLAB R2017b.
The TFDCBS-Gaussian filters-VGG16 classifier [7,44], the Joint Monocular [35], the Part-based particle filter [36], and the Kalman
filter and IOU tracker [37] are used to compare the tracking efficiency of the proposed strategy to that of existing methods. In the
comparison, several performance criteria are examined: accuracy, computation speed, average f-measure, average similarity, average recall, and average precision. True positives (TP) are foreground pixels correctly classified as foreground, true negatives (TN) are background pixels correctly classified as background, false positives (FP) are background pixels incorrectly classified as foreground, and false negatives (FN) are foreground pixels incorrectly classified as background.
RecallAverage = TP / (FN + TP)

PrecisionAverage = TP / (FP + TP)

SimilarityAverage = TP / (FN + TP + FP)

F-measureAverage = (2 × Recall × Precision) / (Precision + Recall)

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Fig. 12. Initial frames (top) and tracked frames (bottom) while different objects move. The frames are tracked using the proposed method on the KITTI dataset.
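These metric definitions can be collected in one helper (Python sketch; the function name is ours, and accuracy is written in its standard overall-correctness form):

```python
def tracking_metrics(tp, tn, fp, fn):
    """Average pixel-level evaluation metrics used in the comparison."""
    recall = tp / (fn + tp)
    precision = tp / (fp + tp)
    similarity = tp / (fn + tp + fp)               # Jaccard overlap
    f_measure = 2 * recall * precision / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)     # overall correctness
    return recall, precision, similarity, f_measure, accuracy
```

For example, with tp = 90, tn = 50, fp = 10, fn = 10, recall and precision are both 0.9 while the similarity (Jaccard) score is lower, about 0.818, since it penalizes both error types at once.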

To test the four advanced approaches and the proposed method, features are gathered and processed as described above. For object classification and tracking, the processed characteristics are fed as inputs to the fuzzy classifier (Fig. 1). Extracting object features from video images is limited by the changing positions of moving objects, which causes errors in object recognition; for this reason, the framework adopts numerical bounds on the inputs of the fuzzy system, and the fuzzy rules are stated in a way that deals with these limitations.
Table 2 compares the suggested method with four additional sophisticated moving-object tracking techniques on the KITTI dataset, which contains 70,000 training images and 5000 test images. The TFDCBS-Gaussian filters-VGG16 classifier [7] attains the best values for most performance parameters, including precision (0.8509), recall (0.9261), similarity (0.7529), f-measure (0.8869), and accuracy (94), with a computation time of 12.8 s. The proposed strategy, in turn, delivers competitive performance (accuracy 87, similarity 0.773) with by far the shortest computation time (0.03 s), which makes it more applicable for real-time vehicle tracking than the previous sophisticated methods.


Fig. 12 illustrates the strategy for tracking cars in continuous motion (b1, b2, b3). Compared with the corresponding initial frames (a1, a2, a3), the final frames show the tracked objects enclosed in 3D bounding boxes. Further examples of bounding boxes are shown in Fig. 13.
According to a general assessment of the performance parameters (Table 2) and the tracked images enclosed in bounding boxes (Figs. 12 and 13), the proposed method is more effective and faster than existing methods for tracking moving objects. However, several situations still cause detection problems: when the moving object in the video footage passes behind another object, such as a trash can or a wall column, the system has difficulty recognizing it, and when a pedestrian bends while moving, the system can become ineffective.
In the studied frames, the proposed tracking approach is also effective, and the weighting method applied to the main windows keeps the target within the tracking framework. This part of the work addresses the challenge of obstruction, i.e., another object being placed in front of the tracking target so that part or all of the target's appearance is covered. In one sequence, the car moves quickly and is blocked by trees; some methods are unable to track the target through this sequence and lose it under obstruction. The method of weighting objects by area and radius is introduced to solve the obstruction challenge, and this proposed approach handles the problem well. The basis of this work is tracking and detecting moving objects in images from fixed and moving cameras, and its focus is a new fuzzy-logic-based method for detecting moving objects. Hence, compared with other similar works in Table 2, the technique presented here uses fast detection and tracking methods so that it can operate well in a real-time setting. Therefore, similar to the articles


Fig. 13. Initial frames (top) and tracked frames (bottom) while different objects move. The frames are tracked using the proposed method in
KITTI dataset.

Table 2
Using the KITTI dataset, this table compares the performance metrics of four advanced and suggested tracking systems.
Methods                                            Accuracy  F-measure  Similarity  Recall  Precision  Time (s)
Joint Monocular [16,38]                            86        0.6698     0.6124      0.7232  0.6317     14.3
Part-based particle filter method [39,45]          83        0.6165     0.5564      0.6683  0.5721     16.7
Kalman filter and IOU [40,42,43]                   86        0.5719     0.4475      0.6874  0.4896     22.6
TFDCBS-Gaussian filters-VGG16 classifier [7,41]    94        0.8869     0.7529      0.9261  0.8509     12.8
Frames submission and fuzzy-based classification   87        0.6433     0.773       0.6893  0.66       0.03

listed in Table 2, this work attempts to use a hybrid approach for tracking and detection.

5. Conclusion

A summary of the proposed method can be found in the following points. (1) A technique for efficiently and rapidly monitoring
multiple objects is developed using separated background subtraction, camera movement estimation, background image prediction
from the previous frame, and fuzzy logic classifier. (2) By combining TFDCBS and HEBT, specific objects with contained borders can be
detected in raw images, and low-complexity features extracted, simplifying the processing of the fuzzy system in comparison with


object classification. (3) In the proposed method, to reduce data-processing time, a Gaussian filter preserves the image resolution needed for feature extraction and classification with a simple fuzzy system that requires no training. (4) The proposed system for monitoring moving vehicles and pedestrians is compared with four advanced tracking algorithms on the KITTI bounding-box dataset [7]. (5) According to the empirical findings of comparable studies, the suggested technique's performance characteristics (average recall, average precision, average similarity, and average F-measure) are superior while its data-processing time is minimal. By reducing processing time and increasing reaction speed while driving, this study provides a simple, low-complexity method to track and classify moving cars and people. According to the experimental results on video sequences featuring numerous challenges from the KITTI database, the suggested technique produced the most accurate results for the size, shape, and rotation difficulties, as well as the anticipated outcome for the obstruction problem. In future work, we plan to use machine learning techniques to classify the extracted features and to extend the approach with new feature-extraction algorithms, such as color features.

Ethical approval

The images of all the people used for this research were taken with their consent and their information is protected.

Data availability statement

The data is available, so it is possible to share the data if the editor or reviewers request them.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

CRediT authorship contribution statement

Payam Safaei Fakhri: Formal analysis. Omid Asghari: Project administration, Investigation, Data curation, Conceptualization.
Sliva Sarspy: Software, Resources, Methodology. Mehran Borhani Marand: Validation, Resources, Investigation. Paria Moshaver:
Visualization, Validation, Resources, Project administration, Funding acquisition. Mohammad Trik: Writing – review & editing,
Writing – original draft, Resources, Data curation.

Declaration of competing interest


o All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or
revising it critically for important intellectual content; and (c) approval of the final version.
o This manuscript has not been submitted to, nor is under review at, another journal or other publishing venue.
o The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in
the manuscript

References

[1] H. Wang, S. Wang, J. Lv, C. Hu, Z. Li, Non-local attention association scheme for online multi-object tracking, Image and Vision Computing 102 (2020), 103983.
[2] I. Pham, M. Polasek, IEEE/AIAA 33rd Digital Avionics Systems Conference (DASC), IEEE, 2014, pp. 1–15, 2014.
[3] A. Yilmaz, O. Javed, M. Shah, Object tracking: a survey, ACM Comput. Surv. 38 (2006) 13–es.
[4] G.S. Walia, R. Kapoor, Recent advances on multicue object tracking: a survey, Artif. Intell. Rev. 46 (2016) 1–39.
[5] B. Zhuang, H. Lu, Z. Xiao, D. Wang, Visual tracking via discriminative sparse similarity map, IEEE Trans. Image Process. 23 (2014) 1872–1881.
[6] H. Farsi, Improvement of minimum tracking in minimum statistics noise estimation method, Signal Processing, Int. J. 4 (2010) 17–22.
[7] K.S. Chandrasekar, P. Geetha, Multiple objects tracking by a highly decisive three-frame differencing-combined-background subtraction method with GMPFM-
GMPHD filters and VGG16-LSTM classifier, J. Vis. Commun. Image Represent. 72 (2020), 102905.
[8] H. Zhang, Q. Zou, Y. Ju, C. Song, D. Chen, Distance-based support vector machine to predict DNA N6-methyladenine modification, Curr. Bioinf. 17 (5) (2022)
473–482.
[9] W. Zheng, Y. Zhou, S. Liu, J. Tian, B. Yang, L. Yin, A deep fusion matching network semantic reasoning model, Appl. Sci. 12 (7) (2022) 3416.
[10] S. Yan, Z. Gu, J.H. Park, X. Xie, A delay-kernel-dependent approach to saturated control of linear systems with mixed delays, Automatica 152 (2023), 110984.
[11] M. Trik, A.M. Bidgoli, H. Vashani, S.P. Mozaffari, A new adaptive selection strategy for reducing latency in networks on chip, Integration 89 (2022) 9–24.
[12] J. Zhang, C. Zhu, L. Zheng, K. Xu, ROSEFusion: random optimization for online dense reconstruction under fast camera motion, ACM Trans. Graph. 40 (4)
(2021) 1–17.
[13] W. Zheng, L. Yin, Characterization inference based on joint-optimization of multi-layer semantics and deep fusion matching network, PeerJ Computer Science 8
(2022) e908.
[14] S. Yue, B. Niu, H. Wang, L. Zhang, A.M. Ahmad, Hierarchical Sliding Mode-Based Adaptive Fuzzy Control for Uncertain Switched Under-actuated Nonlinear
Systems with Input Saturation and Dead-Zone, Robotic Intelligence and Automation, 2023.


[15] M. Trik, S. Pour Mozaffari, A.M. Bidgoli, Providing an Adaptive Routing along with a Hybrid Selection Strategy to Increase Efficiency in NoC-Based
Neuromorphic Systems, Computational Intelligence and Neuroscience, 2021, p. 2021.
[16] Q. She, R. Hu, J. Xu, M. Liu, K. Xu, H. Huang, Learning High-DOF Reaching-And-Grasping via Dynamic Representation of Gripper-Object Interaction, 2022 arXiv
preprint arXiv:2204.13998.
[17] T. Wang, L. Zhang, N. Xu, K.H. Alharbi, Adaptive critic learning for approximate optimal event-triggered tracking control of nonlinear systems with prescribed
performances, Int. J. Control (2023) 1–15.
[18] D. Li, S.S. Ge, T.H. Lee, Fixed-time-synchronized consensus control of multiagent systems, IEEE transactions on control of network systems 8 (1) (2020) 89–98.
[19] M. Trik, A.M.N.G. Molk, F. Ghasemi, P. Pouryeganeh, A hybrid selection strategy based on traffic analysis for improving performance in networks on chip,
J. Sens. (2022) 2022.
[20] C. Zong, Z. Wan, Container ship cell guide accuracy check technology based on improved 3D point cloud instance segmentation, Brodogradnja: Teorija i praksa
brodogradnje i pomorske tehnike 73 (1) (2022) 23–35.
[21] J. Sun, Y. Zhang, M. Trik, PBPHS: a profile-based predictive handover strategy for 5G networks, Cybern. Syst. (2022) 1–22.
[22] B. HassanVandi, R. Kurdi, M. Trik, Applying a modified triple modular redundancy mechanism to enhance the reliability in software-defined network, Int. J.
Electr. Comput. Sci. 3 (1) (2021) 10–16.
[23] G. Zhou, R. Zhang, S. Huang, Generalized buffering algorithm, IEEE Access 9 (2021) 27140–27157.
[24] M. Samiei, A. Hassani, S. Sarspy, I.E. Komari, M. Trik, F. Hassanpour, Classification of skin cancer stages using a AHP fuzzy technique within the context of big
data healthcare, J. Cancer Res. Clin. Oncol. (2023) 1–15.
[25] Y. Han, B. Wang, T. Guan, D. Tian, G. Yang, W. Wei, J.H. Chuah, Research on road environmental sense method of intelligent vehicle based on tracking check,
IEEE Trans. Intell. Transport. Syst. 24 (1) (2022) 1261–1275.
[26] T. Liu, L. Yong, Moving camera-based object tracking using adaptive ground plane estimation and constrained multiple kernels, J. Adv. Transport. (2021) 2021.
[27] J. Xu, X. Zhang, S.H. Park, K. Guo, The alleviation of perceptual blindness during driving in urban areas guided by saccades recommendation, IEEE Trans. Intell.
Transport. Syst. 23 (9) (2022) 16386–16396.
[28] Z. Wang, Z. Jin, Z. Yang, W. Zhao, M. Trik, Increasing Efficiency for Routing in Internet of Things Using Binary Gray Wolf Optimization and Fuzzy Logic, Journal
of King Saud University-Computer and Information Sciences, 2023, 101732.
[29] Fan Yan, et al., Tackling multiple object tracking with complicated motions—Re-designing the integration of motion and appearance, Image Vis Comput.
(2022), 104514.
[30] J. Xu, S.H. Park, X. Zhang, J. Hu, The improvement of road driving safety guided by visual inattentional blindness, IEEE Trans. Intell. Transport. Syst. 23 (6)
(2021) 4972–4981.
[31] C. Li, G. Dobler, X. Feng, Y. Wang, Tracknet: Simultaneous Object Detection and Tracking and its Application in Traffic Video Analysis, 2019 arXiv preprint
arXiv:1902.01466.
[32] S. Lu, Y. Ban, X. Zhang, B. Yang, S. Liu, L. Yin, W. Zheng, Adaptive control of time delay teleoperation system with uncertain dynamics, Front. Neurorob. 16
(2022), 928863.
[33] H. Zhang, Y. Mi, Y. Fu, X. Liu, Y. Zhang, J. Wang, J. Tan, Security defense decision method based on potential differential game for complex networks, Comput.
Secur. 129 (2023), 103187.
[34] X. Yang, X. Wang, S. Wang, K. Wang, M.B. Sial, Finite-time adaptive dynamic surface synchronization control for dual-motor servo systems with backlash and
time-varying uncertainties, ISA Trans. 137 (2023) 248–262.
[35] Y. Cao, N. Xu, H. Wang, X. Zhao, A.M. Ahmad, Neural networks-based adaptive tracking control for full-state constrained switched nonlinear systems with
periodic disturbances and actuator saturation, Int. J. Syst. Sci. (2023) 1–16.
[36] Y. Yao, F. Shu, Z. Li, X. Cheng, L. Wu, Secure transmission scheme based on joint radar and communication in mobile vehicular networks, IEEE Trans. Intell.
Transport. Syst. (2023).
[37] H. Zhao, H. Wang, N. Xu, X. Zhao, S. Sharaf, Fuzzy approximation-based optimal consensus control for nonlinear multiagent systems via adaptive dynamic
programming, Neurocomputing 553 (2023), 126529.
[38] Y. Zheng, Y. Zhang, L. Qian, X. Zhang, S. Diao, X. Liu, H. Huang, A lightweight ship target detection model based on improved YOLOv5s algorithm, PLoS One 18
(4) (2023), e0283932.
[39] B. Cheng, D. Zhu, S. Zhao, J. Chen, Situation-aware IoT service coordination using the event-driven SOA paradigm, IEEE Transactions on Network and Service
Management 13 (2) (2016) 349–361.
[40] S. Yang, Q. Li, W. Li, X. Li, A.A. Liu, Dual-level representation enhancement on characteristic and context for image-text retrieval, IEEE Trans. Circ. Syst. Video
Technol. 32 (11) (2022) 8037–8050.
[41] A.A. Liu, Y. Zhai, N. Xu, W. Nie, W. Li, Y. Zhang, Region-aware image captioning via interaction learning, IEEE Trans. Circ. Syst. Video Technol. 32 (6) (2021)
3685–3696.
[42] F. Tang, H. Wang, L. Zhang, N. Xu, A.M. Ahmad, Adaptive optimized consensus control for a class of nonlinear multi-agent systems with asymmetric input
saturation constraints and hybrid faults, Communications in Nonlinear Science and Numerical Simulation 126 (2023), 107446.
[43] H. Liu, H. Yuan, Q. Liu, J. Hou, H. Zeng, S. Kwong, A hybrid compression framework for color attributes of static 3D point clouds, IEEE Trans. Circ. Syst. Video
Technol. 32 (3) (2021) 1564–1577.
[44] Q. Liu, H. Yuan, R. Hamzaoui, H. Su, J. Hou, H. Yang, Reduced reference perceptual quality model with application to rate control for video-based point cloud
compression, IEEE Trans. Image Process. 30 (2021) 6623–6636.
[45] C. Mi, S. Huang, Y. Zhang, Z. Zhang, O. Postolache, Design and implementation of 3-D measurement method for container handling target, J. Mar. Sci. Eng. 10
(12) (2022) 1961.
[46] C. Cao, J. Wang, D. Kwok, F. Cui, Z. Zhang, D. Zhao, Q. Zou, webTWAS: a resource for disease candidate susceptibility genes identified by transcriptome-wide
association study, Nucleic acids research 50 (D1) (2022) D1123–D1130.
[47] J. Ma, J. Hu, Safe consensus control of cooperative-competitive multi-agent systems via differential privacy, Kybernetika 58 (3) (2022) 426–439.
[48] W.-L. Zhang, K. Yang, Y.-T. Xin, T.-S. Zhao, Multi-object tracking algorithm for RGB-D images based on asymmetric dual siamese networks, Sensors 20 (2020)
6745.
[49] L. Liang-qun, Z. Xi-yang, L. Zong-xiang, X. Wei-xin, Fuzzy logic approach to visual multi-object tracking, Neurocomputing 281 (2018) 139–151.
