
A Combined mmWave Tracking and Classification Framework Using a Camera for Labeling and Supervised Learning

Abstract

Millimeter-wave (mmWave) radar can monitor and track several objects at the same time, but the sensor itself introduces difficulties: it is challenging to improve mmWave detection while maintaining reliable classification of the resulting data. Building a dependable model that serves both tracking and sensing goals is a significant barrier to realising the full potential of mmWave sensing. Because radar frames must be associated with the events they capture, manually annotating mmWave data takes a lot of time and frequently requires domain expertise. This article lays the groundwork for training mmWave radar models by using a camera as a labelling and supervision tool.

A preliminary version of the framework was constructed during the research phase, and the proposed strategy is evaluated against other methods that pursue the same goals. Experiments in a variety of real-world settings have validated the conceptual framework. The multi-object tracking system that results from the described architecture uses mmWave technology and, unlike comparable systems, can discriminate between sprinting, tripping, and walking.

The experimental results show that supervised labels from a camera can be used to train a radar model consistently, and the trained model maintains high classification accuracy across a wide range of cases beyond those it was trained on. This report lays the groundwork for further research into hybrid monitoring setups; in particular, it supports the construction of mmWave classification models by tackling labelling and training issues.

1. Introduction

In radar-tracking research, training millimetre-wave (mmWave) sensors to solve classification problems is an important goal, and a significant amount of work has followed from the observation that deep-learning methods offer the best prospects of success. To classify objects properly, however, a deep-learning model must be trained on many informative examples, and given the relatively early stage of deep learning for this task, a substantial quantity of training data can be expected to be required. Because mmWave data is intricate and multifaceted, labelling the raw data obtained from monitoring activities is a challenging process that calls for knowledgeable annotators; classifying raw mmWave data is widely acknowledged to be difficult.

Merging data from several different sources may provide an answer to this problem. One such case of information fusion is combining data from mmWave radar and cameras into a single cohesive whole, so a solid understanding of the fundamentals of mmWave radar and camera signal fusion is necessary. Information fusion with mmWave radar and camera [1] combines the outputs of several data sources into a single format that can then be analysed and displayed. A complete picture, however, requires examining several different aspects and the complexities involved; the paragraphs that follow classify the main facets of this kind of information fusion.

Table 1

Table 2
The fusion approach distinguishes camera fusion from mm-wave radar fusion. The mechanism that allows the two devices to work together is the fusion process, which encompasses both the technique and the overall approach; there are several fusion processes, each with its own benefits and drawbacks. In the context of spatial integration, [5] has conducted significant research and testing on strategies for merging mm-wave radar and camera data. Cameras capturing images at the same time use different coordinate systems, and each view's coordinate system is then mapped into a single global space that roughly matches the three-dimensional space in which people experience the world.

The sensors may also need to be calibrated for spatial fusion. Numerous approaches for calibrating mm-wave and video devices have been examined because of the tight link between sensor calibration and spatial fusion (see, for example, [6, 7, 8, 9]). Even when a variety of underlying structures are used, time-based alignment provides a strong basis for the fusion process, and regardless of the fusion approach chosen, a plan for correlation and association between the two data streams is essential.

In this research, we show how information fusion with a camera can be used to develop a system that recognises mm-wave radar data autonomously; the study's findings are reported below. Our study provides the following contributions. First, we present an automated system for complete mmWave data tagging, built around a teacher-student arrangement in which the camera network supervises the radar network and removes the need for explicit manual labelling. Second, the data we provide includes everything needed to train a camera-supervised mm-wave radar classification model, making this one of the first works of its type. Finally, the prospective applications of our system show how well it links and combines data from cameras and radar.
1.1. Key challenges

Combining a camera for labelling and supervised learning with a mmWave tracking and classification system presents major hurdles: synchronising disparate data streams, addressing temporal misalignment, ensuring consistent labelling between sensors, and reducing environmental variability, as well as achieving real-time processing, generalising across scenarios, tracking objects robustly despite occlusions and noise, and establishing effective validation metrics. Addressing these problems so that the proposed framework remains accurate, efficient, and adaptable requires a combination of domain knowledge and novel solutions.

1.2. Key contributions

The principal result of this research is a novel approach for tracking and classifying mmWave targets that seamlessly combines camera-based labelling with supervised learning. The technique improves object tracking and classification accuracy by tackling challenges such as data synchronisation, temporal alignment, and consistent labelling. Using supervised learning on data from mmWave radar and camera sensors, the system categorises objects correctly and tracks them accurately across a range of settings. Practical applications frequently require this kind of synergistic integration of multiple sensing modalities to improve object recognition, tracking, and categorisation, and such real-world applications are important for progress in applied AI.

1.3. Review of Related Literature


Research on deep learning for mm-wave radar highlights the difficulties of labelling mmWave data. In light of these challenges, recent work has studied several distinct tagging methodologies and demonstrated the feasibility of integrating radar data with information from a second sensor, such as a camera. This line of work responds directly to the issues outlined above, and its significance is difficult to overstate.

[10] is one of the earliest demonstrations of a fusion-based approach that classifies objects using both radar and a camera, and the project has been recognised as ahead of its time. The approach proposed in [10] proceeds in stages: data are captured first, radar-detectable objects are then identified and tracked, typically with a Kalman filter, and finally the retained radar detections are projected onto the camera's image plane, where the classification is completed.
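To make the tracking step concrete, the sketch below shows a minimal constant-velocity Kalman filter of the kind such pipelines typically apply to radar detections before projection. The state layout, time step, and noise values are illustrative assumptions, not parameters reported in [10].

```python
# Minimal constant-velocity Kalman filter for 2-D radar detections.
# All matrices and noise magnitudes are illustrative assumptions.
import numpy as np

class ConstantVelocityKF:
    def __init__(self, dt=0.05, process_var=1.0, meas_var=0.25):
        # State: [x, y, vx, vy]; measurement: [x, y] from one radar detection.
        self.x = np.zeros(4)
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * process_var
        self.R = np.eye(2) * meas_var

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        # z is a 2-D radar position measurement (e.g. x/y in metres).
        y = z - self.H @ self.x                    # innovation
        S = self.H @ self.P @ self.H.T + self.R    # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)   # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```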

[11] describes another recent work that combines camera data with mm-wave radar, and its procedure is comparable to that of [10]. The authors of [11] claim that their radar-and-camera system can distinguish objects. As in [10], spurious early radar returns are filtered out during object identification. After the camera is calibrated, mmWave data is projected into the image plane through coordinate translation, and in the final phase a unified state estimate and machine learning are used to identify moving objects in the field of view and to keep track of their movements.
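The coordinate-translation step can be sketched as a rigid radar-to-camera transform followed by a pinhole projection. The extrinsics R, t and the intrinsic matrix K below are placeholder values for illustration, not the calibration used in [11].

```python
# Project a radar-frame point onto the camera image plane (pinhole model).
import numpy as np

def project_radar_point(p_radar, R, t, K):
    """p_radar: 3-D point [x, y, z] in radar coordinates (metres).
    R, t: rotation and translation from the radar frame to the camera frame.
    K: 3x3 camera intrinsic matrix. Returns (u, v) pixel coordinates."""
    p_cam = R @ np.asarray(p_radar, dtype=float) + t   # radar -> camera frame
    uvw = K @ p_cam                                    # pinhole projection
    return uvw[0] / uvw[2], uvw[1] / uvw[2]

# Example with assumed calibration values.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                     # radar and camera assumed co-aligned
t = np.array([0.0, 0.1, 0.0])     # 10 cm vertical offset, illustrative only
u, v = project_radar_point([1.5, 0.0, 8.0], R, t, K)
```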

A comparable dataset for frequency-modulated continuous-wave (FMCW) radar was developed in subsequent research [12]. That archive contains camera images together with the corresponding measurements from inertial measurement units. According to [12], the tagging method requires full temporal coherence across all three sensors. The authors synchronised the radar and camera sensors by coordinating their timing and then performing the spatial alignment. To calibrate the spatial relationship between camera and radar, they used a dedicated reference object that is both strongly reflective in the radar domain and clearly visible to the camera. Because the radar grid is vividly highlighted at this reference point, finding the corresponding spot in the camera's field of view is much simpler, and the object remains distinguishable in the radar image.
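One hedged way to realise such a calibration is to record several radar/pixel correspondences of the reference target and solve a perspective-n-point problem for the radar-to-camera extrinsics. The point values and intrinsics below are fabricated for illustration, and OpenCV's solvePnP is used only as an example solver, not necessarily the method of [12].

```python
# Estimate radar-to-camera extrinsics from reference-target correspondences.
import numpy as np
import cv2

# 3-D target positions in the radar frame (metres), recorded at several poses.
radar_points = np.array([[0.5, 0.0, 4.0],
                         [-0.8, 0.0, 6.0],
                         [1.2, 0.0, 8.0],
                         [0.0, 0.0, 10.0]], dtype=np.float32)

# Pixel coordinates of the same target as seen by the camera.
image_points = np.array([[420.0, 260.0],
                         [213.3, 253.3],
                         [440.0, 250.0],
                         [320.0, 248.0]], dtype=np.float32)

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]], dtype=np.float32)
dist = np.zeros(5, dtype=np.float32)   # assume negligible lens distortion

# solvePnP returns the rotation (Rodrigues vector) and translation that map
# radar-frame points into the camera frame.
ok, rvec, tvec = cv2.solvePnP(radar_points, image_points, K, dist)
R, _ = cv2.Rodrigues(rvec)
```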

2. Background
2.1. Overview and Evaluation of Selected Methods

We begin by providing an overview of the five selected methods, highlighting their underlying principles, methodologies, and strengths. Subsequently, we delve into a comprehensive evaluation that encompasses various aspects, including accuracy, efficiency, robustness, and adaptability. By subjecting these methods to a standardized evaluation framework, we aim to discern their relative merits and limitations in the context of the proposed combined mmWave tracking and classification framework. Our comparative study addresses the following key aspects:
1. Performance Metrics: We define a set of performance metrics to assess the methods'
classification accuracy, tracking precision, and overall efficacy (a minimal sketch of these
two metrics follows this list). These metrics will enable us to quantify and compare the
outcomes achieved by each method.
2. Experimental Setup: We establish a consistent experimental setup to ensure a fair and
unbiased comparison. This includes selecting appropriate datasets, configuring
parameters, and delineating testing conditions that mimic real-world scenarios.
3. Results and Analysis: We present the empirical results obtained from applying each
method within the combined framework. These results are thoroughly analyzed,
providing insights into the strengths and weaknesses of each approach.
4. Robustness and Adaptability: We explore the methods' ability to maintain accurate
tracking and classification across varying conditions, such as changes in lighting,
object distance, and environmental factors. Robustness and adaptability are crucial
factors for real-world deployment.
5. Computational Efficiency: We evaluate the computational requirements of each
method, considering factors such as processing speed and memory consumption.
Efficiency is a critical consideration, particularly for real-time applications.
6. Comparison and Discussion: Based on the evaluation, we offer a comparative analysis
that highlights the relative advantages and drawbacks of each method. We discuss
their suitability for different scenarios and use cases within the proposed framework.
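As referenced in item 1, the sketch below gives simple working definitions of the two headline metrics: overall classification accuracy and a tracking-precision score taken as the mean Euclidean error between tracked and ground-truth positions. Both definitions are common conventions and are assumptions here, not the paper's exact formulas.

```python
# Simple classification-accuracy and tracking-precision metrics.
import numpy as np

def classification_accuracy(predicted_labels, true_labels):
    """Fraction of samples whose predicted class matches the ground truth."""
    predicted_labels = np.asarray(predicted_labels)
    true_labels = np.asarray(true_labels)
    return float(np.mean(predicted_labels == true_labels))

def tracking_precision(tracked_xy, truth_xy):
    """tracked_xy, truth_xy: arrays of shape (n_frames, 2) in metres.
    Returns the mean positional error; lower is better."""
    errors = np.linalg.norm(np.asarray(tracked_xy) - np.asarray(truth_xy), axis=1)
    return float(np.mean(errors))

# Example: four frames of a single track.
acc = classification_accuracy(["walk", "run", "walk", "fall"],
                              ["walk", "run", "walk", "walk"])   # 0.75
prec = tracking_precision([[0.0, 1.0], [0.1, 2.0]], [[0.0, 1.1], [0.0, 2.0]])
```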

3. Materials and methods


3.1. Radar Training with Camera Labeling and Supervision Methodology

Here we describe a general technique, developed by the authors, for tagging radar data and building a distinctive radar model using a camera as the labelling source. The purpose of this technique is to ease the transition from camera-based to radar-based model development. The components of the strategy are illustrated in Section 2.2 of this article.

3.1.1. Problem Space


Pre-processing is commonly used to extract relevant information from unprocessed radar data and make it easier to interpret. Even for professionals in the field, making sense of raw radar data can be difficult and time-consuming, largely because of the sheer volume of data. The complexity of categorising the data makes it difficult to build a system that can accurately classify nuanced occurrences from radar data alone.
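One common pre-processing step for raw mmWave point clouds is density-based clustering, which groups individual detections into candidate objects. The paper does not state which method it uses; DBSCAN is shown here purely as an illustrative example, with made-up parameter values.

```python
# Cluster a single radar frame's point cloud into candidate objects.
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_radar_frame(points_xyz, eps=0.5, min_samples=5):
    """points_xyz: (n, 3) array of radar detections in metres.
    Returns a label per point; -1 marks noise belonging to no object."""
    if len(points_xyz) == 0:
        return np.array([], dtype=int)
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xyz)

# Example frame: a tight group of detections plus one stray point.
frame = np.array([[1.0, 4.0, 0.0], [1.1, 4.1, 0.1], [0.9, 3.9, 0.0],
                  [1.0, 4.2, 0.1], [1.1, 3.8, 0.0], [6.0, 1.0, 0.0]])
labels = cluster_radar_frame(frame)   # e.g. [0, 0, 0, 0, 0, -1]
```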

To address this issue, current solutions often either reduce the quantity of radar data required to train the classifier or limit training to a small number of categories; both techniques restrict the set of viable classes. These alternatives may work, but they can reduce the overall accuracy of the radar model. To train classifiers rapidly, it is essential to provide a simpler way of annotating radar data that does not compromise the model's fundamental requirements.

After years of research, the capabilities and limits of camera classification networks are well understood. As discussed in Section 1.1, several models excel at complicated categorisation tasks, and their varied abilities make them useful in a wide range of settings. In this study we present a strategy for generating independent models or classifier networks using only camera inputs, eliminating the need to annotate raw radar data by hand. The study focuses on using visual data both to label radar data and to serve as a reference for these measurements, and it examines these two roles in detail.
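The camera-as-teacher idea can be sketched as a pseudo-labelling loop: the camera classifier supplies labels for time-aligned radar samples, and the radar ("student") model is trained on those labels with a standard cross-entropy objective. The model architectures, data loader, and hyperparameters below are placeholders, not the ones used in the paper.

```python
# Train a radar "student" model on pseudo-labels from a camera "teacher".
import torch
import torch.nn as nn

def train_radar_student(radar_model, camera_teacher, loader, epochs=10, lr=1e-3):
    """loader yields (radar_features, camera_image) pairs captured at
    matching timestamps."""
    optimiser = torch.optim.Adam(radar_model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    camera_teacher.eval()
    for _ in range(epochs):
        for radar_x, camera_img in loader:
            with torch.no_grad():
                # The camera prediction becomes the label for the radar sample.
                pseudo_label = camera_teacher(camera_img).argmax(dim=1)
            logits = radar_model(radar_x)
            loss = loss_fn(logits, pseudo_label)
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()
    return radar_model
```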

Visual data captures a specific point in time and represents a certain extent of space in the horizontal and vertical planes, usually in two dimensions but sometimes in three. Two-dimensional views frequently assume a stable viewpoint, which is rarely the case in practice. Radar data, by contrast, typically reports readings at an angle and over a distance rather than along a neat horizontal line, and moving targets are the main source of information during radar data collection. These domain-alignment differences between radar and camera data make it harder to establish a relationship between stationary objects in visual data and moving ones in radar data.

The difficulty rises further when several events happen at once or when objects move in and out of the field of view; this is the second major challenge and one of the hardest. It exemplifies the difficulty of establishing reliable correspondences between similar properties in radar data and distinguishable objects in vision data.

3.1.2. Proposed approach


A single processing chain supports the three strategic pillars of the framework. In the data-collection phase, data is gathered from several sources, including radar and cameras. Once the sensor data has been collected, it undergoes pre-processing and normalisation steps designed for the use case that motivated the framework's creation, and these steps are adapted to the particular needs of each application. Radar data is ideally delivered as a steady stream of frames evenly spaced in time. The camera identification network, which is used to calibrate the radar, has been adapted to provide simultaneous access to the relevant data.

Fig 1

Once data collection and processing are finished, a camera classifier constructed in parallel can be used to train the radar on the pre-processed camera data; this step follows data collection and analysis. The camera analyser organises the camera's output into several categories and compiles a set of images with relevant annotations attached, and the algorithm that controls the camera takes this process into account. Because the technology is context-dependent, the environment in which these camera images were captured is shown for illustrative purposes.

The suggested method relies on a matching component to synchronise the data from the camera and the radar. Using the time period allotted to each camera sample as a baseline, the temporal span of the corresponding radar data can be estimated and compared; this synchronisation uses the timestamp contained in the camera data sample. The temporal gap between the radar view and the camera view is shown in Figure 2.
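The timestamp-based matching can be sketched as a nearest-neighbour search over frame times: each camera sample is paired with the radar frame whose timestamp is closest, subject to a maximum allowed gap. The tolerance and frame rates below are assumptions.

```python
# Match camera samples to the nearest radar frame by timestamp.
import numpy as np

def match_camera_to_radar(camera_ts, radar_ts, max_gap_s=0.05):
    """camera_ts, radar_ts: sorted 1-D arrays of timestamps in seconds.
    Returns (camera_index, radar_index) pairs; unmatched camera samples
    are dropped."""
    radar_ts = np.asarray(radar_ts)
    pairs = []
    for i, t in enumerate(camera_ts):
        j = int(np.argmin(np.abs(radar_ts - t)))   # nearest radar frame
        if abs(radar_ts[j] - t) <= max_gap_s:
            pairs.append((i, j))
    return pairs

# Example: a 30 Hz camera matched against a 20 Hz radar.
camera_ts = np.arange(0.0, 1.0, 1 / 30)
radar_ts = np.arange(0.0, 1.0, 1 / 20)
pairs = match_camera_to_radar(camera_ts, radar_ts)
```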

As noted above, a continuous processing chain connects the three strategic components: radar and camera data are collected, then pre-processed and normalised for the context the framework was designed for, with the radar data ideally delivered as an uninterrupted stream of frames. The camera classification network then drives the radar training system using this camera data.

Fig 2

3.2. System design and implementation


Fig 3

3.2.1. Radar pipeline

Fig 4
Fig 5

3.2.2. Camera pipeline

Fig 6
Fig 7

3.2.3. Fused pipeline

The fusion system synchronises the entities tracked by the radar with those identified and recognised by the cameras by combining the two data sets. As noted in Section 2.1, the temporal offset between the datasets must be taken into account before the findings from each domain are merged. To address this, we propose segmenting the radar data so that it can be used for more detailed spatial analysis, and this positional data is tied to the sample rate of the camera system so that the calculations are more precise.

The next step is to determine whether there are correlations between what the cameras observe and what the radar picks up; establishing this link requires information from both domains, and the resulting correspondence makes it much simpler to associate the two groups of objects. The relationship is found by analysing the speed and acceleration of each target while it is monitored by both cameras and radar. The Pearson correlation coefficient can be used to evaluate how closely the motion vectors for the camera and the radar correspond. This approach helps eliminate anything that was seen visually but could not be confirmed with confidence, and it also accounts for the possibility that one sensor detects an object that the other does not. The displacement vectors for the camera and the radar are given in Equations 7 and 8, respectively, and a sketch of this correlation step is given below.
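The sketch below shows one way to realise this step: displacement magnitudes are derived from each track, compared with the Pearson correlation coefficient, and camera/radar tracks are paired greedily by highest correlation. The minimum-correlation threshold is an assumed parameter, not a value from the paper.

```python
# Associate camera and radar tracks by Pearson correlation of their motion.
import numpy as np

def displacement_series(positions):
    """positions: (n, 2) track positions. Returns frame-to-frame speeds."""
    diffs = np.diff(np.asarray(positions, dtype=float), axis=0)
    return np.linalg.norm(diffs, axis=1)

def motion_correlation(camera_track, radar_track):
    cam = displacement_series(camera_track)
    rad = displacement_series(radar_track)
    n = min(len(cam), len(rad))
    return float(np.corrcoef(cam[:n], rad[:n])[0, 1])   # Pearson's r

def associate_tracks(camera_tracks, radar_tracks, min_corr=0.6):
    pairs = []
    for ci, cam in enumerate(camera_tracks):
        scores = [motion_correlation(cam, rad) for rad in radar_tracks]
        ri = int(np.argmax(scores))
        if scores[ri] >= min_corr:
            pairs.append((ci, ri))   # camera track ci matched to radar track ri
    return pairs
```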

4. Results

The dependability of the system described in Section 2.2 has been validated by extensive testing in a variety of environments. The necessary reference information had to be obtained in advance so that the radar could be oriented correctly towards the intended targets and the collection completed successfully. Data-fusion methods require gathering information from several sources, such as cameras and radar, which is difficult for a single person to do.

The project required a total of one thousand images, collected over several recording sessions: two sessions were recorded indoors, while four took place outdoors in more natural conditions. Throughout these sessions, extensive notes were taken to provide a comprehensive record of the procedure.

Table 3
Fig 8

"The first category includes anything outside of our solar system. When tested with objects
more than six yards away, the accuracy of the camera-trained radar classifier declined by
7.66%. A detailed inspection of the data indicated that objects farther away from the radar
had fewer data points per cluster than those within a 6-metre radius of the radar. This
disparity came from the fact that objects at increasing distances had fewer data points per
cluster (i.e., recognised entities). A more thorough planning of the radar's "chirp" pattern may
have avoided the issue of inadequate point clouds in the radar zone, which prevented the
formation of identifiable features.
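The link between chirp design and point-cloud density can be made concrete with the standard FMCW range-resolution relation (standard radar theory, not a formula quoted from this excerpt):

\[ \Delta R = \frac{c}{2B} \]

where c is the speed of light and B is the swept chirp bandwidth. A wider sweep gives finer range bins, so an object of a given physical extent occupies more resolvable cells and yields a denser point cloud, whereas a narrow sweep at long range can leave too few points per cluster to form identifiable features.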

The findings of the second experiment were particularly notable in low light. The camera-trained radar classifier performed consistently across a wide variety of lighting conditions; as a consequence, it outperformed the stand-alone camera classifier, whose accuracy depends on consistent lighting, by 56.84%.

If the radar is trained using camera data, a camera alone can act as the teacher network and produce results comparable to conventional radar training, with a few prominent exceptions, such as when the camera sensor struggles to capture an image in low light. The second outlier showed that ambient light is what gives the camera-trained radar network its advantage over the camera network alone.

The performance of the trained radar system is equivalent to that of a single camera-based system, subject to the exceptions noted above. Table 4 summarises how similar the two systems are after training; the trained similarity value reflects how closely one system matches the trained radar in terms of accuracy. The proposed framework exploits the high degree of similarity between the teacher network (an independent camera system) and the student network (a radar system trained by a camera) to train a radar model with camera-labelled data. A plausible way to compute such a similarity score is sketched below.

Table 4
Fig 9
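Since the exact definition of the trained similarity value is not given in this excerpt, the sketch below shows one plausible reading: the fraction of matched samples on which the camera teacher and the camera-trained radar student produce the same class label.

```python
# Agreement rate between teacher (camera) and student (radar) predictions.
import numpy as np

def trained_similarity(teacher_labels, student_labels):
    teacher_labels = np.asarray(teacher_labels)
    student_labels = np.asarray(student_labels)
    return float(np.mean(teacher_labels == student_labels))

# Example: agreement on five matched camera/radar samples.
similarity = trained_similarity(["walk", "run", "fall", "walk", "run"],
                                ["walk", "run", "walk", "walk", "run"])  # 0.8
```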

5. Conclusion

This article describes in detail a project in which a camera was used to help train a student network's mmWave radar classifier. The findings support the premise underlying the study's main purpose, which was to create the framework for the classifier. Section 2.2 shows how this architecture can be used to build a radar classifier with the same accuracy as a reference camera classifier. Notably, the radar's robustness to changes in lighting remained constant throughout.

When camera recognition produced correct results, the suggested camera-trained technique produced the same outcomes as a radar system trained on manually annotated data. Using this method could therefore greatly reduce the time spent manually labelling radar data. However, the "Outdoors with Distant Objects" scenario shows that the camera-trained technique has difficulty distinguishing between distinct items at range. The results might be improved by testing other camera-based labelling networks with the approach outlined in Section 2.1 to determine how well they train radar networks that match the performance reported above.

In the future, researchers could use different types of radar hardware to verify the network's performance and fine-tune the radar algorithms. Such tests would show how the built-in characteristics of the radar, such as the ADC's sample rate and resolution, affect how well the proposed system performs as a whole.

The design presented in Section 2.1 provides a solid foundation for developing mmWave algorithms, and practitioners who follow the strategy outlined here may find that working with mmWave data is simpler than previously imagined. Because the data is difficult to categorise, researchers struggle to collect enough training data for mmWave classifiers, and during development there may not be enough data to train the network to its full capacity; this scarcity of labelled data was a limiting factor early on. This work proposes a method for creating mmWave radar predictors without labelling data by hand, a process that is slow and error-prone, and it offers a promising direction for future study. The framework presented here provides the components needed to achieve these objectives.

References

1. Cakici, Z., & Murat, Y. S. (2019). A differential evolution algorithm-based traffic control model for
signalized intersections. *Advances in Civil Engineering*, 2019, Article ID 7360939, 16 pages.

2. He, Y., Liu, Z., Zhou, X., & Zhong, B. (2017). Analysis of urban traffic accidents features and
correlation with traffic congestion in large-scale construction district. In *Proceedings of the 2017
International Conference on Smart Grid and Electrical Automation (ICSGEA)* (pp. 641–644).
Changsha, China.

3. Arnott, R., & Inci, E. (2006). An integrated model of downtown parking and traffic congestion.
*Journal of Urban Economics*, 60(3), 418–442.

4. Golias, J., Yannis, G., & Antoniou, C. (2002). Classification of driver-assistance systems according to
their impact on road safety and traffic efficiency. *Transport Reviews*, 22(2), 179–196.

5. Shengdong, M., Zhengxian, X., & Yixiang, T. (2019). Intelligent traffic control system based on cloud
computing and big data mining. *IEEE Transactions on Industrial Informatics*, 15(12), 6583–6592.

6. Zhao, D., Dai, Y., & Zhang, Z. (2011). Computational intelligence in urban traffic signal control: a
survey. *IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)*, 42,
485–494.
7. Yang, Y., He, K., Wang, Y.-p., Yuan, Z.-z., Yin, Y.-h., & Guo, M.-z. (2022). Identification of dynamic
traffic crash risk for cross-area freeways based on statistical and machine learning methods. *Physica
A: Statistical Mechanics and Its Applications*, 595, 127083.

8. Yang, Y., Wang, K., Yuan, Z., & Liu, D. (2022). Predicting freeway traffic crash severity using
XGBoost-Bayesian network model with consideration of features interaction. *Journal of Advanced
Transportation*, 19, 4257865.

9. Ma, Y. Y., Chowdhury, M., Sadek, A., & Jeihani, M. (2009). Real-time highway traffic condition
assessment framework using vehicle-infrastructure integration (VII) with artificial intelligence (AI).
*IEEE Transactions on Intelligent Transportation Systems*, 10(4), 615–627.

10. Dai, X., Liu, D., Yang, L., & Liu, Y. (2019). Research on headlight technology of night vehicle
intelligent detection based on Hough transform. In *Proceedings of the 2019 International
Conference on Intelligent Transportation, Big Data & Smart City (ICITBS)* (pp. 49–52). Changsha,
China.

11. Mandal, V., Mussah, A. R., Jin, P., & Adu-Gyamfi, Y. (2020). Artificial intelligence-enabled traffic
monitoring system. *Sustainability*, 12(21), 9177.

12. Akhtar, M., & Moridpour, S. (2021). A review of traffic congestion prediction using artificial
intelligence. *Journal of Advanced Transportation*, 2021, Article ID 8878011, 18 pages.

13. Datondji, S. R. E., Dupuis, Y., Subirats, P., & Vasseur, P. (2016). A survey of vision-based traffic
monitoring of road intersections. *IEEE Transactions on Intelligent Transportation Systems*, 17(10),
2681–2698.

14. Weil, R., Wootton, J., & Garcia-Ortiz, A. (1998). Traffic incident detection: sensors and algorithms.
*Mathematical and Computer Modelling*, 27, 257–291.

15. Xiao, J., & Liu, Y. (2012). Traffic incident detection using multiple-kernel support vector machine.
*Transportation Research Record: Journal of the Transportation Research Board*, 2324(1), 44–52.

16. Marszalek, Z., Gawedzki, W., & Duda, K. (2021). A reliable moving vehicle axle-to-axle distance
measurement system based on multi-frequency impedance measurement of a slim inductive-loop
sensor. *Measurement*, 169, 108525.

17. Li, Q., Cheng, H., Zhou, Y., & Huo, G. (2015). Road vehicle monitoring system based on intelligent
visual internet of things. *Journal of Sensors*, 2015, Article ID 720308, 16 pages.
18. Naik, U. P., Rajesh, V., & Kumar, R. (2021). Implementation of YOLOv4 algorithm for multiple
object detection in image and video dataset using deep learning and artificial intelligence for urban
traffic video surveillance application. In *Proceedings of the 2021 Fourth International Conference on
Electrical, Computer and Communication Technologies (ICECCT)*, September 2021.

19. Ke, R., Zhuang, Y., Pu, Z., & Wang, Y. H. (2020). A smart, efficient, and reliable parking surveillance
system with edge artificial intelligence on IoT devices. *IEEE Transactions on Intelligent
Transportation Systems*, 13.

20. Sivaraman, S., & Trivedi, M. M. (2010). A general active-learning framework for on-road vehicle
recognition and tracking. *IEEE Transactions on Intelligent Transportation Systems*, 11(2), 267–276.

21. Teoh, S. S., & Braunl, T. (2012). Symmetry-based monocular vehicle detection system. *Machine
Vision and Applications*, 23(5), 831–842.

22. Zhu, H., Yuen, K.-V., Mihaylova, L., & Leung, H. (2017). Overview of environment perception for
intelligent vehicles. *IEEE Transactions on Intelligent Transportation Systems*, 18(10), 2584–2601.

23. Mukhtar, A., Xia, L., & Tang, T. B. (2015). Vehicle detection techniques for collision avoidance
systems: a review. *IEEE Transactions on Intelligent Transportation Systems*, 16(5), 2318–2338.

24. Rezaei, M., Terauchi, M., & Klette, R. (2015). Robust vehicle detection and distance estimation
under challenging lighting conditions. *IEEE Transactions on Intelligent Transportation Systems*,
16(5), 2723–2743.

25. Liu, L.-C., Fang, C.-Y., & Chen, S.-W. (2017). A novel distance estimation method
leading a forward collision avoidance assist system for vehicles on highways. *IEEE
Transactions on Intelligent Transportation Systems*, 18(4), 937–949.

26. Joglekar, A., Joshi, D., Khemani, R., Nair, S., & Sahare, S. (2011). Depth estimation
using monocular camera. *International Journal of Computer Science and Information
Technology*, 2(4), 1758–1763.

27. Lessmann, S., Meuter, M., Muller, D., & Pauli, J. (2016). Probabilistic distance
estimation for vehicle tracking application in monocular vision. In *Proceedings of the IEEE
Intelligent Vehicles Symposium* (pp. 1199–1204), Gothenburg, Sweden.

28. Han, S., Wang, X., Xu, L., Sun, H., & Zheng, N. (2016). Frontal object perception for
intelligent vehicles based on radar and camera fusion. In *Proceedings of the 35th Chinese
Control Conference* (pp. 4003–4008), Chengdu, China.
29. Gao, D., Duan, J., Yang, X., & Zheng, B. (2010). A method of spatial calibration for
camera and radar. In *Proceedings of the 8th World Congress on Intelligent Control and
Automation* (pp. 6211–6215), Jinan.

30. Feng, Y., Pickering, S., Chappell, E., Iravani, P., & Brace, C. (2017). Distance estimation
by fusing radar and monocular camera with Kalman filter. *SAE Technical Paper Series*.

31. Nishigaki, M., Rebhan, S., & Einecke, N. (2012). Vision-based lateral position
improvement of RADAR detections. In *Proceedings of the 15th International IEEE
Conference on Intelligent Transportation Systems* (pp. 90–97), Anchorage, AK, USA.

32. Du, Y., Man, K. L., & Lim, E. G. (2020). Image radar-based traffic surveillance system:
an all-weather sensor as an intelligent transportation infrastructure component. In
*Proceedings of the 2020 International SoC Design Conference* (ISOCC).

33. Patole, S. M., Torlak, M., Wang, D., & Ali, M. (2017). Automotive radars: a review of
signal processing techniques. *IEEE Signal Processing Magazine*, 34(2), 22–35.

34. Liu, H., Li, N., Guan, D., & Rai, L. (2018). Data feature analysis of non-scanning multi
target millimeter-wave radar in traffic flow detection applications. *Sensors*, 18(9), 2756.
