YOLOv5-Tassel-UAV
Abstract—Unmanned aerial vehicles (UAVs) equipped with lightweight sensors, such as RGB cameras and LiDAR, have significant potential in precision agriculture, including object detection. Tassel detection in maize is an essential trait given its relevance as the beginning of the reproductive stage of growth and development of the plants. However, compared with general object detection, tassel detection based on RGB imagery acquired by UAVs is more challenging due to the small size, time-dependent variable shape, and complexity of the objects of interest. A novel algorithm referred to as YOLOv5-tassel is proposed to detect tassels in UAV-based RGB imagery. A bidirectional feature pyramid network is adopted for the path-aggregation neck to effectively fuse cross-scale features. The robust attention module of SimAM is introduced to extract the features of interest before each detection head. An additional detection head is also introduced to improve small-size tassel detection based on the original YOLOv5. Annotation is performed with guidance from center points derived from CenterNet to improve the selection of the bounding boxes for tassels. Finally, to address the issue of limited reference data, transfer learning based on the VisDrone dataset is adopted. Testing results for our proposed YOLOv5-tassel method achieved an mAP value of 44.7%, which is better than well-known object detection approaches, such as FCOS, RetinaNet, and YOLOv5.

Index Terms—CenterNet, SimAM attention module, small tassel detection, transfer learning, YOLOv5.

I. INTRODUCTION

IN RECENT years, both the fundamentals and applications of artificial intelligence have developed rapidly [1]. Object detection, which has been the focus of many studies, has achieved success in many applications, including autonomous driving [2], crowd counting [3], and precision agriculture [4]. With the development of UAV platforms and the introduction of sensors, including LiDAR [5], RGB and multispectral cameras [6], [7], [8], and GNSS/INS solutions [9], [10], [11], the UAV has become a popular and cost-effective technology in precision agriculture. UAV RGB imagery has been demonstrated to be particularly useful for high-throughput phenotyping for plant breeding applications, including information on plant counting, flowering date, and yield prediction [4]. Thus, the application and extension of advanced detection algorithms to cropping systems is a current topic of great interest for researchers.

For maize, flowering is an important trait to monitor as it defines the end of the vegetative stages and the beginning of the reproductive stages. The importance of tracking tassel development relates to the determination of the starting point for grain filling. A late flowering time typically indicates that the filling and senescence periods would not be adequate for harvesting. In addition, environmental or biological stressors may have a negative impact, thereby reducing the final grain yield. Plant breeders usually consider flowering time variation as one of the physiological traits to assess different varieties [12]. Evaluating the performance of different genotypes in multiple environments or under different management practices has been part of numerous studies in agriculture [13]. In the field, flowering is traditionally monitored manually. This practice is prone to errors, as it is a subjective evaluation, and is typically time-consuming and labor-intensive. Automatic detection of tassels at all stages of development (from early to late) using UAV RGB imagery can potentially improve the evaluation of flowering time variation in maize.

Currently, most object detectors are designed for general object detection, and they perform well on datasets such as VOC, COCO, and ImageNet [14]. In the COCO dataset, objects are categorized as small objects (area < 32²), medium objects (32² < area < 96²), and large objects (area > 96²). However, as more than half of the objects in these three widely used datasets are medium- and large-sized, most current detection approaches have struggled with detecting small objects, which are densely distributed [15]. Because of these limitations, object detection in UAV imagery has been explored using datasets such as MOR-UAV, UAV123, and VisDrone [16]. However, compared with traditional UAV imagery object detection, tassel detection based on UAV imagery encounters several challenges, including:

Manuscript received 30 June 2022; revised 25 August 2022; accepted 28 August 2022. Date of publication 13 September 2022; date of current version 23 September 2022. This work was supported in part by the Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy under Grant DE-AR0000593 and in part by the National Science Foundation (NSF) under NSF Award Number EEC-1941529. (Corresponding author: Melba Crawford.)

Wei Liu is with the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907 USA (e-mail: liu3044@purdue.edu).

Karoll Quijano is with the Department of Environmental and Ecological Engineering, Purdue University, West Lafayette, IN 47907 USA (e-mail: kquijano@purdue.edu).

Melba M. Crawford is with the Lyles School of Civil Engineering and School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907 USA (e-mail: mcrawford@purdue.edu).

Digital Object Identifier 10.1109/JSTARS.2022.3206399
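The COCO size categorization quoted in the Introduction can be written as a small helper. This is an illustrative sketch; the handling of boxes exactly at 32² or 96² follows the common COCO convention, which the strict inequalities in the text leave unspecified.

```python
def coco_size_category(width: float, height: float) -> str:
    """Classify a bounding box by area into the COCO size categories:
    small (area < 32^2), medium (32^2 <= area <= 96^2), large (area > 96^2)."""
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area <= 96 ** 2:
        return "medium"
    return "large"
```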
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
8086 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 15, 2022
Fig. 2. Network architecture of YOLOv5-tassel. A patch of a UAV RGB image is input to the detection backbone.
Fig. 4. Different neck architecture designs [60]: FPN, PANet, and BiFPN.
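The fast normalized fusion that BiFPN applies at each merge node, formalized in (3), can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' implementation: the lower-level map is assumed to be already resized to the target resolution, and the convolution applied after the fusion in (3) is omitted.

```python
import numpy as np

def fast_normalized_fusion(p5_in, p4_out, w1, w2, eps=1e-4):
    """Weighted fusion of two feature maps, BiFPN style (cf. (3)).

    p4_out is assumed to be already resized to p5_in's resolution.
    The learnable weights w1, w2 are clipped to be non-negative (ReLU)
    so the normalized coefficients behave like a soft average; eps
    keeps the denominator away from zero.
    """
    w1, w2 = max(w1, 0.0), max(w2, 0.0)  # ReLU on the fusion weights
    return (w1 * p5_in + w2 * p4_out) / (w1 + w2 + eps)
```

With equal weights the fusion reduces to (approximately) the element-wise mean of the two maps, which is the intended degenerate case of the normalization.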
P5_out = Conv((w1 · P5_in + w2 · Resize(P4_out)) / (w1 + w2 + ε))   (3)

Based on these operations, the output feature map integrates the input feature map and the intermediate feature map with different scaling features. Thus, it enhances the feature fusion in the neck module.

b_t = −(1/2)(t + μ_t) · w_t   (4)

μ_t = (1/(M − 1)) Σ_{i=1}^{M−1} x_i   (5)

where the mean is calculated over all the neurons in a channel, except for t. Based on (5), the minimal energy can be obtained
LIU et al.: YOLOv5-TASSEL: DETECTING TASSELS IN RGB UAV IMAGERY WITH IMPROVED YOLOv5 BASED ON TRANSFER LEARNING

Fig. 6. UAV platform for image acquisition [(A) Velodyne VLP-16 Lite, (B) Applanix APX-15v3, (C) Headwall Nano Hyperspec (VNIR), and (D) Sony Alpha 7RIII].

Fig. 7. Experiment location and layout: HIPS 2021 at ACRE.
as follows:
e_t* = 4((1/M) Σ_{i=1}^{M} (x_i − μ̂)² + λ) / ((t − μ̂)² + (2/M) Σ_{i=1}^{M} (x_i − μ̂)² + 2λ),   μ̂ = (1/M) Σ_{i=1}^{M} x_i   (6)

assuming all the pixels have the same distribution (which saves computation). Based on (6), the weight for each neuron is 1/e_t*. Consequently, the SimAM attention module can be described as follows:

X̃ = sigmoid(1/E) ⊙ X   (7)

where E groups e_t* across both the channel and spatial dimensions. The sigmoid function is used to avoid a weight value that is too large.

IV. EXPERIMENTAL RESULTS

A. Description of the Experiment

Data acquisition: The UAV platform used to acquire data for these experiments was a DJI Matrice M600 Pro, as shown in Fig. 6. It was equipped with a Velodyne VLP-16 Lite, an Applanix APX-15v3, a Headwall Nano Hyperspec (VNIR), and a Sony Alpha 7RIII camera. The images were collected from the high-intensity phenotype sites (HIPS) experiment at Purdue University's Agronomy Center for Research and Education (ACRE) in Indiana, USA (see Fig. 7). The data were collected during the 2020 and 2021 growing seasons. Two replications for both inbred and hybrid varieties were planted in a two-row plot layout with a plant population of 30,000 plants per acre. The UAV was flown at a height of 20 m to capture the images, and the RGB imagery was processed to a 0.25 cm pixel resolution orthophoto. The method in [61] was used to generate the orthomosaics to reduce pixelization and distortion of the tassels.

The experiment's location, layout, and details for the 2020 data are described in [38]. For the 2021 growing season, the HIPS experiment was planted on May 24th. Imagery from three dates during tasseling in the early stage of flowering was collected for annotation (July 19th, July 21st, and July 23rd). For the 2020 HIPS experiment, the data were annotated in the middle and later stages of tassel development. Compared with the 2020 HIPS annotation data, the tassels were much smaller in 2021, as they were mainly in the early stage. The percentage of small-size tassels was useful for verifying the performance of the proposed algorithm.

Data annotation: Object detection based on deep learning is a data-driven approach where the network architecture relies heavily on the dataset to train the model. Creating the dataset accurately and efficiently is not a trivial task, particularly in the complex agricultural setting. Initially, the orthophoto for the HIPS experiment was a large file with 0.25 cm spatial resolution, making it difficult to train the model directly. Thus, the orthophoto was split into small patches prior to training. The row segments were extracted using the COPE method [62], thereby providing plot boundary information. Fig. 8 shows the bottom-left coordinates (x0, y0), top-right coordinates (x1, y1), plot identification number (plot_ID), and row in plot (row_in_plot) for each row segment. For the training model, two row segments with the same plot_ID are set as one input image patch. However, as the sizes of the row segments differ, splicing two row segments with the same plot_ID often results in the shape of the generated patch not being a rectangle. To create a final rectangular patch, the plot coordinates of the row and column pairs are averaged, and the patches of the original HIPS orthophoto are generated successfully with an approximate size of 620 × 2100.

Compared with general object annotation, tassel annotation is complicated, as tassels are present at small sizes in a high-density environment with a large amount of occlusion. In addition, multiple annotators, even when trained, have different perspectives when defining a given bounding box. In the process of dataset labeling, minimizing labeling inconsistency is crucial. To the best of our knowledge, a satisfactory method to address these issues does not exist in the area of plant annotation. Here, center points derived from CenterNet were used to guide the selection of the bounding boxes for the tassels.
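The rectangular-patch construction described above can be sketched as follows. The field names (x0, y0, x1, y1, plot_ID) follow Fig. 8; the exact averaging rule is our reading of the text ("the plot coordinates of the row and column pairs are averaged"), not the authors' released code.

```python
def merge_row_segments(seg_a: dict, seg_b: dict) -> dict:
    """Merge two row segments sharing a plot_ID into one rectangular patch
    by averaging the corresponding corner coordinates of the two segments
    (a hypothetical sketch of the averaging described in the text)."""
    assert seg_a["plot_ID"] == seg_b["plot_ID"], "segments must share a plot_ID"
    return {
        "plot_ID": seg_a["plot_ID"],
        # average the paired corner coordinates so the spliced
        # patch becomes a rectangle despite unequal segment sizes
        "x0": (seg_a["x0"] + seg_b["x0"]) / 2,
        "y0": (seg_a["y0"] + seg_b["y0"]) / 2,
        "x1": (seg_a["x1"] + seg_b["x1"]) / 2,
        "y1": (seg_a["y1"] + seg_b["y1"]) / 2,
    }
```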
TABLE I
HYPERPARAMETERS OF ANCHOR SIZES

TABLE II
TASSEL DETECTION PERFORMANCE COMPARISON WITH DIFFERENT PRETRAINED DATASETS

TABLE III
COMPARISON OF RESULTS OF BASELINE METHODS ON THE TASSEL DATASET

TABLE IV
ABLATION TEST RESULTS FOR YOLOV5-TASSEL

TABLE V
DETECTION PERFORMANCE COMPARISON WITH DIFFERENT ATTENTION MODULES
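The parameter-free SimAM weighting compared against the other attention modules in Table V can be sketched in NumPy. This follows the published SimAM formulation in (5)–(7), including the M − 1 correction for the per-channel statistics; it is a sketch, not the authors' training code.

```python
import numpy as np

def simam(x: np.ndarray, lam: float = 1e-4) -> np.ndarray:
    """Parameter-free SimAM attention over a (C, H, W) feature map.

    For each channel, the inverse minimal energy 1/e_t* weights every
    neuron; a sigmoid bounds the weight, and the input is rescaled by
    the bounded weight (cf. (5)-(7))."""
    c, h, w = x.shape
    n = h * w - 1                                      # M - 1 neurons besides the target
    mu = x.mean(axis=(1, 2), keepdims=True)            # per-channel mean
    d = (x - mu) ** 2                                  # squared deviation per neuron
    var = d.sum(axis=(1, 2), keepdims=True) / n        # variance over M - 1 neurons
    inv_e = d / (4.0 * (var + lam)) + 0.5              # 1 / e_t* per neuron
    return x / (1.0 + np.exp(-inv_e))                  # x * sigmoid(1/E)
```

Because sigmoid outputs lie in (0, 1), each activation is attenuated in proportion to how "unremarkable" it is relative to its channel statistics, which is the intuition behind the energy function in (6).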
C. Ablation Study

The performance of the proposed algorithm YOLOv5-tassel was evaluated with a thorough ablation study (see Table IV). An additional detection head embedded into the original YOLOv5 improved the mAP from 42.6% to 43.6%, with the number of parameters increasing by only 2.6%. The introduction of BiFPN and SimAM boosted the mAP by 0.2% and 0.9%, respectively. In total, the proposed YOLOv5-tassel improved the mAP by 2.1%, with the size of the parameter set increasing by 12.4%. These results show that the addition of the attention mechanism significantly influences the detection of small tassels, as it enhances AP50 and mAP by 0.5% and 0.9%, respectively. To evaluate the model's complexity, the floating-point operations (FLOPs) were determined. The FLOPs value of the original YOLOv5l was 107.8, while that of the proposed model was 143.4, an increase of nearly 33%.

The success of the transformer encoder block, with its MHSA mechanism, has motivated researchers to embed it in the backbone to enhance the ability of feature map extraction (see Fig. 9). In [33], [34], the transformer block is embedded at the end of the CSPDarknet53 for UAV imagery object detection, improving the mAP metric. Thus, a comparison experiment was conducted to evaluate the effectiveness of the transformer encoder block in tassel detection based on the YOLOv5-tassel. In Table IV, compared with the proposed method, the number of parameters increased by 7.2 million after embedding the transformer block, while the metrics of AP50 and mAP decreased by 0.6% and 0.4%, respectively. There are two limitations to the detection performance of a transformer block on the tassel dataset. First, transformer blocks usually perform better on larger datasets, with their global attention mechanism requiring more parameters; however, the size of the tassel dataset is relatively small. In addition, at the end of the backbone, the small-sized tassel is down-sampled to 1/32 scale in the feature map, making it less effective for global attention. Second, the transformer block lacks image-specific inductive biases, namely the two-dimensional neighborhood structure and translation equivariance.

Table V compares the detection performance with different attention modules, including shuffle attention, CBAM, SELayer, ECALayer, and SimAM. It shows that the SimAM attention module achieves the best AP50 and mAP metrics.

D. Visualization of the Prediction Result

Besides the quantitative comparison of the proposed YOLOv5-tassel through the metrics of AP50 and mAP, examples of the detection results are shown in Fig. 10. From (a) to (f), the size and shape of tassels vary over the period of flowering. Based on (a) and (b), the tassels could be detected precisely even though they are small. In the middle stage, shown in (c) and (d), the proposed detection algorithm also performs well. In the later stage, shown in (e) and (f), while there is overlap between neighboring tassels at high density, nearly all tassels are still detected successfully. Overall, the precision and robustness of the proposed algorithm are clearly illustrated.

V. CONCLUSION

A novel algorithm referred to as YOLOv5-tassel is developed to improve tassel detection. Four detection heads and BiFPN are adopted to enhance feature fusion for small tassel detection. In addition, the attention mechanism of SimAM is introduced to extract the features of interest in the feature map. Remarkably, the mAP of YOLOv5 is boosted to 44.7% on the tassel dataset, an improvement of 2.1%. It also demonstrates that transfer learning
based on the VisDrone dataset, compared to the traditional COCO data, could enhance tassel detection based on UAV RGB imagery. In addition, CenterNet is utilized to provide a reference for the tassel dataset annotation. This study's results will help provide a foundation for further development of detection of objects of interest in agriculture based on RGB imagery acquired by UAVs.

Further work is merited in the following areas: introducing deformable convolutional networks to address the problem of tassel shape variation, combining the benefits of CNN with the transformer in the network architecture to further enhance tassel detection, and adopting unsupervised learning with domain adaptation to detect tassels with only unlabeled data in the target domain.

ACKNOWLEDGMENT

The authors would like to thank Taojun Wang, Claudia Aviles, An-te Huang, Purnima Jayaraj, and Franciele Marques Tolentino for their contributions to data collection and annotation.

REFERENCES

[1] Q. Zhou, D. Zhao, B. Shuai, Y. Li, H. Williams, and H. Xu, "Knowledge implementation and transfer with an adaptive learning network for real-time power management of the plug-in hybrid vehicle," IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 12, pp. 5298–5308, Dec. 2021.
[2] R. Xu, H. Xiang, Z. Tu, X. Xia, M.-H. Yang, and J. Ma, "V2X-ViT: Vehicle-to-everything cooperative perception with vision transformer," in Proc. Eur. Conf. Comput. Vis., 2022.
[3] X. Chen, H. Yan, T. Li, J. Xu, and F. Zhu, "Adversarial scale-adaptive neural network for crowd counting," Neurocomputing, vol. 450, pp. 14–24, 2021.
[4] A. Karami, M. Crawford, and E. J. Delp, "Automatic plant counting and location based on a few-shot learning technique," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 13, pp. 5872–5886, 2020.
[5] F. Lu et al., "HRegNet: A hierarchical network for large-scale outdoor LiDAR point cloud registration," in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 15994–16003.
[6] C. Papaioannidis, I. Mademlis, and I. Pitas, "Autonomous UAV safety by visual human crowd detection using multi-task deep neural networks," in Proc. IEEE Int. Conf. Robot. Automat., 2021, pp. 11074–11080.
[7] W. Liu, L. Xiong, X. Xia, Y. Lu, L. Gao, and S. Song, "Vision-aided intelligent vehicle sideslip angle estimation based on a dynamic model," IET Intell. Transport Syst., vol. 14, no. 10, pp. 1183–1189, 2020.
[8] D. Hong, N. Yokoya, J. Chanussot, and X. X. Zhu, "An augmented linear mixing model to address spectral variability for hyperspectral unmixing," IEEE Trans. Image Process., vol. 28, no. 4, pp. 1923–1938, Apr. 2019.
[9] L. Ruan et al., "Cooperative relative localization for UAV swarm in GNSS-denied environment: A coalition formation game approach," IEEE Internet Things J., vol. 9, no. 13, pp. 11560–11577, Jul. 2022.
[10] W. Liu, X. Xia, L. Xiong, Y. Lu, L. Gao, and Z. Yu, "Automated vehicle sideslip angle estimation considering signal measurement characteristic," IEEE Sensors J., vol. 21, no. 19, pp. 21675–21687, Oct. 2021.
[11] L. Xiong et al., "IMU-based automated vehicle body sideslip angle and attitude estimation aided by GNSS using parallel adaptive Kalman filters," IEEE Trans. Veh. Technol., vol. 69, no. 10, pp. 10668–10680, Oct. 2020.
[12] E. Durand et al., "Flowering time in maize: Linkage and epistasis at a major effect locus," Genetics, vol. 190, no. 4, pp. 1547–1562, 2012.
[13] M. L. Buchaillot et al., "Evaluating maize genotype performance under low nitrogen conditions using RGB UAV phenotyping techniques," Sensors, vol. 19, no. 8, 2019, Art. no. 1815.
[14] L. Liu et al., "Deep learning for generic object detection: A survey," Int. J. Comput. Vis., vol. 128, no. 2, pp. 261–318, 2020.
[15] G. Chen et al., "A survey of the four pillars for small object detection: Multiscale representation, contextual information, super-resolution, and region proposal," IEEE Trans. Syst., Man, Cybern.: Syst., vol. 52, no. 2, pp. 936–953, Feb. 2020.
[16] P. Zhu et al., "Detection and tracking meet drones challenge," 2020, arXiv:2001.06303.
[17] K. Duan, S. Bai, L. Xie, H. Qi, Q. Huang, and Q. Tian, "CenterNet: Keypoint triplets for object detection," in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2019, pp. 6569–6578.
[18] H. Law and J. Deng, "CornerNet: Detecting objects as paired keypoints," in Proc. Eur. Conf. Comput. Vis., 2018, pp. 734–750.
[19] X. Zhou, J. Zhuo, and P. Krahenbuhl, "Bottom-up object detection by grouping extreme and center points," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2019, pp. 850–859.
[20] S. Zhang, C. Chi, Y. Yao, Z. Lei, and S. Z. Li, "Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2020, pp. 9756–9765.
[21] Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, "YOLOX: Exceeding YOLO series in 2021," 2021, arXiv:2107.08430.
[22] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," 2018, arXiv:1804.02767.
[23] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal speed and accuracy of object detection," 2020, arXiv:2004.10934.
[24] Q. Chen, Y. Wang, T. Yang, X. Zhang, J. Cheng, and J. Sun, "You only look one-level feature," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2021, pp. 13034–13043.
[25] G. Jocher, "YOLOv5," 2020. [Online]. Available: https://github.com/ultralytics/yolov5
[26] J. R. Uijlings, K. E. Van De Sande, T. Gevers, and A. W. Smeulders, "Selective search for object recognition," Int. J. Comput. Vis., vol. 104, no. 2, pp. 154–171, 2013.
[27] R. Girshick, "Fast R-CNN," in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 1440–1448.
[28] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," in Proc. Adv. Neural Inf. Process. Syst., 2015, vol. 28, pp. 1137–1149.
[29] Z. Cai and N. Vasconcelos, "Cascade R-CNN: Delving into high quality object detection," in Proc. IEEE Int. Conf. Comput. Vis., 2018, pp. 6154–6162.
[30] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2980–2988.
[31] Z. Tian, C. Shen, H. Chen, and T. He, "FCOS: Fully convolutional one-stage object detection," in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2019, pp. 9626–9635.
[32] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2980–2988.
[33] X. Zhu, S. Lyu, X. Wang, and Q. Zhao, "TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios," in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 2778–2788.
[34] Z. Zhang, X. Lu, G. Cao, Y. Yang, L. Jiao, and F. Liu, "ViT-YOLO: Transformer-based YOLO for object detection," in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 2799–2808.
[35] G. Tian, J. Liu, and W. Yang, "A dual neural network for object detection in UAV images," Neurocomputing, vol. 443, pp. 292–301, 2021.
[36] X. Wu, D. Hong, J. Tian, J. Chanussot, W. Li, and R. Tao, "ORSIm detector: A novel object detection framework in optical remote sensing imagery using spatial-frequency channel features," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 7, pp. 5146–5158, Jul. 2019.
[37] X. Wu, W. Li, D. Hong, R. Tao, and Q. Du, "Deep learning for unmanned aerial vehicle-based object detection and tracking: A survey," IEEE Geosci. Remote Sens. Mag., vol. 10, no. 1, pp. 91–124, Mar. 2022.
[38] A. Karami, K. Quijano, and M. Crawford, "Advancing tassel detection and counting: Annotation and algorithms," Remote Sens., vol. 13, no. 15, 2021, Art. no. 2881.
[39] S. Oh et al., "Plant counting of cotton from UAS imagery using deep learning-based object detection framework," Remote Sens., vol. 12, no. 18, 2020, Art. no. 2981.
[40] E. Cai, S. Baireddy, C. Yang, E. J. Delp, and M. Crawford, "Panicle counting in UAV images for estimating flowering time in sorghum," in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2021, pp. 6280–6283.
[41] B. Gong, D. Ergu, Y. Cai, and B. Ma, "Real-time detection for wheat head applying deep neural network," Sensors, vol. 21, no. 1, 2020, Art. no. 191.
[42] Y. Wu, Y. Hu, and L. Li, "BTWD: Bag of tricks for wheat detection," in Proc. Eur. Conf. Comput. Vis., Springer, 2020, pp. 450–460.
[43] E. C. Tetila et al., "Automatic recognition of soybean leaf diseases using UAV images and deep convolutional neural networks," IEEE Geosci. Remote Sens. Lett., vol. 17, no. 5, pp. 903–907, May 2020.
[44] M. Bhandari et al., "Assessing winter wheat foliage disease severity using aerial imagery acquired from small unmanned aerial vehicle," Comput. Electron. Agriculture, vol. 176, 2020, Art. no. 105665.
[45] M.-H. Guo et al., "Attention mechanisms in computer vision: A survey," Comput. Visual Media, vol. 8, pp. 331–368, 2022.
[46] H. Cao, G. Chen, J. Xia, G. Zhuang, and A. Knoll, "Fusion-based feature attention gate component for vehicle detection based on event camera," IEEE Sensors J., vol. 21, no. 21, pp. 24540–24548, Nov. 2021.
[47] G. Chen, H. Cao, J. Conradt, H. Tang, F. Rohrbein, and A. Knoll, "Event-based neuromorphic vision for autonomous driving: A paradigm shift for bio-inspired visual sensing and perception," IEEE Signal Process. Mag., vol. 37, no. 4, pp. 34–49, Jul. 2020.
[48] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 7132–7141.
[49] H. Xue, M. Sun, and Y. Liang, "ECANet: Explicit cyclic attention-based network for video saliency prediction," Neurocomputing, vol. 468, pp. 233–244, 2022.
[50] J. Dai et al., "Deformable convolutional networks," in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 764–773.
[51] A. Dosovitskiy et al., "An image is worth 16x16 words: Transformers for image recognition at scale," in Proc. Int. Conf. Learn. Representations, 2021. [Online]. Available: https://openreview.net/forum?id=YicbFdNTTy
[52] R. Xu, Z. Tu, H. Xiang, W. Shao, B. Zhou, and J. Ma, "CoBEVT: Cooperative bird's eye view semantic segmentation with sparse transformers," 2022, arXiv:2207.02202.
[53] Z. Liu et al., "Swin transformer: Hierarchical vision transformer using shifted windows," in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 10012–10022.
[54] Y. Chen, X. Dai, M. Liu, D. Chen, L. Yuan, and Z. Liu, "Dynamic convolution: Attention over convolution kernels," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2020, pp. 11030–11039.
[55] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional block attention module," in Proc. Eur. Conf. Comput. Vis., 2018, pp. 3–19.
[56] L. Yang, R.-Y. Zhang, L. Li, and X. Xie, "SimAM: A simple, parameter-free attention module for convolutional neural networks," in Proc. Int. Conf. Mach. Learn., 2021, pp. 11863–11874.
[57] K. Han et al., "SCNet: Learning semantic correspondence," in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 1849–1858.
[58] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2017, pp. 936–944.
[59] S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, "Path aggregation network for instance segmentation," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2018, pp. 8759–8768.
[60] M. Tan, R. Pang, and Q. V. Le, "EfficientDet: Scalable and efficient object detection," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2020, pp. 10778–10787.
[61] Y.-C. Lin, T. Zhou, T. Wang, M. Crawford, and A. Habib, "New orthophoto generation strategies from UAV and ground remote sensing platforms for high-throughput phenotyping," Remote Sens., vol. 13, no. 5, 2021, Art. no. 860.
[62] C. Yang, S. Baireddy, E. Cai, M. Crawford, and E. J. Delp, "Field-based plot extraction using UAV RGB images," in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 1390–1398.
[63] B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman, "LabelMe: A database and web-based tool for image annotation," Int. J. Comput. Vis., vol. 77, no. 1, pp. 157–173, 2008.
[64] Z. Zheng, P. Wang, W. Liu, J. Li, R. Ye, and D. Ren, "Distance-IoU loss: Faster and better learning for bounding box regression," in Proc. AAAI Conf. Artif. Intell., 2020, vol. 34, no. 07, pp. 12993–13000.
[65] K. Chen et al., "MMDetection: Open MMLab detection toolbox and benchmark," 2019, arXiv:1906.07155.
[66] Q.-L. Zhang and Y.-B. Yang, "SA-Net: Shuffle attention for deep convolutional neural networks," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2021, pp. 2235–2239.

Wei Liu (Graduate Student Member, IEEE) received the B.E. degree in vehicle engineering from the Wuhan University of Technology, Wuhan, China, and the M.E. degree in vehicle engineering from Tongji University, Shanghai, China. He is currently working toward the Ph.D. degree in electrical and computer engineering with the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA. His research interests include deep learning, computer vision, autonomous driving, and UAV remote sensing.
Karoll Quijano (Graduate Student Member, IEEE) received the B.S. degree in environmental engineering from the Universidad Distrital Francisco José de Caldas, Bogota, Colombia, in 2016, and the M.S. degree in environmental and ecological engineering from Purdue University, West Lafayette, IN, USA, in 2020. She is currently working toward the Ph.D. degree in environmental and ecological engineering with Purdue University. Her research interests include remote sensing for agriculture, UAV-based hyperspectral imagery and LiDAR, precision agriculture, and crop growth modeling.

Melba M. Crawford (Life Fellow, IEEE) is a Nancy Uridil and Francis Bossu Professor of Civil Engineering with Purdue University, West Lafayette, IN, USA, where she is also a Professor in the Schools of Electrical and Computer Engineering and the Department of Agronomy. Previously, she was an Engineering Foundation Endowed Professor in Mechanical Engineering with the University of Texas at Austin, Austin, TX, USA, where she founded an interdisciplinary research and applications development program in space-based and airborne remote sensing. She has authored or coauthored more than 200 publications in scientific journals, conference proceedings, book chapters, and technical reports. Her research focuses on the development of machine learning-based algorithms for classification and prediction, and applications of these methods to hyperspectral and LiDAR remotely sensed data.

Dr. Crawford is a Fellow and Life Member of the IEEE, Past President of the IEEE Geoscience and Remote Sensing Society (GRSS), an IEEE GRSS Distinguished Lecturer, and past Treasurer of the IEEE Technical Activities Board. She received the GRSS Outstanding Service Award in 2020 and the IEEE GRSS David Landgrebe Research Award in 2021.