
Airplane Detection Based on Unsupervised Deep Domain Adaptation in Remote Sensing Images


Youssef BEN YOUSSEF (youssef.benyoussef@uhp.ac.ma)
Université Hassan 1er
Soufiane Lyaqini
Université Hassan 1er
Khalid Fakhar 
Université Hassan 1er
Elhassane Abdelmounim 
Université Hassan 1er

Research Article

Keywords: Deep Domain Adaptation, Deep Convolutional Neural Network, Computer Vision, Object
Detection, Deep Transfer Learning

Posted Date: September 30th, 2022

DOI: https://doi.org/10.21203/rs.3.rs-2088221/v1

License:   This work is licensed under a Creative Commons Attribution 4.0 International License.  
Springer Nature 2021 LATEX template

Airplane Detection Based on Unsupervised Deep Domain Adaptation in Remote Sensing Images
Youssef Ben Youssef1*, Soufiane Lyaqini2, Khalid Fakhar3 and Elhassane Abdelmounim4
1* GEER, Hassan First University of Settat, Ecole Nationale des Sciences Appliquées,
Avenue de l’université, B.P. 218, Berrechid, 26100, Morocco.
2 GIM, Hassan First University of Settat, Ecole Nationale des Sciences Appliquées,
Avenue de l’université, B.P. 218, Berrechid, 26100, Morocco.
3 GIM, Hassan First University of Settat, Ecole Nationale des Sciences Appliquées,
Avenue de l’université, B.P. 218, Berrechid, 26100, Morocco.
4 PA, Hassan First University of Settat, Faculté des Sciences,
Km 3, B.P. 577, Route de Casablanca, Settat, 26000, Morocco.

*Corresponding author(s). E-mail(s): youssef.benyoussef@uhp.ac.ma;
Contributing authors: lyaqini.soufiane@gmail.com;
kld.fakhar@gmail.com; hassan.abdelmounim@hotmail.fr;

Abstract
In this work, we exploit the detection and localization capabilities of a
pre-trained Faster Region-based Convolutional Neural Network (Faster R-CNN)
with ResNet50 as the backbone of the architecture. The model was trained on
the large benchmark dataset Common Objects in Context (MSCOCO), which serves
as the source domain. The pre-trained model is then used in Unsupervised Deep
Domain Adaptation (UDDA) for airplane detection and localization in Remote
Sensing Images (RSI), which serve as the target domain. We evaluate the
proposed approach on images containing multiple airplanes of different scales
and types, collected from the public dataset for Object Detection in Aerial
Images (DOTA) and from airport images extracted from Google Earth. Extensive
experiments in a cloud environment show the usefulness of the proposed
approach in terms of detection score. The proposed UDDA algorithm is a
nondeterministic machine learning approach for detecting airplanes in RSI.

Keywords: Deep Domain Adaptation, Deep Convolutional Neural Network,
Computer Vision, Object Detection, Deep Transfer Learning

1 Introduction
With the rapid growth in the number of satellites and in available data,
extracting meaningful information from Remote Sensing Images (RSI) requires
tools based on computer vision and machine learning. Object detection is one
of the most fundamental and challenging problems in computer vision. Recently,
several studies have used neural networks to detect and localize objects in
RSI for many applications [1]. In response to this multitude of applications,
several neural network architectures have been developed [2]. Improving a
neural network mainly means restructuring its hidden layers so that the
architecture is optimized for a given task [3]. Neural network architectures
first achieved success in image classification tasks. In object detection, the
model must recognize one or several objects in a single image, draw a boundary
around each object, and label it. Thanks to recent developments in deep
learning, detection and localization systems have been proposed to address
these requirements. Among the different deep neural network architectures,
CNNs are the most relevant and have driven progress in image and video
processing. The first CNN design was proposed by LeCun et al. [4]. Object
detection combines two separate tasks: classification and localization. An
object detection algorithm uses derived features and learning algorithms to
detect an object and find its location. Several civil and military
applications rely on object detection in RSI, including aircraft detection,
airport security, and flight tracking. In previous work, we used Deep CNNs to
classify aircraft [5, 6]. Image object detectors fall into two categories:
one-stage methods, which predict the bounding boxes of the regions of interest
directly, and two-stage methods, which first propose candidate object bounding
boxes. Two-stage detectors identify the subsets of the image that might
contain an object (region proposals) and then classify them to make
predictions within the proposed regions. The advantage of the second category
is its high detection accuracy [7]. A Deep CNN consists of many stacked
layers; the basic convolutional neural network architecture contains:
i) convolutional layers, ii) pooling layers, and iii) fully connected layers,
which map the representation between the input and the output. A softmax
activation function in the output layer gives a posterior distribution over
the class labels; localization is predicted by a separate bounding box
regression output. One of the CNN models commonly used as the feature
extractor (backbone) of two-stage detectors is the Residual Network (ResNet).
In ResNet, the input of a block does not have to pass only through the next
layer: a shortcut connection lets it skip one or more layers and be added to
their output [8]. According to application needs, several CNN-based detector
architectures have been proposed in the literature, such as R-CNN [9, 10],
Fast R-CNN, which is faster than R-CNN [11], and Faster R-CNN, which removes
some of the remaining limitations [12]. We use ResNet in this work because of
the performance and popularity of cutting-edge object detection models built
on it and because of its ability to mitigate the vanishing gradient problem in
CNNs. Furthermore, we selected Faster R-CNN, the most representative two-stage
object detection model, which combines region proposal and classification. In
this study, ResNet is employed as the core architecture of the generic feature
extraction module. Among the ResNet variants, ResNet50 is a popular CNN model
that is 50 layers deep and is used for classification. The network starts with
a convolutional layer of 64 filters with a kernel size of 7 × 7 and a stride
of 2, followed by a 3 × 3 max pooling layer with a stride of 2. The remaining
portion of the network is divided into four further stages built from blocks
of three convolutional layers with kernel sizes of 1, 3, and 1, respectively,
connected through shortcut connections. These blocks are repeated three, four,
six, and three times in the second, third, fourth, and fifth stages,
respectively, and are followed by an average pooling layer and a fully
connected layer with 1000 nodes and a softmax, which gives 50 layers in total.
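The total of 50 layers can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the standard ResNet50 layout of [8] (one initial convolution, bottleneck blocks repeated 3, 4, 6, and 3 times, and one fully connected layer), and the function name is hypothetical.

```python
def count_weighted_layers(blocks_per_stage=(3, 4, 6, 3), convs_per_block=3):
    """Count the convolutional and fully connected layers of a bottleneck ResNet."""
    stem = 1  # initial 7x7 convolution
    body = sum(blocks_per_stage) * convs_per_block  # 1x1, 3x3, 1x1 in each block
    head = 1  # final fully connected layer
    return stem + body + head


print(count_weighted_layers())  # 50
```

Pooling layers carry no weights, which is why they are not counted in the total.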
Among machine learning approaches, supervised deep learning assumes that the
test data are randomly sampled from the same distribution as the annotated
dataset, known as the source domain. A deep learning model requires the same
processing stages as any machine learning model: training, and testing or
validation. Training on a small dataset can cause over-fitting, and the model
then fails to generalize. Furthermore, acquiring and annotating a huge number
of RSI for deep learning algorithms would be expensive. Despite their
significant impact on a variety of tasks, deep learning models are difficult
to use in practice because they need a huge amount of annotated data and
cannot generalize to data with shifted distributions. To overcome these
problems, deep transfer learning transfers knowledge from prior training data
to new training data, which accelerates the process and uses less memory [13].
Owing to its compelling results in many domains, deep transfer learning has
attracted a lot of attention from different areas. Domain adaptation is a form
of deep transfer learning in which we use the learner’s knowledge to solve a
related but distinct problem [14]. The main goal of domain adaptation is to
derive an object detector for the unlabeled (target) data from the labeled
(source) data [15]. Several object detection algorithms have been presented in
the past decade, and Faster R-CNN is a popular choice for domain adaptive
detection. In this context, the UDDA formulation assumes that the source
domain is labeled while no labels are available for the target domain. UDDA is
commonly used in learning tasks without target labels since it directly maps
the source features into the target feature space [16, 17]. Thus, it can be
used to adapt deep networks to a possibly smaller, unlabeled dataset. Due to
the limited availability of labeled datasets in RSI, we used UDDA based on
deep transfer learning with a pre-trained model to detect airplanes in a given
image.
The significant contributions of this study are the following:
• We used UDDA, which relies on full supervision in the source domain and
no supervision in the target domain.
• Faster R-CNN was chosen as the detector in the source domain, and we
improved its ability for object detection in the target domain.
• Network weights are shared and unsupervised domain adaptation is applied
for airplane detection and localization in RSI.
• Our model detects and localizes airplanes at different scales and
orientations against complex backgrounds without modifying the model’s
parameters.
The remainder of the document is organized as follows: Section 2 reviews the
related work. Section 3 presents the basic theory, while Section 4 details the
proposed methodology. Experimental results and discussion are presented in
Section 5. Finally, a conclusion is provided in Section 6.

2 Related Work
Several papers in the literature have developed object detector models based
on deep domain adaptation. Shen et al. proposed Deeply Supervised Object
Detectors (DSOD); the authors argued that deep supervision with a densely
connected network architecture could significantly reduce optimization
problems [18]. Rochan et al. used word vectors to build a relationship between
the weakly annotated source domain and the target domain; information was then
transferred from the source bounding boxes to the target objects based on this
relationship [19]. Many researchers have applied Deep CNNs to object detection
in RSI, including ship detection [20]. Wu et al. used a method based on R-CNN
to detect airplanes in the selective search context; the authors extracted
region proposals and then classified them [21]. A model for detecting
unlabeled samples in RSI was proposed based on the Local Binary Pattern (LBP)
algorithm; the authors use LBP to extract feature vectors in the target domain
and use hybrid regularization in transfer learning. Cai et al. proposed a
feature-shared transform network that uses the general information in the
bottom layers to boost performance. Furthermore, the authors add a special
regularization term to the loss function to alleviate the negative effect of
the vanishing gradient phenomenon [14]. Deep transfer learning is an important
method in machine learning for problems with little training data. It includes
all techniques that minimize the effort of developing new models by
transferring knowledge instead of training from scratch. UDDA focuses on
unsupervised machine learning tasks in both the source and the target domain
[22]; this method can provide faster results because the object detectors have
already been trained on a large number of images in the source domain. We
focused on airplane detection, which is more challenging because both the
object location and the class need to be predicted. In the last decade, UDDA
has been identified as a prominent part of machine learning advancement. It is
already being applied in a variety of fields, including computer vision and
natural language processing [23]. Unsupervised deep transfer learning is deep
transfer learning that uses unannotated data from the target domain [24]. The
authors in [25] introduce a generic unsupervised deep learning approach to
train deep models without manual label supervision, using a strategy that
learns the underlying class decision boundaries iteratively. Depending on the
properties of the data and on the approach used, different deep domain
adaptation scenarios are presented in [15, 26]. Hosang et al. studied the
performance of region proposal algorithms in depth on various datasets and
revealed that these algorithms have low repeatability and are therefore not
robust to noise and disturbance; the model architecture and the parameters
extracted from natural or synthetic scenes are transferred to object detection
in the remote sensing target domain [27]. Han et al. proposed a
semi-supervised generative model that generates a new training set by
combining pre-trained CNN and SVM classifiers [28]. Wei et al. proposed
X-LineNet, a one-stage, anchor-free model for aircraft detection; it recasts
aircraft detection in RSI as the prediction and grouping of connected
intersecting line segments, allowing the network to learn from visual grammar
information [29]. Teng et al. proposed adversarial domain adaptation in RSI to
align the feature distributions of the source and the target for the
classification task [30]. To improve the precision of airplane and car
detection in RSI, Ding et al. enhanced the structure of the VGG16-Net
algorithm [31]. Chen et al. proposed a domain adaptation Faster R-CNN
algorithm to detect aircraft in RSI as the target task; the authors used
images extracted from the DOTA dataset as the source domain and images with
different brightness conditions as the target domain, and achieved an average
precision score of 54.28% [32].

3 Basic theory
The domain adaptation concept is based on the similarity of two domains: a
model learned in one domain is applied to a different but related domain.
Consider the source domain $D_s = \{X_s^i, Y_s^i\}_{i=1}^{N_s}$, where $N_s$
is the number of images, $X_s^i$ denotes the $i$-th image, and $Y_s^i$ denotes
the bounding boxes and respective object labels in the corresponding image. In
domain adaptation (DA), we denote the target dataset as
$D_t = \{X_t^i, Y_t^i\}_{i=1}^{N_t}$, which has $N_t$ labeled target domain
images $X_t^i$, where $Y_t^i$ denotes the object labels in the corresponding
image. In UDDA, we denote the target dataset as $D_t = \{X_t^i\}_{i=1}^{N_t}$,
which has $N_t$ target domain images $X_t^i$ without labels. Each domain is
characterised by four parts: the feature space $\mathcal{X}$, the label space
$\mathcal{Y}$, the marginal probability distribution $P(X)$, and the
conditional probability distribution $P(Y \mid X)$, where $X \in \mathcal{X}$
and $Y \in \mathcal{Y}$. Since the source domain and the target domain are
different, they have different data distributions,
$P(X_s^i) \neq P(X_t^i)$. The UDDA algorithm identifies a labeling hypothesis
$h : \mathcal{X} \to [0, 1]$ from a space $\mathcal{H}$ of labeling or scoring
functions such that the following expected error over the target distribution
is minimized [33].

$$\mathrm{err}(h) = \mathbb{E}_{x,y}\,\lvert h(x) - y \rvert \tag{1}$$

In the improved domain adaptation Faster R-CNN algorithm, the
$\mathcal{H}$-distance is presented as a measure of similarity between domains
[33] and is defined as

$$d_{\mathcal{H}}(D_s, D_t) = 2\,(\mathrm{err}_s - \mathrm{err}_t) \tag{2}$$

where $\mathrm{err}_s$ and $\mathrm{err}_t$ indicate the error probability of
the source and target domains, respectively. The following equation is used to
express the distance between the two domains:

$$d_{\mathcal{H}}(s, t) = 2\left(1 - \min_{h \in \mathcal{H}}\big(\mathrm{err}_s(h(x_s)) + \mathrm{err}_t(h(x_t))\big)\right) \tag{3}$$

In order to align the distributions of the two domains, the distance
$d_{\mathcal{H}}(s, t)$ should be minimized:

$$\min_{f} d_{\mathcal{H}}(s, t) \iff \max_{f} \min_{h \in \mathcal{H}}\big(\mathrm{err}_s(h(x_s)) + \mathrm{err}_t(h(x_t))\big) \tag{4}$$

where $f$ denotes the feature vector generated by the network. An adversarial
objective for domain adaptation has been used to minimize the domain
discrepancy in [34, 35].
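To make Eq. (3) concrete, the distance can be evaluated over a small, finite set of domain classifiers. The sketch below is an illustration under stated assumptions and not the procedure used in this work: features are one-dimensional, the hypothesis space is a hypothetical set of threshold classifiers that output 0 for "source" and 1 for "target", and all function names are invented for the example.

```python
def domain_error(h, samples, domain_label):
    """Fraction of samples for which h does not output the true domain label."""
    return sum(1 for x in samples if h(x) != domain_label) / len(samples)


def h_distance(hypotheses, source, target):
    """Eq. (3): 2 * (1 - min_h (err_s(h) + err_t(h))) over a finite hypothesis set."""
    best = min(
        domain_error(h, source, 0) + domain_error(h, target, 1) for h in hypotheses
    )
    return 2.0 * (1.0 - best)


# Hypothetical hypothesis space: predict "target" (1) when the feature exceeds t.
thresholds = (0.0, 0.5, 1.0)
hypotheses = [lambda x, t=t: int(x > t) for t in thresholds]

separable = h_distance(hypotheses, [0.1, 0.2, 0.3], [0.7, 0.8, 0.9])
identical = h_distance(hypotheses, [0.1, 0.9], [0.1, 0.9])
print(separable, identical)  # 2.0 0.0
```

When a classifier separates the two domains perfectly, the distance reaches its maximum of 2; when the domains are indistinguishable to every classifier in the set, it drops to 0, which is the situation that feature alignment aims for.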

4 Design Methodology
This section details the methodology. Many effective deep learning algorithms
require a large number of natural images, and overfitting is a common issue
with limited datasets. To overcome this problem in RSI, deep transfer learning
is adopted. The resulting approach is based on UDDA, which yields an airplane
detection model that needs no target labels.

4.1 Deep transfer learning Model


The model is a Faster R-CNN network with ResNet50 as the backbone. In this
paper, the 50-layer convolutional neural network ResNet50 is applied to detect
objects in natural images, and Faster R-CNN is used to generate the regions of
interest that may contain targets. For our experiments, the ResNet50 and
Faster R-CNN models were trained on the benchmark MSCOCO dataset, which
comprises 328k images containing 91 object categories with 2.5 million object
instances against varying backgrounds; we use it as the source domain [36].
Relevant features extracted from the source domain can then be used in a
totally unsupervised setting by jointly learning the parameters in the target
domain. These networks are publicly available in PyTorch [37]. All our
networks were implemented using PyTorch in the Google Colaboratory environment
with the following hyper-parameters: 10 epochs, a learning rate of 0.001, and
a batch size of 28. The block diagram of airplane detection using UDDA is
shown in Fig. 1. The model is trained by minimizing a loss function that
combines the classification and bounding box regression losses.

Fig. 1 The block diagram adopted

Originally, the trained and tested model was stored as a Hierarchical Data
Format (HDF5) file, a format designed to store and organize large amounts of
data. The file is obtained from the open source deep learning library Detecto
[38]. This package is designed to detect objects in videos; nonetheless, we
used it to detect airplanes in this work. We used the same network design and
shared its weights in a new task, the target domain, in which the dataset is
unlabeled. This work includes 50 images with over 9000 objects collected from
the DOTA dataset and Google Earth. The images in DOTA are acquired from many
types of sensors and reflect the diversity of the samples. Each image contains
multiple airplanes, and the majority include regions densely packed with
airplanes [39]. To address the issue presented above in the remote sensing
target domain, the model’s parameters are shared in order to detect and
localize airplanes in the context of UDDA. Fig. 2 shows a sample of the
unlabeled images used as the target domain.
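The weight-sharing step described above can be sketched with the COCO pre-trained Faster R-CNN that ships with torchvision. This is a minimal sketch and not the authors' code: it assumes torchvision 0.13 or later, uses the `fasterrcnn_resnet50_fpn` variant (ResNet50 with a feature pyramid), and the helper names, the image path argument, and the 0.5 score threshold are hypothetical. Category index 5 is "airplane" in the COCO label list used by torchvision.

```python
AIRPLANE_ID = 5  # "airplane" in the COCO category list used by torchvision


def keep_airplanes(labels, scores, boxes, min_score=0.5):
    """Return (box, score) pairs for airplane detections above a score threshold."""
    return [
        (box, score)
        for label, score, box in zip(labels, scores, boxes)
        if label == AIRPLANE_ID and score >= min_score
    ]


def detect_airplanes(image_path, min_score=0.5):
    """Run a COCO pre-trained Faster R-CNN on one unlabeled target-domain image."""
    # Heavy imports stay local so the filtering helper works without them.
    import torch
    from PIL import Image
    from torchvision.models.detection import fasterrcnn_resnet50_fpn
    from torchvision.transforms.functional import to_tensor

    model = fasterrcnn_resnet50_fpn(weights="DEFAULT")  # source-domain weights
    model.eval()  # inference only: the shared weights are not modified
    image = to_tensor(Image.open(image_path).convert("RGB"))
    with torch.no_grad():
        output = model([image])[0]  # dict with "boxes", "labels", "scores"
    return keep_airplanes(
        output["labels"].tolist(),
        output["scores"].tolist(),
        output["boxes"].tolist(),
        min_score,
    )
```

Calling `detect_airplanes` on a remote sensing image returns the bounding boxes and scores of the airplane class only; detections of other COCO categories are dropped by the filter.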

4.2 Unsupervised Deep Domain Adaptation


In the target domain, the shared weights are obtained after training and
testing the model on the benchmark MSCOCO dataset, which is one of the most
popular large-scale labeled natural image datasets publicly available.

Fig. 2 Sample images in the target domain.

In the literature, many metrics are used to measure how well an algorithm
performs in object detection tasks. One of the most popular metrics in object
detection and localization is the mean Average Precision, mAP (score), which
combines the Intersection over Union (IoU), precision, recall, and the
precision-recall curve and is defined as

$$\mathrm{mAP} = \frac{1}{|\mathrm{classes}|} \sum_{c \in \mathrm{classes}} \frac{TP_c}{FP_c + TP_c} \tag{5}$$
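The two ingredients of Eq. (5) can be sketched in a few lines. The example below is illustrative only: boxes are assumed to be given as (x_min, y_min, x_max, y_max), the per-class counts are made-up numbers, and the function names are hypothetical. A detection is usually counted as a true positive when its IoU with a ground-truth box reaches a chosen threshold, commonly 0.5.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min, iy_min = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix_max, iy_max = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_precision(counts):
    """Eq. (5): average over classes of TP_c / (FP_c + TP_c).

    counts maps a class name to (TP, FP); a class without detections scores 0.
    """
    per_class = [
        tp / (tp + fp) if (tp + fp) > 0 else 0.0 for tp, fp in counts.values()
    ]
    return sum(per_class) / len(per_class)


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
score = mean_precision({"airplane": (9, 1), "car": (1, 1)})  # (0.9 + 0.5) / 2
```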

5 Result and Discussions


We provide example visual results and detection scores to demonstrate the
model’s ability to handle detection and localization simultaneously. The
visual detection results and the scores achieved are shown in Figures 3, 4,
and 5. The aim of this work is to detect and localize airplanes in RSI. As
shown in Fig. 3(b) and Fig. 4(b), the airplane is detected with a score of
90.38% and 99.41%, respectively, which indicates that the features learned in
the source domain transfer well to these images. In addition, other objects
are detected and localized with different scores. Because the source domain is
rich in items used in the training phase, other objects can be detected with
differing scores despite their absence in the target domain. We also apply the
proposed approach to a known difficult case to assess its robustness. Despite
the poor quality of the aircraft in Figures 5(a) and 5(b), the system detects
an aircraft with a score of 97.87%. The objects present in the image are
detected and localized with varying scores, as seen in the output images
below.

Fig. 3 (a) Airplane detection, (b) object detection score achieved

Fig. 4 (a) Airplane detection, (b) object detection score achieved

To assess our approach, we compared our proposed work with the latest work.
Our model improves airplane detection and localization, and also detects
other objects. As shown in Table 1, the ResNet50 model used in this work
provides the highest score.

Table 1 Comparison with other approaches

Models         Backbone    Score
Ref. [32]      ZF          88.13%
Ref. [32]      VGGM        89.76%
Ref. [32]      VGG16       90.17%
Our approach   ResNet50    99.41%

Fig. 5 (a) Airplane detection, (b) object detection score achieved

6 Conclusion
This paper presents an unsupervised domain adaptation-based approach for
airplane detection and localization. The model is pre-trained on the large
labeled MSCOCO dataset and subsequently transfers the learned information to
the unlabeled target domain to detect and localize airplanes in RSI. The
approach was applied to unannotated RSI and the results were reported.
Moreover, a comparison with other approaches indicates that our method
improves the detection and localization of airplanes in remote sensing images.
Although the proposed approach yields a high score, several aspects remain to
be improved in future work.

Declarations
Ethics approval
Not applicable.

Competing interests
The authors affirm that the research was conducted in the absence of any
commercial or financial relationships that could be construed as a potential
conflict of interest.

Funding
This research received no specific grant from any funding agency in the public,
commercial, or not-for-profit sectors.

Availability of data and materials


The datasets generated and analysed during the current study are available
from the corresponding author on reasonable request.

Authors’ contributions
YBY: Visualization, Investigation, Software, Validation, Writing - review &
editing.
SL and KF: Participated in the associated basic theory and implementation.
EA: Supervised the research.
All authors contributed to article revision and read and approved the
submitted version.

References
[1] Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y., Alsaadi, F.: A survey of deep
neural network architectures and their applications. Neurocomputing 234,
11–26 (2017). https://doi.org/10.1016/j.neucom.2016.12.038

[2] Amirian, S., Wang, Z., Taha, T.R., Arabnia, H.R.: Dissection of deep
learning with applications in image recognition. In: 2018 International
Conference on Computational Science and Computational Intelligence
(CSCI), pp. 1142–1148 (2018). https://doi.org/10.1109/CSCI46756.2018.
00221

[3] Wu, X., Sahoo, D., Hoi, S.C.: Recent advances in deep learning for object
detection. Neurocomputing 396, 39–64 (2020)

[4] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning
applied to document recognition. Proceedings of the IEEE 86(11), 2278–
2324 (1998)

[5] Youssef, Y.B., Merrouchi, M., Abdelmounim, E., Gadi, T.: Aircraft type
classification in remote sensing images using deep learning. In: 2020 IEEE
2nd International Conference on Electronics, Control, Optimization and
Computer Science (ICECOCS), pp. 1–6 (2020).
https://doi.org/10.1109/ICECOCS50124.2020.9314611

[6] Youssef, Y.B., Merrouchi, M., Abdelmounim, E., Gadi, T.: Classification of
aircraft in remote sensing images based on deep convolutional neural
networks. Statistics, Optimization and Information Computing 10(1), 4–11
(2022). https://doi.org/10.19139/soic-2310-5070-1143

[7] Liu, L., Ouyang, W., Wang, X., Fieguth, P., Chen, J., Liu, X., Pietikäinen, M.:
Deep learning for generic object detection: A survey. International Journal
of Computer Vision 128(2), 261–318 (2020).
https://doi.org/10.1007/s11263-019-01247-4

[8] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image
recognition. In: 2016 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 770–778 (2016).
https://doi.org/10.1109/CVPR.2016.90

[9] Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies
for accurate object detection and semantic segmentation. In: Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition,
pp. 580–587 (2014)

[10] Agrawal, P., Girshick, R., Malik, J.: Analyzing the performance of multi-
layer neural networks for object recognition. In: European Conference on
Computer Vision, pp. 329–344 (2014). Springer

[11] Girshick, R.: Fast r-cnn. In: Proceedings of the IEEE International
Conference on Computer Vision, pp. 1440–1448 (2015)

[12] Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards real-time
object detection with region proposal networks. Advances in Neural
Information Processing Systems 28 (2015)

[13] Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Transactions
on Knowledge and Data Engineering 22(10), 1345–1359 (2010). https:
//doi.org/10.1109/TKDE.2009.191

[14] Cai, G., Wang, Y., He, L., Zhou, M.: Unsupervised domain adaptation
with adversarial residual transform networks. IEEE transactions on neural
networks and learning systems 31(8), 3073–3086 (2020). https://doi.org/
10.1109/tnnls.2019.2935384

[15] Wang, M., Deng, W.: Deep visual domain adaptation: A survey.
Neurocomputing 312, 135–153 (2018)

[16] Kouw, W.M., Loog, M.: A review of domain adaptation without target
labels. IEEE transactions on pattern analysis and machine intelligence
43(3), 766–785 (2019). https://doi.org/10.1109/TPAMI.2019.2945942

[17] Zhang, Y., Deng, B., Tang, H., Zhang, L., Jia, K.: Unsupervised
multi-class domain adaptation: Theory, algorithms, and practice. IEEE
Transactions on Pattern Analysis and Machine Intelligence (2020)

[18] Shen, Z., Liu, Z., Li, J., Jiang, Y.-G., Chen, Y., Xue, X.: Dsod: Learn-
ing deeply supervised object detectors from scratch. In: Proceedings of
the IEEE International Conference on Computer Vision, pp. 1919–1927
(2017)

[19] Rochan, M., Wang, Y.: Weakly supervised localization of novel objects
using appearance transfer. In: Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, pp. 4315–4324 (2015)

[20] Tang, J., Deng, C., Huang, G.-B., Zhao, B.: Compressed-domain ship
detection on spaceborne optical image using deep neural network and
extreme learning machine. IEEE Transactions on Geoscience and Remote
Sensing 53(3), 1174–1185 (2015). https://doi.org/10.1109/TGRS.2014.
2335751

[21] Wu, H., Zhang, H., Zhang, J., Xu, F.: Fast aircraft detection in satellite
images based on convolutional neural networks. In: 2015 IEEE Interna-
tional Conference on Image Processing (ICIP), pp. 4210–4214 (2015).
https://doi.org/10.1109/ICIP.2015.7351599

[22] Michau, G., Fink, O.: Unsupervised transfer learning for anomaly
detection: Application to complementary operating condition transfer.
Knowledge-Based Systems 216, 106816 (2021).
https://doi.org/10.1016/j.knosys.2021.106816

[23] Andersson, L., Lupu, M., Hanbury, A.: Domain adaptation of general
natural language processing tools for a patent claim visualization system.
In: Information Retrieval Facility Conference, pp. 70–82 (2013). Springer

[24] Tan, C., Sun, F., Kong, T., Zhang, W., Yang, C., Liu, C.: A survey on
deep transfer learning. In: International Conference on Artificial Neural
Networks, pp. 270–279 (2018). Springer

[25] Huang, J., Dong, Q., Gong, S., Zhu, X.: Unsupervised deep learning
by neighbourhood discovery. In: International Conference on Machine
Learning, pp. 2849–2858 (2019). PMLR

[26] Zhang, Y., Davison, B.D.: Modified distribution alignment for
domain adaptation with pre-trained inception resnet. arXiv preprint
arXiv:1904.02322 (2019)

[27] Hosang, J., Benenson, R., Dollar, P., Schiele, B.: What makes for effective
detection proposals? IEEE transactions on pattern analysis and machine
intelligence 38(4), 814–830 (2015)

[28] Han, W., Feng, R., Wang, L., Cheng, Y.: A semi-supervised generative
framework with deep learning features for high-resolution remote sensing
image scene classification. ISPRS Journal of Photogrammetry and Remote
Sensing 145, 23–43 (2018)

[29] Wei, H., Zhang, Y., Wang, B., Yang, Y., Li, H., Wang, H.: X-linenet:
Detecting aircraft in remote sensing images by a pair of intersecting line
segments. IEEE Transactions on Geoscience and Remote Sensing 59(2),
1645–1659 (2020)

[30] Teng, W., Wang, N., Shi, H., Liu, Y., Wang, J.: Classifier-constrained
deep adversarial domain adaptation for cross-domain semisupervised clas-
sification in remote sensing images. IEEE Geoscience and Remote Sensing
Letters 17(5), 789–793 (2019)

[31] Ding, P., Zhang, Y., Deng, W.-J., Jia, P., Kuijper, A.: A light and
faster regional convolutional neural network for object detection in
optical remote sensing images. ISPRS Journal of Photogrammetry and Remote
Sensing 141, 208–218 (2018)

[32] J.Chen, Y.L. J.Sun, C.Hou: Object detection in remote sensing images
based on deep transfer learning. Multimed Tools Appl 81, 12093–12109
(2021). https://doi.org/10.1007/s11042-021-10833-z

[33] Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F.,
Vaughan, J.W.: A theory of learning from different domains. Machine
learning 79(1), 151–175 (2010)

[34] Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning
with deep convolutional generative adversarial networks. arXiv preprint
arXiv:1511.06434 (2015)

[35] Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by
backpropagation. In: International Conference on Machine Learning,
pp. 1180–1189 (2015). PMLR

[36] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D.,
Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. In:
European Conference on Computer Vision, pp. 740–755 (2014). Springer

[37] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G.,
Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imper-
ative style, high-performance deep learning library. Advances in neural
information processing systems 32 (2019)

[38] PyPI: Detecto (2020). http://pypi.org/project/detecto/1.1.2, accessed on
30-01-2022

[39] Xia, G.-S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M.,
Pelillo, M., Zhang, L.: Dota: A large-scale dataset for object detection
in aerial images. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR) (2018)
