Neurocomputing 400 (2020) 333–342


Sparse semantic map building and relocalization for UGV using 3D point clouds in outdoor environments
Fei Yan, Jiawei Wang, Guojian He, Huan Chang, Yan Zhuang∗
School of Control Science and Engineering, Dalian University of Technology, Dalian 116024, China

Article history:
Received 19 December 2019
Revised 6 February 2020
Accepted 24 February 2020
Available online 27 March 2020

Communicated by Bo Shen

Keywords: Relocalization; Semantic map; 3D point clouds; Outdoor environment; Unmanned Ground Vehicles

Abstract: In this paper, we propose a sparse semantic map building method and an outdoor relocalization strategy based on this map. Most existing semantic mapping approaches focus on improving the semantic understanding of single frames and retain a large amount of environmental data. Instead, we do not aim to locate the UGV precisely; rather, we use imprecise environmental information to determine the general position of the UGV in a large-scale environment, much as human beings do. For this purpose, we divide the environment into environment nodes according to the result of scene understanding. The semantic map of the outdoor environment is obtained by generating topological relations between the environment nodes. In the semantic map, only the information of the nodes is saved, so that the storage space stays very small as the size of the environment increases. When the UGV receives a new local semantic map, we evaluate the similarity between the local map and the global map to determine the possible position of the UGV according to the categories of the left and right nodes and the distances between the current position and the nodes. In order to validate the proposed approach, experiments have been conducted in a large-scale outdoor environment with a real UGV. Using the semantic map, the UGV can relocalize itself from different starting points.

© 2020 Elsevier B.V. All rights reserved.

✩ This work was supported by the National Natural Science Foundation of China (U1913201) and the Science and Technology Foundation of Liaoning Province of China (20180520031).
∗ Corresponding author. E-mail address: zhuang@dlut.edu.cn (Y. Zhuang).
https://doi.org/10.1016/j.neucom.2020.02.103

1. Introduction

The ability to acquire accurate position information in real time is the premise and basis for UGVs to perform various tasks. Relocalization consists in locating the UGV on an existing map after it has lost its position because of feature matching failure or being kidnapped. It is commonly used for navigation, simultaneous localization and mapping (SLAM) and automatic driving [1,2].

Visual sensors and laser sensors are the most important sensors for UGVs to perceive environments. At present, visual relocalization plays a critical role in the field of relocalization. Given a pre-built map or a series of images with exact positions, visual relocalization aims to determine the 6D pose of the current camera view. Traditional visual relocalization methods detect the database image most similar to the current image by using image matching methods [3,4]. To describe a part of the map, several feature descriptors are available, such as SIFT, SURF and ORB. All of these descriptors have been used in visual relocalization. However, each place may contain hundreds of local features, and matching place features directly can be time consuming. The BoW technique quantizes local features into a vocabulary and matches places using numerical vectors, which is adopted to increase the efficiency of relocalization in large environments [5]. Although some methods that improve image quality can improve relocalization results [6], the fact remains that visual sensors are vulnerable to poor illumination and dynamic changes. Therefore, visual relocalization methods based on deep learning have been proposed, such as the Deep Belief Network [7] and the Convolutional Neural Network [8], which estimate camera poses by training an end-to-end deep model on images. Although deep models can cope with environmental change, these methods need a long time to learn the different appearances of a place or to train robust features. Furthermore, visual sensors are limited by environmental factors such as brightness and light intensity.

Compared to visual sensors, laser sensors are not affected by such environmental factors and can provide precise depth information, so laser sensors are also considered main sensors for relocalization. Traditional methods usually rely on geometry matching to achieve relocalization [9,10], and some filtering methods are also used for localization, such as unscented Kalman filtering, particle swarm optimization, set-membership filtering and H∞ filtering [11–15]. These methods largely depend on environment maps with dense points or geometrical features.

To build these maps, a dense exploration needs to be made in the environment, which is usually computationally inefficient and requires a lot of storage space. Therefore, relocalization based on geometry matching is infeasible in some cases such as long-term navigation. On the other hand, deep model methods are also used in laser relocalization [16,17], but the problem that training a model needs a long time still exists.

In this paper, a sparse semantic map building method and a laser relocalization strategy are proposed. First, in order to obtain robust features, the 3D point clouds are compressed into 2D ODVL images, which are divided into superpixels, and 23-dimensional texture features are extracted for each superpixel. Then, the Gentle-AdaBoost algorithm is used to accomplish feature screening on these texture features and obtain the initial classification results of the 3D point clouds. Moreover, reclassification is carried out to improve the classification accuracy by using the Centroid method and considering the height distribution of the point clouds. According to the results of scene understanding, the outdoor environment is divided into scene nodes and road nodes, and the semantic map of the outdoor environment is obtained by generating topological relations between the scene nodes and the road nodes. Finally, the map is simplified so that the positioning accuracy can be improved. Different from general maps, a map format that stores only the distance information between objects is proposed for the semantic map. According to the scenes observed on the left and right sides at the current position, combined with the environmental nodes of the semantic map, the current position of the UGV can be estimated. Combining this with the triangular relationship, a method for obtaining the possible positions of the UGV by calculating the intersection points of two circles is proposed. The last position information is then maintained and used as the prior condition to calculate the next possible positions of the UGV. In addition, each point carries a score indicating the possibility that the UGV is at that position, and the point with the highest score is the most likely position of the UGV. In order to validate the proposed approach, experiments have been conducted in a large-scale outdoor environment with a real UGV. Using the semantic map, the UGV can relocalize itself from different starting points.

This paper is organized as follows. Section II reviews related work on semantic mapping and laser relocalization. Implementation details of scene understanding and semantic map building are provided in Section III, while our relocalization method is described in Section IV. The experiments are conducted and the results analyzed in Section V, and conclusions are drawn in the last section.

2. Related works

2.1. Semantic map building

To enable UGVs to recognize an environment and act accordingly in the real world, UGVs need to infer not only the geometry but also the semantic information of the surrounding environment. Semantic mapping is a map building process which fuses a Simultaneous Localization and Mapping (SLAM) algorithm with a scene understanding algorithm [18].

Visual sensors are usually used for semantic mapping. McCormac et al. [19] combined Convolutional Neural Networks (CNNs) and a state-of-the-art dense Simultaneous Localization and Mapping (SLAM) system to build indoor semantic maps based on an RGB-D camera. Zhao et al. [20] proposed a deep learning method which performs 3D reconstruction while simultaneously recognizing different types of materials and labeling them at the pixel level. A novel deep learning approach for semantic segmentation of RGB-D images with multi-view context was presented in [21]. These methods are all aimed at narrow indoor environments; although they can provide abundant environment information, they are not suitable for UGV localization and navigation in a large-scale environment. In order to make semantic maps more practical, semantic maps of the environment were built manually in [22] and [23] and used as prior maps to localize robots. Furthermore, in [24], a 3D online semantic mapping system is presented which utilizes a CNN model to transfer pixel label distributions of 2D images to 3D grid space, and the authors proposed a Conditional Random Field (CRF) model with higher-order cliques to enforce semantic consistency among grids.

Considering that many papers convert 2D images into 3D information, some researchers directly use 3D LiDAR data or fused information to build semantic maps. Sun et al. [25] represented the 3D map as an OctoMap and modeled each cell as a recurrent neural network to obtain a Recurrent-OctoMap; in this case, the semantic mapping process can be formulated as a sequence-to-sequence encoding–decoding problem. In [26], a real-time end-to-end semantic segmentation method is proposed for road objects based on spherical images, which are transformed from the 3D LiDAR point clouds and taken as input of convolutional neural networks to predict the point-wise semantic mask. In comparison, Liu et al. [27] transformed 3D laser point clouds into 2D images and obtained the semantic information based on the images. Fusion methods based on the data fusion of a 3D LiDAR and a camera can obtain the semantic segmentation in the camera frame, which reduces the complexity of semantic segmentation based on laser data, and then use the LiDAR points to localize the segmentation in 3D [28,29]. However, in [28], the semantic maps retain a large number of voxels, which makes storage space and computing efficiency face severe challenges in growing environments. Ma et al. [29] formulated the problem in a Bayesian filtering framework and exploited lanes, traffic signs, as well as vehicle dynamics to localize robustly with respect to a sparse semantic map. Although this map avoids the problems of storage and computing efficiency, its application environment has great limitations.

2.2. Laser relocalization

Visual relocalization algorithms struggle to cope with the strong appearance changes that commonly occur during long-term applications in outdoor environments, and fail under certain ill-lighted conditions. In contrast, LiDAR sensors are largely unaffected by appearance change and are used for relocalization. Some works match the current laser data, described with various features, against a prior map or a pre-calculated database of features. Chong et al. [30] projected 3D point clouds onto a 2D plane and extracted environment features from the plane to localize a UGV in a 3D urban environment, but the basic assumption was that many surfaces in the urban environment are rectilinear in the vertical direction. Collier et al. [31] extracted highly descriptive features, called Variable Dimensional Local Shape Descriptors, from LiDAR range data to encode environment features; these features were used to learn a generative model of place appearance and determine whether an observation comes from a new or a previously seen place. In [32], where map matching is not feasible in environments with two parallel straight walls, the data from the laser range sensor and the odometry are integrated to localize the robot based on moving horizon estimation [33]. Wolcott and Eustice [34] proposed a Gaussian mixture map, a 2D grid structure where each grid cell maintains a Gaussian mixture model characterizing the distribution over z-height in that cell. Cao et al. [35] converted 3D laser points to 2D bearing angle images and extracted ORB features, and a visual bag-of-words approach was then used to improve matching efficiency. In [36], the 3D LiDAR data was projected into a 2D scan context, which was used to calculate the similarity score of two LiDAR scans.

Instead of relying on geometry-based matching, learning-based approaches have shown impressive performance in the field of laser relocalization. Yin et al. [37] proposed a novel synchronous adversarial feature learning, a kind of unsupervised learning method for the LiDAR-based loop closure detection task. Furthermore, Yin [38] accomplished the place recognition task based on an end-to-end feature learning framework with LiDAR inputs. In [39], semi-handcrafted range histogram features are used as input to a 2D convolutional neural network, while Uy et al. [40] proposed PointNetVLAD, which leverages the recent success of deep networks to solve point-cloud-based retrieval for place recognition. Kim et al. [41] recently presented the idea of transforming point clouds into scan context images [36] and feeding them into a CNN to solve the place recognition problem. In [42], a Convolutional Neural Network is trained to extract compact descriptors from single 3D LiDAR scans.

In summary, traditional laser relocalization approaches need to save large amounts of LiDAR data or features; their complexity tends to grow with the environment scale, and matching the current scan against potential target scans is computationally expensive. Meanwhile, learning-based approaches need a long time to learn the different appearances of an environment or to train robust CNN features. Instead, we present a laser relocalization method based on a sparse semantic map without learning.

3. Scene understanding and semantic map building

The Bearing Angle (BA) image and the range image are the general models which compress 3D point cloud data into 2D images. In terms of describing the details and texture of the images and classifying the objects, the range image does not perform well, and the BA image model performs poorly in distinguishing objects at different depths. The novel optimal depth & vector length (ODVL) image model compensates for the shortcomings of the two methods by calculating the optimal depth value and the length of the BA vector corresponding to the point clouds and mapping them to gray images with weights. It can be used both for continuous scanning data with large range and for fixed-point scanning data; there is no special requirement on the scanning mode or scanning position, and it has good adaptability.

3.1. ODVL image model

The original scenes of the point clouds consist of multiple continuous scans of the laser data. The number of laser data obtained in one scan is 381; the 361 laser points in the middle are considered the effective laser data, and 250 scans of the laser data are considered a scene. The points are projected onto the horizontal plane and the eigenvectors are calculated according to the distribution of the point clouds. The direction of the largest eigenvector represents the main direction in which the laser points are distributed on the plane; in the main direction, the variance of the data is the largest. The main direction is perpendicular to the optimal depth direction. Take the point at the lower left corner of the current plane as the coordinate origin and draw a straight line parallel to the main direction through the origin. The distance from each point on the plane to this straight line is the optimal depth value of that point. Sort all the obtained optimal depth values from small to large and divide them evenly into 256 sets, which correspond to the 256 gray values (0–255); the gray value GD_{i,j} with respect to the optimal depth value can then be obtained according to the optimal depth value corresponding to each point. The obtained BA vectors of each point are sorted from small to large and divided evenly into 256 sets in the same way, and the gray values GB_{i,j} corresponding to each point are obtained. The gray value GO_{i,j} of the ODVL image model, according to GD_{i,j} and GB_{i,j}, is obtained as follows:

GO_{i,j} = \frac{\lambda_1 GB_{i,j} + \lambda_2 GD_{i,j}}{\lambda_1 + \lambda_2}    (1)

where λ1 and λ2 are weights. The variance of the gray value matrix is

\delta^2 = \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} (GO_{i,j} - \overline{GO})^2}{nm}    (2)

An optimization method is used to obtain the λ1 and λ2 which generate the maximum variance.
mode and scanning position and it has good adaptability. where S is the side length of the square and N is the total number
of pixels and C is the number of pixel blocks to be separated. On
3.1. ODVL image model the current plane, the cluster center points are selected evenly at
intervals of S pixels to produce the same size of pixel blocks. In
The origin scenes of the point clouds consist of multiple con- a 2S × 2S neighborhood, since the interval of each cluster center
tinuous scans of the laser data. The number of laser data obtained is S, a pixel point will be searched by many cluster center points.
in one scan is 381. 361 laser points in the middle are considered In the search process, the distance from one pixel to one cluster
as the effective laser data. 250 scans of the laser data are consid- center point which is defined as D’:
ered as a scene. The points are projected onto the horizontal plane ⎧

and the eigenvectors are calculated according to the distribution of ⎪


⎪D = ( dc ) + ( ds ) k2
2 2
⎨  2
the point clouds. The direction of the largest eigenvector represents dc = pi − p j (4)
the main direction in which the laser points are distributed on the ⎪
plane. In the main direction, the variance of the data is the largest. ⎩d = x − x 2 + y − y 2

s i j i j
The main direction is perpendicular to the optimal depth direc-
tion. Take the point at the lower left corner of the current plane as where p is the grey value, (x, y) is the position of a pixel, dc is
the coordinate origin point and make a straight line parallel to the the grey distance and ds is the pixel distance between two pixels.
main direction through the origin point. The distance from each k is used as a weight to control the influence of gray distance and
point on the plan to the straight line is the optimal depth value of pixel distance on the clustering distance. A pixel point has the dif-
this point. Sort all the obtained optimal depth values from small ferent distances corresponding to different center points, and the
to large and divide them into 256 sets evenly which correspond to center point corresponding to the smallest distance is selected as
256 Gy values (0–255). The gray value GDi,j with respect to the op- the cluster center of this pixel point. After calculating all the pixels,
timal depth value can be obtained according to the optimal depth the new cluster center points are calculated according to average
value corresponding to each point. The obtained BA vectors of each of pixel positions with the same category. Next, the error between
point are sorted from small to large and they are divided into 256 the new center positions and the last center positions is obtained.
sets evenly like above. The gray values GBi,j according to the BA If the error is less than the threshold, the algorithm is finished.
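As a minimal illustration of the modified clustering distance of Eq. (4), the following C++ sketch computes D′ for one pixel and one cluster center; the Pixel structure and function name are our own assumptions, not the authors' code.

```cpp
#include <cmath>

// Illustrative sketch of the clustering distance of Eq. (4). The gray value
// of an ODVL pixel already encodes the optimal depth, so d_c carries depth
// information into the distance alongside the pixel distance d_s.
struct Pixel {
    double gray;   // ODVL gray value
    double x, y;   // image coordinates
};

double slicDistance(const Pixel& p, const Pixel& center, double k) {
    const double dc = std::fabs(p.gray - center.gray);              // gray distance
    const double ds = std::hypot(p.x - center.x, p.y - center.y);   // pixel distance
    return std::sqrt(dc * dc + k * k * ds * ds);                    // D' of Eq. (4)
}
```

Each pixel is then assigned to the nearest of the cluster centers within its 2S × 2S neighborhood, exactly as described above.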

3.3. Point clouds classification based on ODVL image

We classify the superpixel blocks segmented from the ODVL images. The classification method is the Gentle-AdaBoost classification algorithm, which is based on the AdaBoost framework in the open source library and combines the Classification and Regression Trees (CART) method.

The algorithm extracts feature vectors, made of the texture features and spatial geometric features of the pixel blocks, from each pixel block for training. The dimension of the feature vector is 23. The first twenty entries come from four gray-level co-occurrence matrices of the same pixel block; each matrix is generated with a different direction parameter, and each gray-level co-occurrence matrix yields five parameters [43]. The first parameter is

Asm = \sum_i \sum_j P(i, j)^2    (5)

where Asm is defined as the energy of the pixel block. It reflects the uniformity of the gray distribution and the texture thickness. The second parameter is

Ent = -\sum_i \sum_j P(i, j) \log P(i, j)    (6)

where Ent is defined as the entropy of the pixel block. It represents the degree of non-uniformity and complexity of the texture. The third parameter is

Con = \sum_i \sum_j P(i, j)(i - j)^2    (7)

where Con is defined as the contrast of the pixel block. It measures the sharpness of the image and the depth of the texture. The fourth parameter is

Idm = \sum_i \sum_j \frac{P(i, j)}{1 + (i - j)^2}    (8)

where Idm is defined as the inverse difference moment of the pixel block. It reflects the local variation of the image texture. The last parameter is

Corr = \frac{\sum_i \sum_j (i \times j) P(i, j) - \mu_x \mu_y}{\delta_x \delta_y}    (9)

where Corr is defined as the correlation of the pixel block. It measures the correlation of the gray values of the image along the row and column directions.
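For concreteness, the following C++ sketch computes the five parameters of Eqs. (5)–(9) from one normalized gray-level co-occurrence matrix; repeated over the four direction parameters, it yields the first twenty entries of the feature vector. The structure and function names are illustrative, not the authors' code.

```cpp
#include <cmath>
#include <vector>

// Illustrative sketch (following the definitions of Eqs. (5)-(9), cf. [43]):
// the five texture parameters of one normalized gray-level co-occurrence
// matrix P, whose entries sum to 1.
struct GlcmFeatures { double asm_, ent, con, idm, corr; };

GlcmFeatures glcmFeatures(const std::vector<std::vector<double>>& P) {
    const int L = static_cast<int>(P.size());
    GlcmFeatures f{0.0, 0.0, 0.0, 0.0, 0.0};
    double mx = 0.0, my = 0.0;
    for (int i = 0; i < L; ++i)
        for (int j = 0; j < L; ++j) {
            const double p = P[i][j];
            f.asm_ += p * p;                            // energy, Eq. (5)
            if (p > 0.0) f.ent -= p * std::log(p);      // entropy, Eq. (6)
            f.con += p * (i - j) * (i - j);             // contrast, Eq. (7)
            f.idm += p / (1.0 + (i - j) * (i - j));     // inv. diff. moment, Eq. (8)
            mx += i * p;                                // marginal means
            my += j * p;
        }
    double vx = 0.0, vy = 0.0, cross = 0.0;
    for (int i = 0; i < L; ++i)
        for (int j = 0; j < L; ++j) {
            const double p = P[i][j];
            vx += (i - mx) * (i - mx) * p;              // marginal variances
            vy += (j - my) * (j - my) * p;
            cross += i * j * p;
        }
    f.corr = (cross - mx * my) / std::sqrt(vx * vy);    // correlation, Eq. (9)
    return f;
}
```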
The twenty-first entry is the height feature of the pixel block:

H_p = \frac{1}{n} \sum_i z_i    (10)

The twenty-second entry is the spatial shape of the pixel block. For the spatial shape, we calculate the covariance matrix of the pixel block and obtain three eigenvalues λ0, λ1, λ2 (λ0 ≥ λ1 ≥ λ2) and three eigenvectors e1, e2, e3. The distribution of the point clouds is divided into three types according to the relationship between the three eigenvalues: spherical, linear and planar distributions, which are represented respectively by the values of λ0, λ0 − λ1 and λ1 − λ2. The twenty-third entry is the directional characteristic of the local point clouds corresponding to the pixel block. It is defined as the eigenvector corresponding to the smallest eigenvalue of the local point clouds, which is also the normal vector of the point clouds.
This paper uses the Gentle-AdaBoost classifier to classify each point cloud with the feature vectors. After the classification process, the "lower part of the building", "vehicle" and "trunk", which are defined as "objects to be optimized", are similar in height and have similar texture features in the ODVL image. In order to improve the classification accuracy, the Centroid method is used to cluster the point clouds corresponding to the "objects to be optimized". In each clustering step, the method merges the two closest objects, until the distances between the centers of all objects are larger than a threshold; this paper selects 2.5 m as the threshold. The Gentle-AdaBoost classifier is then used to classify the point clouds with the "object to be optimized" tag, and the features to be trained, obtained from the "objects to be optimized", are 4-dimensional vectors.

Fig. 1. The three-dimensional border of the object.

In previous work [44], for the "object to be optimized", a three-dimensional border is added according to the point cloud. The longest diagonal lb of the border and the height h of the border can be obtained (as shown in Fig. 1). The other two parameters are obtained from the distribution in the z direction. The point clouds of each "object to be optimized" are equally divided into three regions from top to bottom according to the height in the z direction. The point clouds of a vehicle are concentrated in the middle layer and the point clouds of a trunk are evenly distributed among the three layers, while the point clouds in the lower part of a building are scattered randomly. For each "object to be optimized", the numbers of points in the three layers from top to bottom are recorded as n1, n2 and n3, and d1 = n1/n2 and d2 = n2/n3 are defined as the distribution characteristics of the point clouds in the height direction. In summary, the classifier uses the feature vectors composed of lb, h, d1 and d2 to classify the point clouds with the "object to be optimized" tag.
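A minimal sketch of how this 4-dimensional reclassification feature could be assembled is given below, assuming lb and h have already been taken from the three-dimensional border of Fig. 1; the types and guards are our own, not the authors' code.

```cpp
#include <algorithm>
#include <vector>

// Illustrative sketch of the 4-D reclassification feature (lb, h, d1, d2):
// the cloud of an "object to be optimized" is cut into three equal-height
// layers and the count ratios d1 = n1/n2, d2 = n2/n3 are formed.
struct Point3 { double x, y, z; };

std::vector<double> heightLayerFeatures(const std::vector<Point3>& cloud,
                                        double lb, double h) {
    double zMin = cloud.front().z, zMax = cloud.front().z;
    for (const auto& p : cloud) {
        zMin = std::min(zMin, p.z);
        zMax = std::max(zMax, p.z);
    }
    const double layer = (zMax - zMin) / 3.0;
    double n1 = 0.0, n2 = 0.0, n3 = 0.0;        // top, middle, bottom counts
    for (const auto& p : cloud) {
        if (p.z > zMax - layer)      n1 += 1.0;
        else if (p.z > zMin + layer) n2 += 1.0;
        else                         n3 += 1.0;
    }
    const double d1 = n1 / std::max(n2, 1.0);   // guard against empty layers
    const double d2 = n2 / std::max(n3, 1.0);
    return {lb, h, d1, d2};                     // feature vector for the classifier
}
```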
After the reclassification, the confidence of the classification results for buildings and ground is very high. If point clouds of other types appear in the classification results of buildings and ground, the incorrectly classified point clouds are corrected by clustering and calculating the covariance matrices.

3.4. Semantic map building

The semantic map is defined as the map containing the spatial characteristics of the environment and the distribution characteristics of the scenes with known categories. In this paper, the semantic map consists of information about nodes, such as trees, vehicles, roads, buildings and ground.

The environment nodes of trees, buildings and the ground can be added to the semantic map and are represented by 'T', 'B' and 'G' respectively. If there are no buildings or trees on a section of the ground, it is divided as ground nodes. For the "buildings and trees" nodes, if there are continuous trees among the building nodes, they are defined as "building-tree" nodes. Ground, tree, building and building-tree are considered as the environmental nodes. According to the positions of the environment nodes relative to the UGV, the nodes on the right of the UGV become the right environment nodes, and the nodes on the left are called the left environment nodes. New road nodes are added to the semantic map in two cases: the first is when one of the left or right environment nodes changes; the second is when the road turns or the UGV has moved long enough along a straight road.

According to the left and right environmental information, GPS information and IMU information of the UGV, the algorithm generates environment nodes and road nodes. In the semantic map, a road node is connected to environment nodes on the left and right sides, and an environment node can be connected to multiple road nodes.

Fig. 2. Comparison between the original map (left) and the simplified map (right).

The point clouds of the previous scene are always stored in the memory of the UGV to ensure the continuity of the semantic map. When new data is read in, the existing semantic map can be updated and maintained.

In the actual map building experiment, since the same object is viewed from different angles, the error of the pose estimation causes the same object to be represented by multiple densely distributed nodes, which is not conducive to robot positioning. Simplifying the semantic map facilitates positioning and navigation while also reducing the memory footprint.

We scan all the environment nodes to get their spatial locations and determine the position relationship between an environment node and other environment nodes of the same type. If the distance between two such nodes is less than 20 m, the two environment nodes are replaced by their midpoint and the road nodes connected to the previous two environment nodes are connected to the midpoint. The simplified map is shown in Fig. 2. In the red area, there are many nodes which represent the same object; after simplification, the excess environment nodes are removed, as in the yellow area. The blue line represents the route the UGV used to build the map, and the blue points and numbers represent the segmented road nodes.
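A minimal sketch of this merging rule, under our own node representation (the re-linking of the affected road nodes to the midpoint is omitted), could look as follows:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative sketch of the 20 m merging rule for same-type environment
// nodes; not the authors' implementation.
struct EnvNode { char category; double x, y; bool alive = true; };

void simplifyMap(std::vector<EnvNode>& nodes, double minDist = 20.0) {
    for (std::size_t i = 0; i < nodes.size(); ++i)
        for (std::size_t j = i + 1; j < nodes.size(); ++j) {
            if (!nodes[i].alive || !nodes[j].alive) continue;
            if (nodes[i].category != nodes[j].category) continue;  // same type only
            if (std::hypot(nodes[i].x - nodes[j].x,
                           nodes[i].y - nodes[j].y) < minDist) {
                nodes[i].x = 0.5 * (nodes[i].x + nodes[j].x);      // midpoint
                nodes[i].y = 0.5 * (nodes[i].y + nodes[j].y);
                nodes[j].alive = false;  // road nodes of j are re-linked to i
            }
        }
}
```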
4. Relocalization in the semantic map

This section introduces an approach to store the semantic map and how to relocalize the robot in the map.

4.1. Method of saving the semantic map

When the global semantic map is built, the information of the map is saved in a map file, which can be recalled as a prior map to relocalize the robot. In the map file, each row stores the information of one environment node. For each environment node, the first message is a character which represents the category of the environmental node. The second message is the position of this environmental node in the world coordinate system; in this paper, the semantic map is built on the ground and the z component is zero. The other messages are the distances from the other environmental nodes to the current one. With this, the information for the environment node is saved, for example: Signpost, Sign_x, Sign_y: Dis_1, Dis_2, Dis_3 … Dis_n. It should be noted that the data on the right of the colon from all the nodes form a square symmetric matrix with a main diagonal of zero, in which I_{i,j} = I_{j,i}, i, j ∈ N, where N is the number of rows of this matrix and also the number of all environmental nodes. I_{i,j} represents the distance between the ith node and the jth node, and I_{i,i} is equal to zero.

This method requires much smaller storage than other common maps; the advantage is especially obvious when building maps of large-scale environments.
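As an illustration of this file layout, the following C++ sketch writes one row per node in the format described above; the structure and function names are our own assumptions, not the authors' code.

```cpp
#include <cmath>
#include <fstream>
#include <string>
#include <vector>

// Illustrative sketch of the row format: category and position before the
// colon, the node's row of the symmetric distance matrix I after it.
struct EnvNode { char category; double x, y; };

void saveSemanticMap(const std::vector<EnvNode>& nodes, const std::string& path) {
    std::ofstream out(path);                      // e.g. "map.txt"
    for (const auto& a : nodes) {
        out << a.category << ", " << a.x << ", " << a.y << ":";
        for (const auto& b : nodes)               // I(i,j) = I(j,i), I(i,i) = 0
            out << " " << std::hypot(a.x - b.x, a.y - b.y);
        out << "\n";
    }
}
```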
4.2. Relocalizing in the semantic map

When the UGV relocalizes in the prior map, a local semantic map is built and the information about the left and right environmental nodes is obtained, which includes the categories of the left and right nodes and the distances between the current position and the nodes. dis_l and dis_r are defined as the distances between the current position of the UGV and the left and right nodes respectively.

According to the categories of the left and right nodes corresponding to the current position of the UGV, we find all nodes with the same categories as reference nodes in the map file. Two reference nodes corresponding to the left and right node categories form a combination. Assume that the left and right reference nodes in a combination are sig_l and sig_r; sig_l, sig_l_x, sig_l_y and sig_r, sig_r_x, sig_r_y are put into a container, and each combination is required to appear only once. (sig_l_x, sig_l_y) is the position of the left environmental node in the world coordinate system, (sig_r_x, sig_r_y) is the position of the right environmental node, and sig_l and sig_r are the categories of the nodes.

After all the combinations in the map file are obtained, the absolute distance of each combination is defined as dis_tem:

dis\_tem = \sqrt{(sig\_l\_x - sig\_r\_x)^2 + (sig\_l\_y - sig\_r\_y)^2}    (11)

dis_tem is compared with the sum of dis_r and dis_l. If dis_tem is larger than the sum of dis_r and dis_l, the matching combination is removed from the matching results according to the triangular relationship; otherwise, the matching combination is retained.

Fig. 3. The intersection points of two circles, which represent the possible positions of the UGV.

After all the effective matching combinations are obtained, two circles are drawn with dis_l and dis_r as the radii and the positions of the nodes in a combination as the centers (as shown in Fig. 3). Since dis_tem is not larger than the sum of dis_r and dis_l, there is at least one intersection point, which is a possible position of the UGV. In particular, if both environmental nodes have the same category, the orientation of the UGV cannot be determined and there will be up to four intersection points. After all the matching combinations have been calculated, all the possible positions are obtained.
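The intersection points can be computed in closed form. The following C++ sketch, with our own naming, returns the candidate positions for one matching combination and applies the triangle-relationship test first:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Illustrative sketch: candidate UGV positions for one matching combination
// as the intersections of the circles around the left and right reference
// nodes, with radii dis_l and dis_r.
struct Point2 { double x, y; };

std::vector<Point2> circleIntersections(Point2 left, double disL,
                                        Point2 right, double disR) {
    std::vector<Point2> out;
    const double d = std::hypot(right.x - left.x, right.y - left.y);  // dis_tem
    if (d == 0.0 || d > disL + disR || d < std::fabs(disL - disR))
        return out;                      // triangle relationship violated: reject
    const double a = (disL * disL - disR * disR + d * d) / (2.0 * d);
    const double h = std::sqrt(std::max(0.0, disL * disL - a * a));
    const double mx = left.x + a * (right.x - left.x) / d;   // foot of the chord
    const double my = left.y + a * (right.y - left.y) / d;
    out.push_back({mx + h * (right.y - left.y) / d,
                   my - h * (right.x - left.x) / d});
    if (h > 0.0)                         // two distinct intersection points
        out.push_back({mx - h * (right.y - left.y) / d,
                       my + h * (right.x - left.x) / d});
    return out;
}
```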
The coordinate information of the possible positions is stored in a vector and a score is given to each position. The score represents the probability of the UGV being at that position, and the initial score is 4. In the same way, based on the remaining matching combinations, further possible positions are obtained to complete the vector.

When updating the vector, if a newly obtained possible position already exists in the previous vector, or the distance between the new position and an old position is within an acceptable range, the mean coordinate of the old and new positions is taken as the new coordinate, which replaces the old one, and the score of the coordinate is increased by 3 on top of the original score. If the newly obtained possible position does not satisfy the above requirements, it is appended directly to the end of the vector and its score is set to 4. After loading all newly generated positions, all position scores are reduced by 3 to reduce the possibility of the UGV being at a wrong position, and all positions with a score less than 0 are eliminated. Ultimately, the positions with the highest score are considered the most likely positions of the UGV.
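A compact sketch of this scoring rule, under our own data structures (mergeRadius stands for the "acceptable range", whose value the paper does not specify), could look as follows:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Illustrative sketch of the candidate scoring rule: merge a new candidate
// with a nearby old one (+3), otherwise append it with score 4; afterwards
// every score decays by 3 and candidates below 0 are dropped.
struct Candidate { double x, y; int score; };

void updateCandidates(std::vector<Candidate>& hyp,
                      const std::vector<Candidate>& fresh,
                      double mergeRadius) {
    for (const auto& f : fresh) {
        bool merged = false;
        for (auto& o : hyp) {
            if (std::hypot(f.x - o.x, f.y - o.y) <= mergeRadius) {
                o.x = 0.5 * (o.x + f.x);        // mean coordinate replaces old one
                o.y = 0.5 * (o.y + f.y);
                o.score += 3;                   // reinforce a re-observed position
                merged = true;
                break;
            }
        }
        if (!merged) hyp.push_back({f.x, f.y, 4});  // brand-new hypothesis
    }
    for (auto& o : hyp) o.score -= 3;               // decay after each update
    hyp.erase(std::remove_if(hyp.begin(), hyp.end(),
                             [](const Candidate& c) { return c.score < 0; }),
              hyp.end());                           // drop scores below 0
}
```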

Fig. 4. All the previous possible positions are moved according to the motion information. The blue points are the previous possible points and the red points are the moved points. Δx is the distance the UGV moves in the abscissa direction and Δy is the distance the UGV moves in the ordinate direction.

Fig. 5. The diagram of the relocalization algorithm.

After the position estimation is completed, the motion information of the UGV is obtained from its own IMU system and the GPS. All the possible points are then moved from their previous positions to new current positions according to the motion information of the UGV, as shown in Fig. 4, and the next position estimation is carried out using the position information of the current possible points. The relocalization is successful once only one possible location remains. The diagram of the relocalization algorithm is shown in Fig. 5.
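The prediction step itself is a plain translation of the surviving candidates, as the following fragment (our own naming, with Candidate as in the scoring sketch above) illustrates:

```cpp
#include <vector>

// Illustrative sketch of the prediction step of Fig. 4: every surviving
// hypothesis is shifted by the IMU/GPS motion increment (dx, dy) before the
// next position estimation.
struct Candidate { double x, y; int score; };

void shiftCandidates(std::vector<Candidate>& hyp, double dx, double dy) {
    for (auto& c : hyp) { c.x += dx; c.y += dy; }
}
```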

5. Experiment

5.1. Experimental condition

In this experiment, the data used for the semantic map building experiment and the relocalization experiment came from the DLUT database. Fig. 6 shows our UGV with onboard sensors.

Fig. 6. The UGV used for the experiment.

The hardware system of the UGV is composed of laser sensors (two LMS511), an ADU5630 attitude and orientation integrated navigation system and an industrial computer. The LMS511 is a two-dimensional laser sensor that acquires a single column of laser data per scan. With the movement of the UGV, the two-dimensional laser data are accumulated in the direction of travel, thereby obtaining three-dimensional point cloud data. The ADU5630 attitude and orientation integrated navigation system provides position, heading angle and attitude angle accuracy within the allowable range. Its frequency is 100 Hz, which matches the highest scanning frequency of the LMS511 laser sensor. The UGV's computer runs ROS (Robot Operating System) under Linux, and the software system is written in C++. The open source OpenGL and OpenCV program interfaces are used. OpenGL is a widely used cross-platform 3D graphics interface; this paper mainly uses the functions in the OpenGL library to visualize the experimental results, including the display of laser points in the three-dimensional coordinate system, the understood 3D point cloud scenes according to the classification labels, and the semantic map nodes. OpenCV is a widely used cross-platform computer vision library with many functions for processing images.

The program runs under ROS and is divided into three nodes according to function. The first node is the main node, which understands the objects in the scene and passes the results to the map node through the ROS service communication method. The map node generates a semantic map of the identified objects and stores the map in a file called map.txt. The third node, called location, is independent of the first two nodes; it simplifies the map.txt generated by the map node and then reads the data used for relocalization. Fig. 7 shows the program structure.

Fig. 7. "main", "location" and "map" represent the nodes of the program; "map.txt" is the map file.

5.2. Semantic map building

A scene in the point cloud data is selected for analysis, as shown in Fig. 8. This scene is transformed into a range image, a Bearing Angle image and an ODVL image, as shown in Fig. 9. Compared with the traditional range image and the BA image, the ODVL image can be used for a wide range of data, adapts to a variety of scanning methods and improves the distinguishability and detail quality of the boundaries in the image. It can also provide effective support for further research on subsequent image segmentation and recognition. The ODVL image is completely independent of the scene viewpoint and does not use angle calculation, which is also the essential difference between the ODVL image and the BA image. The results of segmentation corresponding to each image are shown in Fig. 10. In the range image, the boundary distinction of objects at similar distances is not obvious, so the range image loses details and the segmented texture information is also lost. For example, the car is not distinguished from the ground, and the left part of the building is not distinguished from the tree tops. Furthermore, some windows of the building are lost in the range image.

Table 1
Error rate of SLIC-based segmentation.

            Range Image   BA Image   ODVL Image
Error Rate  12.41%        9.85%      7.13%

Fig. 8. Original laser point clouds of the outdoor scene.

Fig. 9. Range image (left), BA image (middle), ODVL image (right).

Fig. 10. SLIC-based segmentation in the range image (left), BA image (middle) and ODVL image (right).

Because the BA image cannot distinguish between scenes at different depths, many pixels belonging to different scenes which are close to each other in two-dimensional image space but far apart in three-dimensional space are wrongly grouped into one pixel block, such as the trunk and the building in the middle of Fig. 10. The ODVL image has relatively good details and texture features, which are beneficial to improving the accuracy of classification. In the experiment, 60 scenes were randomly selected from the database and converted into the range image, BA image and ODVL image. Each image is divided into 500 pixel blocks. The segmentation error rate is defined as

E_r = \frac{N_e}{N}    (12)

where N_e is the total number of pixel points which are divided incorrectly in the pixel blocks and N is the total number of pixel points. The E_r of superpixel segmentation is shown in Table 1.

To test the Gentle-AdaBoost classifier, 70% of the data are used for training and 30% for testing. The result is shown in Table 2. Two understanding results from the test scenes are selected randomly, as shown in Fig. 11.

Fig. 11. (a) The result of the initial classification of the Gentle-AdaBoost classifier; (b) the reclassification result. The buildings are brown, the trees are green, the trunks of trees are blue, the cars are yellow and the ground is gray.

The Gentle-AdaBoost classifier is used to classify the test samples. It is obvious that the point cloud classification based on the ODVL image has a better classification effect on the "top half of the building", "vegetation" and "ground", but the classification effect on "vehicle", "trunk" and "bottom of the building" is not ideal. The results of the initial classification are therefore reclassified and semantically corrected. The results of the classification for all the test samples are shown in Table 3. Through the classification based on the ODVL image, semantic reclassification and point cloud correction, the final classification results achieve better accuracy and recall rates, with a total accuracy rate of 0.868.

The continuous scene data in Fig. 11 is derived from the database, which was collected in a campus environment. The data format and the acquisition method of this data set are similar to those of this paper; it is used to test the generalization ability of the algorithm. The scene contains about 2,346,500 laser points. The length of the trajectory of the UGV in this database is 1155 m, and the laser points cover an area of 50,419 m². It takes about 300 MB to store all the laser points. There are 13 scene data on the left and right sides. Fig. 12(a) shows the point cloud data after coloring the classified objects and Fig. 12(b) shows the schematic diagram of the semantic description of the scene. The semantic map of the experimental area is established as shown in Fig. 13. This experiment is from a long route with 175 road nodes, which are considered as 175 steps. The size of the map file is 9.82 KB for this experiment, which is much smaller than the storage space of the laser points.

5.3. Relocalization

The UGV starts from different positions in the semantic map. After several steps, when there is only one possible position, the relocalization is completed. Among the 175 road nodes, we choose one node as the starting point every 5 nodes. The relocalization results are shown in Fig. 14. The abscissa represents the start position of the UGV, and the ordinate represents the number of steps for the UGV to relocalize itself successfully after starting.

Table 2
Training set and test set information.

          Number of scenes  Vegetation  Building  Ground  Trunk  Vehicle  Others
Training  70                26.31%      21.70%    34.65%  1.02%  5.13%    11.19%
Testing   30                22.58%      25.55%    35.43%  1.30%  4.42%    10.71%

Table 3
Scene understanding result for the testing samples.

               Vegetation  Building  Ground   Trunk   Vehicle  Others   Recall rate
Vegetation     526,379     29,375    10,092   2528    2067     38,107   0.865
Building       10,963      638,667   17,371   8386    3372     9903     0.921
Ground         4022        5297      939,990  1934    214      3376     0.982
Trunk          9558        5728      4102     14,985  139      595      0.427
Vehicle        5118        14,917    15,431   2585    75,668   5443     0.635
Others         80,393      28,822    18,714   5967    11,163   143,605  0.497
Accuracy Rate  0.827       0.884     0.934    0.412   0.817    0.714

Fig. 12. (a) The 3D laser point clouds used for semantic mapping. (b) Large-scale outdoor semantic map for the same environment. The white lines represent the constraint relationship between road nodes and environment nodes.

Fig. 13. The semantic map of the experimental area.

Fig. 14. The results of relocalization. The abscissa is the starting position. The number of the blue points represents the number of steps for the successful relocalization. In the direction of the ordinate axis, the distance from the blue point to the yellow line indicates the number of steps of successful relocalization and the distance from the blue point to the red line indicates the number of steps in which the positioning failed.

Fig. 15. The positions of the relocalization results in the semantic map. The blue lines and points indicate the trajectory when building the semantic map. The green points are the environment nodes and the red points are the positions of relocalization.

Instead of locating the UGV precisely, our focus is to use the sparse semantic map to determine the general position of the UGV in a large-scale environment. Fig. 15 shows the positions of the relocalization results in the semantic map, in which the position points are distributed near the trajectory used when building the prior semantic map. However, because there are few environmental nodes on the left side of the map, localization there is inaccurate, and the robot needs to move forward for a distance to find its own position.

The process of building the semantic map is divided into five parts: BA image generation, superpixel segmentation, feature extraction and classification, reclassification and topological map building. The running time of the algorithm is shown in Fig. 16. As can be seen from the figure, feature extraction and classification take the longest time; each of the other processes takes less than 100 ms.

Fig. 16. Time cost of scene understanding and semantic mapping.

6. Conclusion and future work

This paper has focused on how to relocalize UGVs in outdoor environments based on a sparse semantic map. The laser points are divided into pixel blocks according to the ODVL image and classified by the Gentle-AdaBoost classifier. The semantic map of the outdoor environment is built by generating topological relations between the environment nodes, which are obtained according to the classification results. Instead of the laser data or features, only the information of the environment nodes is saved for relocalization, which avoids the storage problem in large-scale environments. For locating the UGV, a new positioning method is proposed: after obtaining the environmental object categories on both sides of the current position, we draw circles and find their intersection points to get the possible positions of the UGV, and these positions are applied as prior knowledge in the next position estimation process. The related relocalization experiments are carried out in the outdoor environment and the results verify the effectiveness and robustness of the algorithm.

When the UGV returns to a visited position, if the environment object is not completely observed by the LiDAR, the current environment node is not completely consistent with the corresponding node in the semantic map, and fusing the new node into the semantic map is a problem. We will therefore focus on improving the map updating method in the future. How to improve the positioning accuracy in areas with few environmental nodes is also one of the future works.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Fei Yan: Conceptualization, Methodology, Writing - original draft. Jiawei Wang: Software, Writing - original draft. Guojian He: Data curation, Visualization, Investigation. Huan Chang: Investigation, Validation. Yan Zhuang: Supervision, Resources, Project administration.

References

[1] Q.L. Li, Y. Son, Z.G. Hou, Neural network based FastSLAM for autonomous robots in unknown environments, Neurocomputing 165 (2015) 99–110.
[2] J.G. Liu, D.Q. Liu, J. Cheng, Y.Y. Tang, Conditional simultaneous localization and mapping: a robust visual SLAM system, Neurocomputing 145 (2014) 269–284.
[3] T. Sattler, B. Leibe, L. Kobbelt, Efficient & effective prioritized matching for large-scale image-based localization, IEEE Trans. Pattern Anal. Mach. Intell. 39 (9) (2017) 1744–1756.
[4] Y. Feng, Y. Wu, L. Fan, Real-time SLAM relocalization with online learning of binary feature indexing, Mach. Vis. Appl. 28 (8) (2017) 953–963.
[5] A. Angeli, D. Filliat, S. Doncieux, J.A. Meyer, Fast and incremental method for loop-closure detection using bags of visual words, IEEE Trans. Robot. 24 (5) (2008) 1027–1037.
[6] N. Zeng, H. Zhang, Y. Li, J. Liang, A.M. Dobaie, Denoising and deblurring gold immunochromatographic strip images via gradient projection algorithms, Neurocomputing 247 (2017) 165–172.
[7] N. Zeng, Z. Wang, H. Zhang, K. Kim, Y. Li, X. Liu, An improved particle filter with a novel hybrid proposal distribution for quantitative analysis of gold immunochromatographic strips, IEEE Trans. Nanotechnol. 18 (1) (2019) 819–829.
[8] A. Kendall, R. Cipolla, Modelling uncertainty in deep learning for camera relocalization, in: 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 2016, pp. 4762–4769.
[9] T. Röhling, J. Mack, D. Schulz, A fast histogram-based similarity measure for detecting loop closures in 3-D lidar data, in: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 2015, pp. 736–741.
[10] B. Steder, G. Grisetti, W. Burgard, Robust place recognition for 3D range data based on point features, in: 2010 IEEE International Conference on Robotics and Automation, Anchorage, AK, 2010, pp. 1400–1405.
[11] N. Zeng, H. Zhang, Y. Chen, B. Chen, Path planning for intelligent robot based on switching local evolutionary PSO algorithm, Assem. Autom. 36 (2) (2016) 120–126.
[12] V. Pham, J. Juang, Robust and efficient SLAM via compressed H∞ filtering, Asian J. Control 16 (3) (2014) 878–889.
[13] S. Liu, Z. Wang, G. Wei, M. Li, Distributed set-membership filtering for multi-rate systems under the round-robin scheduling over sensor networks, IEEE Trans. Cybern., doi:10.1109/TCYB.2018.2885653.
[14] Y. Chen, Z. Wang, Y. Yuan, P. Date, Distributed H∞ filtering for switched stochastic delayed systems over sensor networks with fading measurements, IEEE Trans. Cybern. 50 (1) (2020) 2–14.
[15] S. Liu, Z. Wang, Y. Chen, G. Wei, Protocol-based unscented Kalman filtering in the presence of stochastic uncertainties, IEEE Trans. Autom. Control 65 (3) (2020) 1303–1309.
[16] K. Granstrom, J. Callmer, F. Ramos, J. Nieto, Learning to detect loop closure from range data, in: 2009 IEEE International Conference on Robotics and Automation, Kobe, Japan, 2009, pp. 15–22.
[17] Y.M. Li, S. Li, Y.J. Ge, A biologically inspired solution to simultaneous localization and consistent mapping in dynamic environments, Neurocomputing 104 (2013) 170–179.
[18] C. Galindo, J.A. Fernández-Madrigal, J. González, A. Saffiotti, Robot task planning using semantic maps, Rob. Auton. Syst. 56 (11) (2008) 955–966.
[19] J. McCormac, A. Handa, A. Davison, S. Leutenegger, SemanticFusion: dense 3D semantic mapping with convolutional neural networks, in: 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 2017, pp. 4628–4635.
[20] C. Zhao, L. Sun, R. Stolkin, A fully end-to-end deep learning approach for real-time simultaneous 3D reconstruction and material recognition, in: 2017 18th International Conference on Advanced Robotics (ICAR), Hong Kong, China, 2017, pp. 75–82.
[21] L. Ma, J. Stückler, C. Kerl, D. Cremers, Multi-view deep learning for consistent semantic mapping with RGB-D cameras, in: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 2017, pp. 598–605.
[22] P. Wang, R. Yang, B. Cao, W. Xu, Y. Lin, DeLS-3D: deep localization and segmentation with a 3D semantic map, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 5860–5869.
[23] Z. Xiao, K. Jiang, S. Xie, T. Wen, C. Yu, D. Yang, Monocular vehicle self-localization method based on compact semantic map, in: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, United States, 2018, pp. 3083–3090.
[24] S. Yang, Y. Huang, S. Scherer, Semantic 3D occupancy mapping through efficient high order CRFs, in: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 2017, pp. 590–597.
[25] L. Sun, Z. Yan, A. Zaganidis, C. Zhao, T. Duckett, Recurrent-OctoMap: learning state-based map refinement for long-term semantic mapping with 3-D-lidar data, IEEE Robot. Autom. Lett. 3 (3) (2018) 3749–3756.
[26] Y. Wang, T. Shi, P. Yun, L. Tai, M. Liu, PointSeg: real-time semantic segmentation based on 3D lidar point cloud, arXiv preprint arXiv:1807.06288v8, 2018.
[27] Y. Liu, F. Wang, A.M. Dobaie, G. He, Y. Zhuang, Comparison of 2D image models in segmentation performance for 3D laser point clouds, Neurocomputing 251 (2017) 136–144.
[28] J. Jeong, T.S. Yoon, J.B. Park, Multimodal sensor-based semantic 3D mapping for a large-scale environment, Expert Syst. Appl. 105 (2018) 1–10.
[29] W.C. Ma, I. Tartavull, I.A. Bârsan, et al., Exploiting sparse semantic HD maps for self-driving vehicle localization, arXiv preprint arXiv:1908.03274v1, 2019.

[30] Z.J. Chong, B. Qin, T. Bandyopadhyay, M.H. Ang, E. Frazzoli, D. Rus, Synthetic 2D lidar for precise vehicle localization in 3D urban environment, in: 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, 2013, pp. 1554–1559.
[31] J. Collier, S. Se, V. Kotamraju, P. Jasiobedzki, Real-time lidar-based place recognition using distinctive shape descriptors, in: Proceedings of SPIE, 8387 (1), 2012, pp. 1–11.
[32] K. Sakaeta, K. Nonaka, K. Sekiguchi, Experimental verification of a vehicle localization based on moving horizon estimation integrating LRS and odometry, J. Phys.: Conf. Ser. 744 (2016) 1–12.
[33] L. Zou, Z. Wang, Q.L. Han, D. Zhou, Moving horizon estimation for networked time-delay systems under round-robin protocol, IEEE Trans. Autom. Control 64 (12) (2019) 5191–5198.
[34] R.W. Wolcott, R.M. Eustice, Fast lidar localization using multiresolution Gaussian mixture maps, in: 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 2015, pp. 2814–2821.
[35] F. Cao, Y. Zhuang, H. Zhang, W. Wang, Robust place recognition and loop closing in laser-based SLAM for UGVs in urban environments, IEEE Sens. J. 18 (10) (2018) 4242–4252.
[36] G. Kim, A. Kim, Scan context: egocentric spatial descriptor for place recognition within 3D point cloud map, in: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 2018, pp. 4802–4809.
[37] P. Yin, Y. He, L. Xu, Y. Peng, J. Han, W. Xu, Synchronous adversarial feature learning for lidar based loop closure detection, in: 2018 Annual American Control Conference (ACC), Milwaukee, WI, USA, 2018, pp. 234–239.
[38] P. Yin, L. Xu, Z. Liu, L. Li, H. Salman, Y. He, W. Xu, H. Wang, H. Choset, Stabilize an unsupervised feature learning for lidar-based place recognition, in: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 2018, pp. 1162–1167.
[39] H. Yin, Y. Wang, L. Tang, X. Ding, R. Xiong, LocNet: global localization in 3D point clouds for mobile robots, arXiv preprint arXiv:1712.02165, 2017.
[40] M.A. Uy, G.H. Lee, PointNetVLAD: deep point cloud based retrieval for large-scale place recognition, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 18–23.
[41] G. Kim, B. Park, A. Kim, 1-day learning, 1-year localization: long-term lidar localization using scan context image, IEEE Robot. Autom. Lett. 4 (2) (2019) 1948–1955.
[42] L. Schaupp, M. Bürki, R. Dubé, R. Siegwart, C. Cadena, OREOS: oriented recognition of 3D point clouds in outdoor scenarios, arXiv preprint arXiv:1903.07918v1, 2019.
[43] M.H. Bharati, J.J. Liu, J.F. MacGregor, Image texture analysis: methods and comparisons, Chemom. Intell. Lab. Syst. 72 (2004) 57–71.
[44] Y. Zhuang, G. He, H. Hu, Z. Wu, A novel outdoor scene-understanding framework for unmanned ground vehicles with 3D laser scanners, Trans. Inst. Meas. Control 37 (4) (2015) 435–445.

Fei Yan (M'15) received the bachelor's and Ph.D. degrees in Control Theory and Engineering from the Dalian University of Technology, Dalian, China, in 2004 and 2011, respectively. In 2013, he joined the Dalian University of Technology as a Post-Doctoral researcher and became a Lecturer in 2015; he is currently an Associate Professor with the School of Control Science and Engineering. His current research interests include 3-D mapping, path planning, and semantic scene understanding.

Jiawei Wang is a graduate student in the School of Control Science and Engineering at Dalian University of Technology, P. R. China. He received a bachelor's degree in Automation from the School of Marine Electrical Engineering of Dalian Maritime University in 2019. His research interests are in robot navigation and semantic scene understanding.

Guojian He received the bachelor and master degrees in Control Theory and Engineering from Dalian University of Technology, Dalian, China, in 2011 and 2014, respectively. Currently he is a Ph.D. candidate in the School of Control Science and Engineering at Dalian University of Technology, P. R. China. His research interests are in robotics, 3D LiDAR SLAM, mapping and semantic scene understanding.

Huan Chang received the bachelor degree in Measurement and Control Technology and Instrumentation from Huazhong University of Science and Technology, China, in 2012, and the master degree in Control Theory and Engineering from Dalian University of Technology, China, in 2016. His research interests are in semantic scene understanding, 3-D mapping, and path planning.

Yan Zhuang (M'11) obtained the B.Sc. and M.Sc. degrees in Control Theory and Engineering from Northeastern University, P. R. China, in 1997 and 2000 respectively. He received the Ph.D. degree in Control Theory and Engineering from the Dalian University of Technology, P. R. China in 2004. He joined Dalian University of Technology in 2005 as a Lecturer and became an Associate Professor in 2007. Currently he is a Professor in the School of Control Science and Engineering, Dalian University of Technology. His research interests include mobile robot 3D mapping, outdoor scene understanding, 3D-laser-based object recognition, 3D scene recognition and reconstruction.
